[amdgpu] Flickering graphics corruption (but none observed in kernels 4.18.10-4.18.12)

Bug #1813701 reported by Ivan Grigoryev on 2019-01-29
This bug affects 6 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Undecided
Unassigned
xserver-xorg-video-amdgpu (Ubuntu)
Undecided
Unassigned

Bug Description

The screen always flickers at a refresh rate of 75 Hz on the AMD RX 580 GPU. If you switch to any rate other than 75 Hz (60 or 50 Hz), the flickering disappears until the display turns off. After the display is switched back on, the flicker returns. On kernel 4.18.0-10, just as on the AMD Radeon HD 7970 GPU, the problem is not visible. More in the video:
https://www.youtube.com/watch?v=d_LXHqWKTbk

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: xorg 1:7.7+19ubuntu8
ProcVersionSignature: Ubuntu 4.18.0-13.14-generic 4.18.17
Uname: Linux 4.18.0-13-generic x86_64
ApportVersion: 2.20.10-0ubuntu13.1
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Tue Jan 29 13:13:19 2019
DistUpgraded: Fresh install
DistroCodename: cosmic
DistroVariant: ubuntu
GraphicsCard:
 Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7) (prog-if 00 [VGA controller])
   Subsystem: ASUSTeK Computer Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1043:0519]
InstallationDate: Installed on 2019-01-22 (6 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.3)
MachineType: MSI MS-7693
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=ru_RU.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-13-generic root=UUID=ad22bc6d-c4e8-4393-a30b-6c247159b9dd ro quiet splash vt.handoff=1
SourcePackage: xorg
Symptom: display
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/08/2016
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V10.6
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: 970A-G43 (MS-7693)
dmi.board.vendor: MSI
dmi.board.version: 3.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: MSI
dmi.chassis.version: 3.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV10.6:bd01/08/2016:svnMSI:pnMS-7693:pvr3.0:rvnMSI:rn970A-G43(MS-7693):rvr3.0:cvnMSI:ct3:cvr3.0:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: MS-7693
dmi.product.sku: To be filled by O.E.M.
dmi.product.version: 3.0
dmi.sys.vendor: MSI
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.97+git1901221830.b7a7a9~oibaf~c
version.libgl1-mesa-dri: libgl1-mesa-dri 19.0~git1901280730.d1d2bb~oibaf~c
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.xserver-xorg-core: xserver-xorg-core 2:1.20.1-3ubuntu2.1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:18.1.0-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20171229-1ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.15-3

Ivan Grigoryev (ivvanvg) wrote :
description: updated
Ivan Grigoryev (ivvanvg) on 2019-01-29
summary: - the screen flickers
+ the screen flickers. Ubuntu 18 + AMD RX580
summary: - the screen flickers. Ubuntu 18 + AMD RX580
+ [amdgpu] Flickering graphics corruption in kernel 4.18.0-13.14-generic
+ 4.18.17 (but none observed in kernel 4.18.0-10)
affects: xorg (Ubuntu) → xserver-xorg-video-amdgpu (Ubuntu)
Ivan Grigoryev (ivvanvg) on 2019-01-29
description: updated

It looks like a problem with the Xorg graphics driver:

[ 44.332] (EE) amdgpu: module ABI major version (23) doesn't match the server's version (24)
[ 44.332] (EE) Failed to load module "amdgpu" (module requirement mismatch, 0)

so your system then falls back to using the 'modeset' Xorg driver with the 'amdgpu' kernel driver.

tags: added: amdgpu
Daniel van Vugt (vanvugt) wrote :

Oh, it appears you have at least one unsupported PPA installed that has replaced some graphics packages:

version.libdrm2: libdrm2 2.4.97+git1901221830.b7a7a9~oibaf~c
version.libgl1-mesa-dri: libgl1-mesa-dri 19.0~git1901280730.d1d2bb~oibaf~c

Please:

1. Remove that PPA from the system using 'ppa-purge'.

2. Reboot.

3. If the problem still occurs then run:

   dpkg -l > allpackages.txt

   and send us the file 'allpackages.txt'.
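
A minimal sketch of steps 1 and 3, assuming the PPA in question is the oibaf graphics-drivers archive whose Launchpad URL is quoted later in this thread. The helper name ppa_from_url is hypothetical; it only converts a Launchpad archive URL into the ppa:OWNER/NAME form that ppa-purge accepts.

```shell
#!/bin/sh
# ppa_from_url URL: convert a Launchpad PPA page URL into ppa:OWNER/NAME.
# (Hypothetical helper for illustration; prints nothing if the URL does not match.)
ppa_from_url() {
    printf '%s\n' "$1" |
        sed -n 's|^https://launchpad\.net/~\([^/]*\)/+archive/ubuntu/\([^/]*\)/*$|ppa:\1/\2|p'
}

# Intended use (needs root and the ppa-purge package, so it is left commented out):
#   sudo ppa-purge "$(ppa_from_url 'https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers')"
#   sudo reboot
#   dpkg -l > allpackages.txt
```

If the function prints nothing, the URL did not have the expected shape and the PPA name should be entered by hand.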

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Incomplete
Ivan Grigoryev (ivvanvg) wrote :

To get rid of the problem, I tried to install the amdgpu-pro 18.50 driver, but it is supported only on Ubuntu 18.04. Then I installed the free driver from https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers, but unfortunately that did not solve the problem.

description: updated
Daniel van Vugt (vanvugt) wrote :

Yes, please remove both of those from your system and follow the steps in comment #3. The drivers you installed seem to be causing problems so we can't really see the original problem right now.

Daniel van Vugt (vanvugt) wrote :

The correct "free" drivers come built into Ubuntu automatically.

If you install unofficial drivers like "https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers" then we cannot help you.

Ivan Grigoryev (ivvanvg) wrote :

The third-party PPA has been removed and the packages updated.

Ivan Grigoryev (ivvanvg) wrote :

This problem appears immediately after installing the system. I tried installing third-party drivers to fix it, but that did not help. They have all been removed now.

Daniel van Vugt (vanvugt) wrote :

Thanks.

Next please:

1. Send us copies of /var/log/Xorg*.log

2. Run: dmesg > dmesg.txt
   and send the file 'dmesg.txt'.

3. Run: lspci -k > lspcik.txt
   and send the file 'lspcik.txt'

4. Run: journalctl -b0 > journal.txt
   and send the file 'journal.txt'.
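
The four steps above could be gathered into one helper, sketched here under a few assumptions: the function name collect_logs and the default output directory are made up for illustration, and failures of individual commands are tolerated so that the remaining files are still produced.

```shell
#!/bin/sh
# collect_logs OUTDIR: gather the four requested diagnostics into OUTDIR.
# (Hypothetical helper; OUTDIR defaults to ./bug-logs.)
collect_logs() {
    outdir=${1:-./bug-logs}
    mkdir -p "$outdir" || return 1
    # Xorg logs may be missing (for example under Wayland), so ignore a failed copy.
    cp /var/log/Xorg*.log "$outdir"/ 2>/dev/null || true
    # Error messages are captured in the files too, so a failed command stays visible.
    dmesg > "$outdir/dmesg.txt" 2>&1 || true
    lspci -k > "$outdir/lspcik.txt" 2>&1 || true
    journalctl -b0 > "$outdir/journal.txt" 2>&1 || true
    ls -1 "$outdir"
}

# Example: collect_logs ~/bug-1813701-logs
```

The resulting files can then be attached to the bug one by one.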

Ivan Grigoryev (ivvanvg) wrote :
Ivan Grigoryev (ivvanvg) on 2019-01-29
description: updated
Ivan Grigoryev (ivvanvg) wrote :

I think this problem is identical to mine.
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-amdgpu/+bug/1802976
I also found some more videos confirming the problem on Ubuntu 18.04
https://www.youtube.com/watch?v=jYdoA7wTuz4
https://www.youtube.com/watch?v=BqBc9kZvjHQ

Daniel van Vugt (vanvugt) wrote :

Sounds like we _could_ make this a duplicate of bug 1802976. But I wouldn't just yet...

If you are sure that reverting to an older kernel like 4.18.0-10 fixes the problem then we should focus on that here.

Can you please find the *latest* older kernel version that fixes the problem from this list?

https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D

Ivan Grigoryev (ivvanvg) wrote :

Which packages do I need to install? For example, for kernel version 4.18.16:
linux-headers-4.18.16-041816-generic_4.18.16-041816.201810200431_amd64.deb
linux-headers-4.18.16-041816_4.18.16-041816.201810200431_all.deb
linux-image-unsigned-4.18.16-041816-generic_4.18.16-041816.201810200431_amd64.deb
?
Or can I install all the kernels of that sub-version at once:
sudo apt install linux-image-4.18.0* linux-headers-4.18.0*
?

Daniel van Vugt (vanvugt) wrote :

For any given directory/folder on that site, scroll down to:

  Build for amd64 succeeded

and then download/install that set of packages, except for the "lowlatency" ones. You don't need "lowlatency".

Ivan Grigoryev (ivvanvg) wrote :

OK, I checked the kernels one by one from version 10.18.20 down to 10.18.10. There is no bug in 10.18.10. If needed, I can later check versions 10.18.9 down to 10.18.1.
Which logs do I need to provide from the working kernel?

Ivan Grigoryev (ivvanvg) wrote :

Sorry, typo: I meant kernel version 4.18.

Daniel van Vugt (vanvugt) wrote :

Thanks. If there is no bug at kernel 4.18.10, did you verify the bug is present in 4.18.11?

Ivan Grigoryev (ivvanvg) wrote :

I rechecked all the 4.18 kernels again. The only kernels without the bug are:
4.18.0-10-generic #11-Ubuntu SMP Thu Oct 11 15:13:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
4.18.12-041812-generic #201810032137 SMP Thu Oct 4 01:39:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
4.18.11-041811-generic #201809290731 SMP Sat Sep 29 11:33:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
4.18.10-041810-generic #201809260332 SMP Wed Sep 26 07:34:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
All the others, from 4.18.1 through 4.18.20, have the bug.

summary: - [amdgpu] Flickering graphics corruption in kernel 4.18.0-13.14-generic
- 4.18.17 (but none observed in kernel 4.18.0-10)
+ [amdgpu] Flickering graphics corruption (but none observed in kernels
+ 4.18.10-4.18.12)
Changed in linux (Ubuntu):
status: Incomplete → New
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: Incomplete → New

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Confirmed
LaoPiSai (laopisai) wrote :

This also happens on my system with RX480

Kai-Heng Feng (kaihengfeng) wrote :

Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v5.0-rc7 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as
"Confirmed".

Thanks in advance.

[0] https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.0-rc7/

Ivan Grigoryev (ivvanvg) wrote :

I tried the v5.0-rc7 and v5.0-rc8 kernels. The problem is not solved.

tags: added: kernel-bug-exists-upstream
Kai-Heng Feng (kaihengfeng) wrote :

Would it be possible for you to do a kernel bisection?

First, find the last good -rc kernel and the first bad -rc kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Then,
$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git bisect start
$ git bisect good $(the good version you found)
$ git bisect bad $(the bad version found)
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from "make -j`nproc` deb-pkg" until you find the commit that causes the regression.

Ivan Grigoryev (ivvanvg) wrote :

Why look for an -rc kernel? The problem is absent only in kernel versions 4.18.10, 4.18.11 and 4.18.12; all other 4.18 kernels have the bug.

Anonymous (tryninja) wrote :

Can confirm this is happening on fresh install of Linux Mint and Manjaro as well. Tried installing Kernel 5.0.2 but no fix.

Caleb Brinkley (csbrink) on 2019-03-15
affects: archlinux → linux (Arch Linux)
no longer affects: linux (Arch Linux)
Anonymous (tryninja) wrote :

I can confirm that with kernel 4.18.10 there is no flickering.

I tested this on the Ubuntu 19.04 beta (kernel 5.0.0-7); the artifacts are present.
Currently this bug is absent only in kernel 4.18.0-10.

https://youtu.be/-TRmXnBtx-g

Kai-Heng Feng (kaihengfeng) wrote :

$ git log --pretty=oneline v4.18.12..v4.18.13 drivers/gpu/drm/amd
d9ef158adf04b81772a7e9d682a054614ebac2fd Revert "drm/amd/pp: Send khz clock values to DC for smu7/8"
5113d730a1ee26bb13cbe38fdc10a7ebe7baf3cd drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk
ed14acd316bae2d3365f71ea9941490a71d3adf2 drm/amdgpu: Fix SDMA hang in prt mode v2

Only three commits to test, do you know how to build a kernel?

Kai-Heng Feng (kaihengfeng), no, I don't know.
Please explain in as much detail as possible.
I don't speak English very well, but I'll try to figure it out.

The issue is somewhere in the dynamic power management code. Here's a workaround that has completely resolved the flickering (for me):

As root:
echo "high" > /sys/class/drm/card0/device/power_dpm_force_performance_level

Found here: https://wiki.archlinux.org/index.php/AMDGPU#Screen_artifacts_and_frequency_problem
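
A slightly more defensive version of that command, as a sketch only. The function name set_dpm_level is hypothetical, card0 is an assumption (the RX 580 may be a different card number on other systems), and only a subset of the level names the amdgpu driver accepts is checked here.

```shell
#!/bin/sh
# Default sysfs node taken from the comment above; card0 is an assumption.
DPM_NODE=${DPM_NODE:-/sys/class/drm/card0/device/power_dpm_force_performance_level}

# set_dpm_level LEVEL [NODE]: write LEVEL to the sysfs node and print what was stored.
set_dpm_level() {
    level=$1
    node=${2:-$DPM_NODE}
    case $level in
        auto|low|high|manual) ;;   # subset of the levels amdgpu accepts
        *) echo "unsupported level: $level" >&2; return 2 ;;
    esac
    [ -w "$node" ] || { echo "cannot write $node (run as root?)" >&2; return 1; }
    printf '%s\n' "$level" > "$node" && cat "$node"
}

# Example (as root): set_dpm_level high
# To undo the workaround: set_dpm_level auto
```

The setting does not survive a reboot, so it has to be reapplied on every boot.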

LaoPiSai (laopisai) wrote :

Drew Walton (drewwalton19216801), thanks! It works.

Currently I run this command on startup using systemd.
Is there a simpler way?

Drew, thanks!

It works for me. My system is Kubuntu 18.10 (kernel 4.18.0-17) with a Sapphire RX580 Nitro+.

I hope that this bug will be fixed soon.

LaoPiSai, I found out how to apply this setting at system startup. I have not tested it yet, but it should work.
Try to do something based on this post: https://ryanclouser.com/2017/05/06/AMDGPU-Overclock-on-Startup/.
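
One way to apply the setting at boot with systemd, as an untested sketch that is not taken from the linked post. The unit name amdgpu-dpm-high.service and the helper write_dpm_unit are made up for illustration, and card0 is again an assumption.

```shell
#!/bin/sh
# write_dpm_unit FILE: write a oneshot unit that applies the workaround at boot.
# (Hypothetical helper; FILE defaults to amdgpu-dpm-high.service in the current directory.)
write_dpm_unit() {
    unit_file=${1:-amdgpu-dpm-high.service}
    cat > "$unit_file" <<'EOF'
[Unit]
Description=Force amdgpu DPM performance level to high

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level'

[Install]
WantedBy=multi-user.target
EOF
}

# Example:
#   write_dpm_unit amdgpu-dpm-high.service
#   sudo cp amdgpu-dpm-high.service /etc/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now amdgpu-dpm-high.service
```

Disabling the unit with systemctl disable and writing "auto" to the same sysfs file should restore the default behaviour.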

Kai-Heng Feng (kaihengfeng) wrote :

$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
$ cd linux
$ git checkout ed14acd316bae2d3365f71ea9941490a71d3adf2
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.

Then try the other two commits.

Kai-Heng Feng (kaihengfeng), I built a kernel at commit ed14acd316bae2d3365f71ea9941490a71d3adf2.

I was lucky: on the first attempt the bug was gone =) The kernel version reported is 4.18.12+.

Do I need to test the other two commits?

Ivan Grigoryev (ivvanvg) wrote :

But according to its description, the commit reverted an earlier patch because of some problems. I think the problem needs a more detailed study.

I also tested commit d9ef158adf04b81772a7e9d682a054614ebac2fd. The problem is present there.

Commit 5113d730a1ee26bb13cbe38fdc10a7ebe7baf3cd could not be tested. An error occurred: "fatal: reference is not a tree: 5113d730a1ee26bb13cbe38fdc10a7ebe7baf3cd".

The problem does not occur at commit ed14acd316bae2d3365f71ea9941490a71d3adf2.

Kai-Heng Feng (kaihengfeng) wrote :

Commit c3cb424a0869 ("drm/amd/pp: Send khz clock values to DC for smu7/8") is in mainline but mainline still has the issue?

Please file an upstream bug at https://bugs.freedesktop.org/
Product: DRI
Component: DRM/amdgpu
