[amdgpu] MATE desktop is corrupted upon login on Ryzen APU

Bug #1882525 reported by Thomas Szymczak
This bug affects 2 people
Affects                              Status     Importance  Assigned to  Milestone
marco (Ubuntu)                       Invalid    Undecided   Unassigned
mesa (Ubuntu)                        Confirmed  Undecided   Unassigned
xorg-server (Ubuntu)                 Confirmed  Undecided   Unassigned
xserver-xorg-video-amdgpu (Ubuntu)   Confirmed  Undecided   Unassigned

Bug Description

Today I upgraded from Ubuntu MATE 19.10 to 20.04. After doing so, Ubuntu boots normally to the display manager (login screen) as expected. But once I log in, everything on the screen is cut up into horizontal slices, with each slice offset farther right than the slice above, and squares of random colors running diagonally across the screen. I've attached a picture of my monitor showing the issue for reference.
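Horizontal slices that drift further right on each row are what one would typically see when a framebuffer is read with a different row stride (pitch) than it was written with. Nothing in this report confirms that as the cause here; the toy Python sketch below, with entirely hypothetical names and sizes, only illustrates how such a mismatch produces a progressive shear.

```python
def read_rows(buffer, stride, width, height):
    """Interpret a flat pixel buffer as `height` rows of `width` pixels,
    stepping `stride` pixels between the start of consecutive rows."""
    return [buffer[y * stride : y * stride + width] for y in range(height)]

# Hypothetical 4x3 image written with a padded stride of 6 pixels per row.
WIDTH, HEIGHT, WRITE_STRIDE = 4, 3, 6
buffer = []
for y in range(HEIGHT):
    row = [f"{y}{x}" for x in range(WIDTH)]              # pixel label "yx"
    buffer.extend(row + ["--"] * (WRITE_STRIDE - WIDTH))  # row padding

# Reader and writer agree on the stride: rows come back intact.
correct = read_rows(buffer, WRITE_STRIDE, WIDTH, HEIGHT)

# Reader assumes a stride one pixel too small: each row starts one pixel
# earlier than it should, so the content slides right row by row.
sheared = read_rows(buffer, WRITE_STRIDE - 1, WIDTH, HEIGHT)

for row in sheared:
    print(" ".join(row))
```

With the mismatched stride, every row begins a little earlier in memory than the writer intended, so padding leaks in on the left and the picture appears to lean to the right.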
My hardware is a home-built PC with an ASUS Prime B450M motherboard and a Ryzen 3 2200G APU, using that APU's integrated graphics.
In terms of software, I'm running Ubuntu 20.04, freshly upgraded and fully updated. I have Linux 5.4.0-33-generic, X.Org 1.20.8, and Mesa 20.0.4. Before I upgraded Ubuntu, I was affected by Bug #1880041, but it seems fixed in the new kernel. I don't believe the two are related, but I figure it's worth mentioning.
I tried adding the nomodeset kernel option at boot, and it's an effective workaround. I also booted from an Ubuntu MATE 20.04 live USB to see whether this was caused by some quirk of my particular software setup; the bug reproduced there too, so it isn't.
If you'd like me to, I can provide log files, system information, etc. or help narrow it down by testing workarounds.
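When testing the nomodeset workaround, it helps to confirm which options the running kernel actually booted with. The following is a minimal Python sketch, not part of the original report; the helper names are made up for illustration, and the only real interface it relies on is the standard /proc/cmdline file.

```python
def kernel_options(cmdline):
    """Split a kernel command line into {option: value}.
    Bare flags such as 'quiet' map to None. Quoted values containing
    spaces are not handled by this simple sketch."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options

def has_option(cmdline, name):
    """True if `name` appears as an option on the command line."""
    return name in kernel_options(cmdline)

def current_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()

if __name__ == "__main__":
    try:
        print("nomodeset active:", has_option(current_cmdline(), "nomodeset"))
    except OSError:
        print("could not read /proc/cmdline on this system")
```

Run against the ProcKernelCmdLine value recorded in this report, `has_option(..., "nomodeset")` would be false, which matches a boot without the workaround.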

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: xserver-xorg-core 2:1.20.8-2ubuntu2.1
ProcVersionSignature: Ubuntu 5.4.0-33.37-generic 5.4.34
Uname: Linux 5.4.0-33-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.2
Architecture: amd64
CasperMD5CheckResult: skip
CompositorRunning: None
Date: Mon Jun 8 06:41:47 2020
DistUpgraded: Fresh install
DistroCodename: focal
DistroVariant: ubuntu
ExecutablePath: /usr/lib/xorg/Xorg
ExtraDebuggingInterest: Yes
GraphicsCard:
 Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c8) (prog-if 00 [VGA controller])
   Subsystem: ASUSTeK Computer Inc. Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1043:876b]
InstallationDate: Installed on 2019-10-20 (231 days ago)
InstallationMedia: Ubuntu-MATE 19.10 "Eoan Ermine" - Release amd64 (20191017)
MachineType: System manufacturer System Product Name
ProcEnviron: PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-33-generic root=UUID=43189257-cc94-4e98-ac1d-e16336dd152e ro quiet splash vt.handoff=7
SourcePackage: xorg-server
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/13/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2006
dmi.board.asset.tag: Default string
dmi.board.name: PRIME B450M-A
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2006:bd11/13/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEB450M-A:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.101-2
version.libgl1-mesa-dri: libgl1-mesa-dri 20.0.4-2ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 20.0.4-2ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.20.8-2ubuntu2.1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20200226-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.16-1
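The GraphicsCard field above carries the PCI vendor and device IDs in square brackets, which is usually the quickest way to identify the exact GPU. As a small Python sketch (the function name is hypothetical), the IDs can be pulled out of such a line with a regular expression:

```python
import re

# Matches bracketed "vvvv:dddd" hexadecimal pairs such as "[1002:15dd]".
PCI_ID = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")

def pci_ids(text):
    """Return all (vendor, device) ID pairs found in `text`, lower-cased."""
    return [(vendor.lower(), device.lower())
            for vendor, device in PCI_ID.findall(text)]

line = ("Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge "
        "[Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] "
        "(rev c8) (prog-if 00 [VGA controller])")
print(pci_ids(line))
```

Vendor ID 1002 is AMD/ATI, and the subsystem vendor 1043 on the following line of the report is ASUSTeK, consistent with the hardware described.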

Revision history for this message
Thomas Szymczak (tszymczak) wrote :

A quick update. I tested whether the kernel version affects this bug, and it appears not to. I booted the previously installed kernel on my system (5.3.0-55) with the default options (no nomodeset) and got the exact same issue. Likewise, using the old kernel with nomodeset eliminates the corruption at the expense of hardware acceleration (inxi -G says my renderer is llvmpipe). As I didn't have this issue using that kernel on 19.10, I believe this points to something in userspace that was updated when I went to 20.04.
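Seeing llvmpipe as the renderer is the usual sign that Mesa has fallen back to software rendering. As a rough Python sketch (the function name is hypothetical, and the list covers only the Mesa software rasterizers I am sure of), a renderer string can be classified like this:

```python
# Names of Mesa's software rasterizers as they appear in renderer strings.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")

def is_software_rendering(renderer):
    """True if the OpenGL renderer string names a software rasterizer."""
    name = renderer.lower()
    return any(sw in name for sw in SOFTWARE_RENDERERS)

print(is_software_rendering("llvmpipe"))
```

The renderer string itself has to come from a tool such as inxi -G or glxinfo; this sketch does not attempt to parse their full output.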

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Re: [amdgpu] Display is corrupted upon login on Ryzen APU

Please try selecting 'Ubuntu on Wayland' from the login screen and tell us if that avoids the problem.

summary: - Display is corrupted upon login on Ryzen APU
+ [amdgpu] Display is corrupted upon login on Ryzen APU
affects: xorg-server (Ubuntu) → xserver-xorg-video-amdgpu (Ubuntu)
tags: added: amdgpu
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in mutter (Ubuntu):
status: New → Incomplete
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Incomplete
no longer affects: mutter (Ubuntu)
Revision history for this message
Thomas Szymczak (tszymczak) wrote :

Now this is where it gets weird. Since I have Ubuntu MATE, I installed the ubuntu-desktop package to get the default GNOME environment. I kept my display manager as LightDM in case that's relevant.

Then I rebooted and tested all 3 desktop modes with default kernel options. As expected, MATE is still broken. I rebooted and tried "Ubuntu (Default)" and there were no issues whatsoever. I opened a few apps and all of them displayed OK, and used inxi -G to verify that the display server was indeed X. I rebooted again to test "Ubuntu on Wayland" and it behaved as expected too. Whatever's going wrong, it's limited to MATE, as even GNOME on X is fine.
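Besides inxi -G, the session type can be read from the XDG_SESSION_TYPE environment variable that the login session sets. A minimal Python sketch, with hypothetical helper names:

```python
import os

def session_type(env=None):
    """Return the display session type ('x11', 'wayland', ...) from
    XDG_SESSION_TYPE, or 'unknown' if the variable is not set."""
    env = os.environ if env is None else env
    return env.get("XDG_SESSION_TYPE", "unknown").lower()

def desktop_name(env=None):
    """Return XDG_CURRENT_DESKTOP, or 'unknown' if it is not set."""
    env = os.environ if env is None else env
    return env.get("XDG_CURRENT_DESKTOP", "unknown")

if __name__ == "__main__":
    print(f"session: {session_type()}, desktop: {desktop_name()}")
```

Recording both values alongside each test makes it easier to compare sessions such as the MATE and GNOME ones tried here.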

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Maybe it's the 'marco' window manager then?

no longer affects: linux (Ubuntu)
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: Incomplete → New
summary: - [amdgpu] Display is corrupted upon login on Ryzen APU
+ [amdgpu] MATE desktop is corrupted upon login on Ryzen APU
Revision history for this message
Thomas Szymczak (tszymczak) wrote :

Today I had some more time to play around with my computer, so I tested your hypothesis. As a control, I rebooted into my current software configuration with no workarounds and reproduced the bug as expected. Then I downgraded Marco to the version from 19.10 (1.22.3) by downloading it from the Ubuntu packages site, along with the matching versions of libmarco-private2 and marco-common, installed the .deb files with dpkg, and rebooted. The bug was still triggered, so perhaps that suggests it's not a bug in Marco? Is there anything else I could test to narrow it down? It's confusing because GNOME doesn't trigger it, but Marco seems to regardless of version.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

That's not surprising. Any bug in MATE, or bug experienced via Marco, might be years old and also present in Ubuntu 19.10. But more likely this bug is the combined result of MATE + AMD, meaning it might have only just become noticeable in 20.04.

Revision history for this message
L3P3 (l3p3) wrote :

Hello, I am using amdgpu and MATE on Debian, and I have the very same issue. The only fix I found was to downgrade the kernel to 5.3.0. I already wrote to the amdgpu developers but got no response. My message back then was:

Hi, on my netbook (debian bullseye, AMD A4, Sea Islands), after
updating the kernel to version 5.4.6 and logging into mate desktop, my
screen looked like this (see attached picture).
If you look at the image, do you have any idea what is happening
there? To me it looks like the framebuffer data is misinterpreted at
some stage. The picture looks good on the lightdm login screen, it is
only corrupted when I start a user session and when I then switch to a
console (ctrl+1), the screen looks correct for a second before it
switches. When I switch back (ctrl+7), it looks correct for a second
and afterwards, it turns corrupted again. I was looking into these
commits a bit but I don't have any idea... Maybe mate/marco is doing
something it shouldn't but then, much more people would have problems
now...
I took a screenshot but on that, everything looked fine, that is why I
took a picture. I assume it came by one of these commits, since these
were the only amdgpu changes between a working and a non-working
kernel:

commit 9375fa3799293da82490f0f1fa1f1e7fabae2745
Author: changzhu <email address hidden>
Date: Tue Dec 10 22:00:59 2019 +0800

    drm/amdgpu: add invalidate semaphore limit for SRIOV and picasso in gmc9

    commit 90f6452ca58d436de4f69b423ecd75a109aa9766 upstream.

    It may fail to load guest driver in round 2 or cause Xstart problem
    when using invalidate semaphore for SRIOV or picasso. So it needs avoid
    using invalidate semaphore for SRIOV and picasso.

    Signed-off-by: changzhu <email address hidden>
    Reviewed-by: Christian König <email address hidden>
    Reviewed-by: Huang Rui <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit bf8ae461a23577a9884f993b31b31f15dd7d6c0a
Author: changzhu <email address hidden>
Date: Tue Dec 10 10:23:09 2019 +0800

    drm/amdgpu: avoid using invalidate semaphore for picasso

    commit 413fc385a594ea6eb08843be33939057ddfdae76 upstream.

    It may cause timeout waiting for sem acquire in VM flush when using
    invalidate semaphore for picasso. So it needs to avoid using invalidate
    semaphore for piasso.

    Signed-off-by: changzhu <email address hidden>
    Reviewed-by: Huang Rui <email address hidden>
    Signed-off-by: Alex Deucher <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

commit f45858245286fa901e8abf36776df7e99d9a9581
Author: Xiaojie Yuan <email address hidden>
Date: Wed Nov 20 14:02:22 2019 +0800

    drm/amdgpu/gfx10: re-init clear state buffer after gpu reset

    commit 210b3b3c7563df391bd81d49c51af303b928de4a upstream.

    This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.

    clear state buffer (resides in vram) is corrupted after 1st baco reset,
    upon gfxoff exit, CPF gets garbage header in CSIB and hangs.

    Signed-off-by: Xiaojie Yuan <email address hidden>
    R...

Revision history for this message
L3P3 (l3p3) wrote :

And for completeness, here is what my screen looks like:

Revision history for this message
Thomas Szymczak (tszymczak) wrote :

I've discovered some useful info about this bug. First, there's a forum thread about it with lots of good info: https://ubuntu-mate.community/t/20-04-display-issues-with-amd-gpu/21648/37

Second, a new workaround. To recap, the first workaround I found was booting with nomodeset, but that disables all hardware acceleration. The new workaround, as detailed in the forum thread, is to open MATE Tweak, and in the Windows tab, change the window manager to Marco (no compositor).
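For anyone who cannot reach MATE Tweak through the corrupted display, the same setting can probably be changed from a terminal or a text console. The Python sketch below only builds the command; it assumes that the MATE Tweak option corresponds to Marco's `compositing-manager` key in the `org.mate.Marco.general` GSettings schema, which is my assumption and is not stated in this thread. The function name is hypothetical.

```python
def compositor_command(enabled):
    """Build a gsettings command that turns Marco's built-in compositor
    on or off. Assumes the org.mate.Marco.general schema and its
    'compositing-manager' key, which are not confirmed by this thread."""
    value = "true" if enabled else "false"
    return ["gsettings", "set", "org.mate.Marco.general",
            "compositing-manager", value]

# Print the command rather than running it, so it can be reviewed first.
print(" ".join(compositor_command(False)))
```

The command has to run as the affected user inside that user's session, and a logout may be needed before the change takes effect.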

Third, we found the upstream bug in xf86-video-amdgpu: https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/-/issues/10 . I believe this is packaged as xserver-xorg-video-amdgpu in Ubuntu. It has been fixed upstream, and the Ubuntu MATE devs will include it in 20.04.1 if possible.

Fourth, bug #1873895 appears to be the same bug, but originally reported as manifesting in Xubuntu. Maybe this is a duplicate.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in marco (Ubuntu):
status: New → Confirmed
Changed in mesa (Ubuntu):
status: New → Confirmed
Changed in xorg-server (Ubuntu):
status: New → Confirmed
Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Confirmed
Changed in marco (Ubuntu):
status: Confirmed → Invalid