Comment 0 for bug 2018470

Thomas Debesse (illwieckz) wrote : Strong amdgpu and btrfs regressions in Linux 5.19 making Zen2 TR PRO workstation unusable

The day I updated from Ubuntu 22.04 to 22.10 some months ago, I had to stay on Linux 5.15 because 5.19 did not work on my computer. I spent the last two days looking for a way to run Linux 5.19, and found one working version: 5.19.0-23.

Here are the versions I tested:

- 5.19.0-23
- 5.19.0-29
- 5.19.0-31
- 5.19.0-42

In that list, only Linux 5.19.0-23 is working with that computer.

There may be other working versions I have not tested, but basically the breakage occurred after 5.19.0-23.

I face two problems. Let's start with the first one, the graphic one, which is still present in 5.19.0-42. It starts to occur with 5.19.0-31 (5.19.0-29 is not affected): the graphic breaks at the moment it should switch from low resolution display to high resolution display at the very beginning of startup. The computer is not completely broken, but the graphic is dead. X11 cannot start (it ends up trying to use the framebuffer, meaning the amdgpu driver is not functional).

The second bug is the one I get with 5.19.0-29. Linux 5.19.0-29 doesn't have the graphic bug but has another issue that makes the computer unusable: some CPUs get locked up, some btrfs process runs at 100% CPU, and syncing never ends, which even prevents rebooting. This bug is less important because I cannot reproduce it on 5.19.0-42, so if the graphic bug gets fixed there, all will be fine.

I have not updated to Ubuntu 23.04 yet because I'm afraid its newer kernels would leave my computer totally unusable. Because of that fear, I have been running Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today.

It actually took me two work days of testing various combinations to get the computer to boot, so I'm sticking with 5.19.0-23 for now, and I have limited time to test other options. I also tried various BIOS options and upgraded the BIOS. Since that Threadripper PRO computer has a very slow-booting BIOS, trying various configurations or software versions that require a reboot quickly eats up whole hours.

The attached logs may have traces of DKMS modules like amdgpu-pro, but the first time I experienced the bug I had none of them. I reproduced the bug on a 5.19.0-42 kernel free of amdgpu-pro yesterday. I'm simply opening the ticket from my working environment, and I decided not to spend one more hour uninstalling amdgpu-pro and rebooting only to file this ticket.

Here are some details on the hardware:

- MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named WRX80PRO-F1 in dmidecode, dated 08/04/2022) https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
- RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600 KSM32ED8/32ME 16Gbit Micron E
- CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
- GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
- GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
- GPU: ASPEED graphic Family rev 41

The ASPEED graphic is a small card integrated in the motherboard and part of the BMC, so I cannot remove it. It may contribute to the trouble.

When the graphic works (Linux 5.19.0-23, Linux 5.19.0-29), the boot is displayed on all AMD and ASPEED graphic outputs, then at the moment the graphic switches from low resolution to high resolution, the ASPEED graphic goes off and the display continues on the AMD cards.

When the graphic doesn't work (5.19.0-31, 5.19.0-42), the boot is displayed on all AMD and ASPEED graphic outputs, then at the moment the graphic switches from low resolution to high resolution, the AMD cards display garbage but the display continues on the ASPEED card. The ASPEED card is a very basic integrated card without hardware acceleration and with only one VGA output, so that's unusable. As additional information, I know X11 never starts on the ASPEED if there are discrete cards plugged in (tested last year).
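For anyone who wants to compare a working and a failing boot from a text console, here is a minimal Python sketch that parses /proc/fb-style text (one "index name" pair per line) and counts the amdgpu framebuffers. The sample is copied from the ProcFB section of this report, taken on the working 5.19.0-23 boot. The function names are only illustrative, and whether the failing kernels still list the amdgpudrmfb entries is not known from the logs in this report.

```python
def parse_proc_fb(text):
    """Map framebuffer index -> driver name from /proc/fb-style text."""
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like "1 amdgpudrmfb": index, then driver name.
        index, name = line.split(maxsplit=1)
        devices[int(index)] = name
    return devices


def amdgpu_fb_count(devices):
    """Count framebuffers whose driver name starts with 'amdgpu'."""
    return sum(1 for name in devices.values() if name.startswith("amdgpu"))


# Sample from the ProcFB section of this report (working 5.19.0-23 boot).
SAMPLE = "0 astdrmfb\n1 amdgpudrmfb\n2 amdgpudrmfb\n"

devices = parse_proc_fb(SAMPLE)
print(devices)
print("amdgpu framebuffers:", amdgpu_fb_count(devices))
```

On the machine itself, the content of /proc/fb can be passed to parse_proc_fb() instead of SAMPLE.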

So right now that computer is staying on Linux 5.19.0-23, which doesn't have the graphic and btrfs bugs.

The last kernel to not feature the graphic bug is Linux 5.19.0-29. Linux 5.19.0-31 is the first one reproducing the graphic bug (the repository doesn't provide 5.19.0-30 for me to test).
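To make the regression windows explicit, here is a minimal Python sketch that tabulates the results reported above and derives the last good and first bad version for each bug. The dictionary keys and function names are only illustrative. The btrfs status of 5.19.0-31 is not stated in this report, so it is marked as unknown (None) rather than guessed.

```python
# Results as described in this report. None means "not stated".
RESULTS = {
    "5.19.0-23": {"graphic_ok": True, "btrfs_ok": True},
    "5.19.0-29": {"graphic_ok": True, "btrfs_ok": False},
    "5.19.0-31": {"graphic_ok": False, "btrfs_ok": None},  # btrfs: unknown
    "5.19.0-42": {"graphic_ok": False, "btrfs_ok": True},
}


def abi(version):
    """Return the Ubuntu ABI number, e.g. 23 for '5.19.0-23'."""
    return int(version.rsplit("-", 1)[1])


def regression_window(results, key):
    """Return (last_good, first_bad) around the first failure for one bug."""
    last_good = first_bad = None
    for version in sorted(results, key=abi):
        status = results[version][key]
        if first_bad is not None:
            break  # only the first regression is of interest
        if status is True:
            last_good = version
        elif status is False:
            first_bad = version
    return last_good, first_bad


print("graphic:", regression_window(RESULTS, "graphic_ok"))
print("btrfs:  ", regression_window(RESULTS, "btrfs_ok"))
```

With the data above, the graphic bug window is 5.19.0-29 to 5.19.0-31 and the btrfs bug window is 5.19.0-23 to 5.19.0-29.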

I also have reproduced the graphic bug when using the radeon driver instead of the amdgpu one.

ProblemType: Bug
DistroRelease: Ubuntu 22.10
Package: linux-image-generic 5.19.0.42.38
ProcVersionSignature: Ubuntu 5.19.0-23.24-generic 5.19.7
Uname: Linux 5.19.0-23-generic x86_64
ApportVersion: 2.23.1-0ubuntu3.3
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: GNOME
Date: Thu May 4 11:52:02 2023
HibernationDevice: RESUME=none
MachineType: Default string Default string
ProcEnviron:
 LANGUAGE=fr_FR:en
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
ProcFB:
 0 astdrmfb
 1 amdgpudrmfb
 2 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.19.0-23-generic root=UUID=f35ecf77-511e-4dde-ac11-c1d848e97315 ro rootflags=subvol=@ amdgpu.si_support=1 radeon.si_support=0 amdgpu.cik_support=1 radeon.cik_support=0 amdgpu.exp_hw_support=1 amdgpu.gpu_recovery=1 amdgpu.ppfeaturemask=0xffffffff delayacct zswap.enabled=1
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.19.0-23-generic N/A
 linux-backports-modules-5.19.0-23-generic N/A
 linux-firmware 20220923.gitf09bebf3-0ubuntu1.6
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/04/2022
dmi.bios.release: 5.23
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: WRX80PRO-F1
dmi.board.asset.tag: Default string
dmi.board.name: Default string
dmi.board.vendor: Default string
dmi.board.version: Default string
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvrWRX80PRO-F1:bd08/04/2022:br5.23:svnDefaultstring:pnDefaultstring:pvrDefaultstring:rvnDefaultstring:rnDefaultstring:rvrDefaultstring:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:
dmi.product.family: Default string
dmi.product.name: Default string
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Default string
modified.conffile..etc.default.apport: [modified]
mtime.conffile..etc.default.apport: 2018-06-16T17:39:00.798346