More amdgpu crashes: pagefaults, ring timeouts and parser bugs

Bug #2045899 reported by Julian Andres Klode
This bug affects 1 person
Affects          Status       Importance   Assigned to   Milestone
linux (Ubuntu)   Incomplete   Undecided    Unassigned
mesa (Ubuntu)    New          Undecided    Unassigned

Bug Description

I think running the Mattermost snap forced to Wayland rendering *eventually* crashes amdgpu. It recovers at the kernel level, but the GUI doesn't actually recover and I also can switch tty with the keyboard.

I have attached the two crashes from yesterday. They look slightly different but may have the same cause:

LastDmesg.txt contains the journalctl -k -b of the last failed boot, OlderDmesg.txt contains the same for the crash before that.
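
For anyone wanting to gather comparable logs, here is a minimal sketch of how such files could be produced. It relies on journalctl's `-k` (kernel messages only) and `-b` (boot selection) options; the helper names and the choice of boot offsets are assumptions for illustration, since the report does not say which offsets the two attachments correspond to.

```python
import subprocess
from pathlib import Path


def kernel_log_command(boot_offset: int) -> list[str]:
    """Build a journalctl call for the kernel messages of one boot.

    boot_offset follows journalctl's -b convention: 0 is the current
    boot, -1 the previous one, -2 the one before that.
    """
    return ["journalctl", "-k", "-b", str(boot_offset), "--no-pager"]


def save_kernel_log(boot_offset: int, destination: Path) -> None:
    """Run journalctl and write its output to destination.

    Raises subprocess.CalledProcessError if journalctl fails, for
    example when no journal exists for the requested boot.
    """
    result = subprocess.run(
        kernel_log_command(boot_offset),
        capture_output=True,
        text=True,
        check=True,
    )
    destination.write_text(result.stdout)
```

Calling `save_kernel_log(-1, Path("LastDmesg.txt"))` right after rebooting from a crash would capture the boot that failed, provided the journal is persistent.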

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: linux-image-6.5.0-14-generic 6.5.0-14.14
ProcVersionSignature: Ubuntu 6.5.0-14.14-generic 6.5.3
Uname: Linux 6.5.0-14-generic x86_64
NonfreeKernelModules: zfs
ApportVersion: 2.27.0-0ubuntu6
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: GNOME
Date: Thu Dec 7 13:09:42 2023
InstallationDate: Installed on 2022-11-26 (376 days ago)
InstallationMedia: Ubuntu 23.04 "Lunar Lobster" - Alpha amd64 (20221126)
MachineType: LENOVO 21CF004PGE
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.5.0-14-generic root=/dev/mapper/ubuntu-root ro rootflags=subvol=@ quiet splash zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20 zswap.zpool=zsmalloc vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-6.5.0-14-generic N/A
 linux-backports-modules-6.5.0-14-generic N/A
 linux-firmware 20230919.git3672ccab-0ubuntu2.2
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/14/2023
dmi.bios.release: 1.47
dmi.bios.vendor: LENOVO
dmi.bios.version: R23ET71W (1.47 )
dmi.board.asset.tag: Not Available
dmi.board.name: 21CF004PGE
dmi.board.vendor: LENOVO
dmi.board.version: SDK0T76538 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.28
dmi.modalias: dmi:bvnLENOVO:bvrR23ET71W(1.47):bd09/14/2023:br1.47:efr1.28:svnLENOVO:pn21CF004PGE:pvrThinkPadT14Gen3:rvnLENOVO:rn21CF004PGE:rvrSDK0T76538WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_21CF_BU_Think_FM_ThinkPadT14Gen3:
dmi.product.family: ThinkPad T14 Gen 3
dmi.product.name: 21CF004PGE
dmi.product.sku: LENOVO_MT_21CF_BU_Think_FM_ThinkPad T14 Gen 3
dmi.product.version: ThinkPad T14 Gen 3
dmi.sys.vendor: LENOVO

Julian Andres Klode (juliank) wrote :

I saw similar messages in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2032386 but that only caused hangs and did not block the entire desktop (and Firefox is fine now, so I don't know).

Julian Andres Klode (juliank) wrote :

These ones I don't remember seeing before:

[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
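
The trailing number in a message like this is normally a negated Linux errno value, and 125 is ECANCELED ("Operation canceled") on Linux. A rough sketch for pulling such lines apart when going through the attached logs follows; the regular expression and the `parse_drm_error` helper are illustrative assumptions fitted to the single line quoted above, not a general parser for amdgpu messages.

```python
import errno
import re

# Matches lines shaped like the one quoted above:
#   [drm:<function> [<module>]] *ERROR* <message> -<number>!
DRM_ERROR = re.compile(
    r"\[drm:(?P<function>\w+) \[(?P<module>\w+)\]\] \*ERROR\* "
    r"(?P<message>.+?) -(?P<code>\d+)!"
)


def parse_drm_error(line):
    """Split a DRM error line into its parts, or return None if it does not match."""
    match = DRM_ERROR.search(line)
    if match is None:
        return None
    code = int(match["code"])
    return {
        "function": match["function"],
        "module": match["module"],
        "message": match["message"],
        "code": code,
        # The name lookup uses the errno table of the machine running this
        # script, so it is only meaningful on a Linux host.
        "errno_name": errno.errorcode.get(code, "unknown"),
    }
```

Lines that carry no trailing error number, such as ring timeout messages, would need their own patterns.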

Timo Aaltonen (tjaalton) wrote :

Please try a later mainline point release:

https://kernel.ubuntu.com/mainline/v6.5.13/

This is also the last one for 6.5.

Changed in linux (Ubuntu):
status: New → Incomplete
Julian Andres Klode (juliank) wrote :

Waiting for 6.5.13 to actually build, but I'm on 6.6.0-14 from proposed now, which is based on 6.6.3 and probably has all the fancy patches too.
