GPU hang on Alder Lake laptop

Bug #1998950 reported by brian m. carlson
This bug affects 4 people
Affects         Status         Importance   Assigned to   Milestone
Linux           New            Unknown
linux (Ubuntu)  Fix Released   Undecided    Unassigned

Bug Description

I see frequent GPU hangs (multiple times per day) on my Lenovo ThinkPad X1 Carbon Gen 10. When one occurs, part of the image tears away and X becomes unusable. Sometimes the cursor continues to move for a short time afterwards. To recover, I must SSH into the machine and run `sudo killall -9 Xorg`, which drops me back to the lightdm login screen, after which things work again.

I've also seen this on Debian sid on a nearly identical machine; there, upgrading to kernel 6.0 and Mesa 22.3 makes the problem disappear. However, those versions are not available in Kinetic.

I've tried both `i915.enable_psr=0` and `i915.enable_dc=0` as boot parameters, and neither has any effect. The problem has been occurring since I installed Ubuntu on this machine when I got it on November 17.
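
One way to confirm which `i915` options a given boot actually picked up is to parse the kernel command line. The following is a minimal sketch, not a definitive tool: the helper name `i915_params` is made up for this example, and the sample string is the `ProcKernelCmdLine` value recorded in this report.

```python
def i915_params(cmdline: str) -> dict[str, str]:
    """Collect i915.* options from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            key, value = token[len("i915."):].split("=", 1)
            # If an option is repeated, this sketch keeps the last value seen.
            params[key] = value
    return params


# Sample input: the ProcKernelCmdLine value from this report.
sample = (
    "BOOT_IMAGE=/vmlinuz-5.19.0-26-generic root=/dev/mapper/vgubuntu-root ro "
    "i915.enable_dc=0 quiet splash i915.enable_dc=0 vt.handoff=7"
)
print(i915_params(sample))  # → {'enable_dc': '0'}
```

On a live system the same string can be read from `/proc/cmdline`. For the value recorded in this report, the result contains only `enable_dc`.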

I believe the upstream bug report is this: https://gitlab.freedesktop.org/drm/intel/-/issues/6757.

ProblemType: Bug
DistroRelease: Ubuntu 22.10
Package: xorg 1:7.7+23ubuntu2
ProcVersionSignature: Ubuntu 5.19.0-26.27-generic 5.19.7
Uname: Linux 5.19.0-26-generic x86_64
ApportVersion: 2.23.1-0ubuntu3
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CasperMD5CheckResult: pass
CompositorRunning: None
CurrentDesktop: MATE
Date: Tue Dec 6 16:42:04 2022
DistUpgraded: Fresh install
DistroCodename: kinetic
DistroVariant: ubuntu
ExtraDebuggingInterest: Yes
GpuHangFrequency: Several times a day
GpuHangReproducibility: Seems to happen randomly
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c) (prog-if 00 [VGA controller])
   Subsystem: Lenovo Alder Lake-P Integrated Graphics Controller [17aa:22e7]
InstallationDate: Installed on 2022-11-17 (18 days ago)
InstallationMedia: Ubuntu 22.10 "Kinetic Kudu" - Release amd64 (20221020)
MachineType: LENOVO 21CBCTO1WW
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.19.0-26-generic root=/dev/mapper/vgubuntu-root ro i915.enable_dc=0 quiet splash i915.enable_dc=0 vt.handoff=7
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/02/2022
dmi.bios.release: 1.30
dmi.bios.vendor: LENOVO
dmi.bios.version: N3AET65W (1.30 )
dmi.board.asset.tag: Not Available
dmi.board.name: 21CBCTO1WW
dmi.board.vendor: LENOVO
dmi.board.version: SDK0T76461 WIN
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.14
dmi.modalias: dmi:bvnLENOVO:bvrN3AET65W(1.30):bd08/02/2022:br1.30:efr1.14:svnLENOVO:pn21CBCTO1WW:pvrThinkPadX1CarbonGen10:rvnLENOVO:rn21CBCTO1WW:rvrSDK0T76461WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_21CB_BU_Think_FM_ThinkPadX1CarbonGen10:
dmi.product.family: ThinkPad X1 Carbon Gen 10
dmi.product.name: 21CBCTO1WW
dmi.product.sku: LENOVO_MT_21CB_BU_Think_FM_ThinkPad X1 Carbon Gen 10
dmi.product.version: ThinkPad X1 Carbon Gen 10
dmi.sys.vendor: LENOVO
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.113-2
version.libgl1-mesa-dri: libgl1-mesa-dri 22.2.1-1ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 22.2.1-1ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:21.1.4-2ubuntu1.2
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:19.1.0-3
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20210115-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.17-2build1

brian m. carlson (bk2204) wrote :
Daniel van Vugt (vanvugt) wrote :

From the attached kernel log:

[10112.178526] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in Xorg [1998]
[10112.179245] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
[10112.282534] i915 0000:00:02.0: [drm] Xorg[1998] context reset due to GPU hang
[10112.282711] i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
[10112.282720] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
[10112.299590] i915 0000:00:02.0: [drm] HuC authenticated
[10112.299976] i915 0000:00:02.0: [drm] GuC submission enabled
[10112.299977] i915 0000:00:02.0: [drm] GuC SLPC enabled
[10124.168247] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6 timed out (hint:intel_atomic_commit_ready [i915])
[10124.168641] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6 timed out (hint:intel_atomic_commit_ready [i915])
[10135.176271] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6a timed out (hint:intel_atomic_commit_ready [i915])
[10135.176691] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6a timed out (hint:intel_atomic_commit_ready [i915])
[10145.928345] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6c timed out (hint:intel_atomic_commit_ready [i915])
[10145.928723] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:6c timed out (hint:intel_atomic_commit_ready [i915])
[10157.192386] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:74 timed out (hint:intel_atomic_commit_ready [i915])
[10157.192755] Asynchronous wait on fence 0000:00:02.0:Xorg[1998]:74 timed out (hint:intel_atomic_commit_ready [i915])

and

[12194.234423] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in Xorg [205503]
[12194.235106] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
[12194.338868] i915 0000:00:02.0: [drm] Xorg[205503] context reset due to GPU hang
[12194.339052] i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
[12194.339060] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
[12194.355616] i915 0000:00:02.0: [drm] HuC authenticated
[12194.356153] i915 0000:00:02.0: [drm] GuC submission enabled
[12194.356158] i915 0000:00:02.0: [drm] GuC SLPC enabled
[12203.157415] Asynchronous wait on fence 0000:00:02.0:Xorg[205503]:fa3e timed out (hint:intel_atomic_commit_ready [i915])
[12203.157795] Asynchronous wait on fence 0000:00:02.0:Xorg[205503]:fa3e timed out (hint:intel_atomic_commit_ready [i915])
[12214.374853] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa2c!
[12214.374867] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa2a!
[12214.374870] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa28!
[12214.374872] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa26!
[12214.374875] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa24!
[12214.374877] Fence expiration time out i915-0000:00:02.0:Xorg<205503>:fa22!
[12214.933492] Asynchronous wait on fence 0000:00:02.0:Xorg[205503]:c timed out (hint:intel_atomic_commit_ready [i915])
[12214.933892] Asynchronous wait on fence 0000:00:02.0:Xorg<205503>:fa3e timed out (hint:intel_atomic_commit_ready [i915])
[12214.934191] Asynchronous wait o...

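
Hang events like the ones quoted above can be pulled out of a saved kernel log mechanically. The sketch below assumes the `GPU HANG` line format shown in this excerpt (other kernel versions may word it differently); the names `HANG_RE` and `find_gpu_hangs` are invented for illustration.

```python
import re

# Pattern modelled on the "GPU HANG" lines quoted in this report (an assumption
# about the format, not a stable kernel interface).
HANG_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+i915 (?P<device>\S+): \[drm\] GPU HANG: "
    r"ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\]"
)


def find_gpu_hangs(log_text):
    """Return one dict per GPU HANG line found in the given log text."""
    hangs = []
    for line in log_text.splitlines():
        m = HANG_RE.search(line)
        if m:
            hangs.append({
                "time": float(m["time"]),   # seconds since boot
                "device": m["device"],      # PCI address of the GPU
                "ecode": m["ecode"],
                "process": m["process"],
                "pid": int(m["pid"]),
            })
    return hangs


# Three lines copied from the excerpt above; only two are hang reports.
sample_log = """\
[10112.178526] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in Xorg [1998]
[10112.179245] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
[12194.234423] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:859ffffb, in Xorg [205503]
"""
hangs = find_gpu_hangs(sample_log)
for hang in hangs:
    print(hang["time"], hang["ecode"], hang["process"], hang["pid"])
```

Both hangs in the excerpt carry the same ecode, `12:1:859ffffb`, and their timestamps are roughly 2082 seconds (about 35 minutes) apart.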

affects: xorg (Ubuntu) → linux (Ubuntu)
Changed in linux:
status: Unknown → New
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in mesa (Ubuntu):
status: New → Confirmed
brian m. carlson (bk2204) wrote :

Since there appear to be upstream patches in kernel 6.0 and Mesa 22.3.1 that fix this, perhaps it would be possible to backport them to the versions in Ubuntu 22.10? If not, would it at least be possible to upload such packages to Lunar, so that there's an option that doesn't involve reinstalling the machine to go back to 22.04?

Right now my machine is freezing multiple times per day and it would be really valuable to get those backports applied. I'm pretty sure this affects all Alder Lake-P GPUs on Kinetic, since the upstream bug report mentions multiple different hardware variants.

Daniel van Vugt (vanvugt) wrote :

If anyone is using jammy and experiencing this, then maybe consider:
https://launchpad.net/ubuntu/+source/linux-signed-oem-6.0

Otherwise the unsupported options:
https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=M;O=D

brian m. carlson (bk2204) wrote :

I don't see this happening anymore as of the release of linux-image-5.19.0-28-generic. Neither the -28 nor the -29 kernel seems to trigger it, so I suspect this can probably be closed.

Daniel van Vugt (vanvugt) wrote :

Thanks. I did have a configuration in which I could reproduce this recently, so I or someone else should confirm that it's really fixed.

I wasn't too worried about this bug though because:

  * It never happens on 22.04
  * 23.04 will get kernels 6.1 and 6.2 in the coming months

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in mesa (Ubuntu):
status: Confirmed → Incomplete
Daniel van Vugt (vanvugt) wrote :

Hmm perhaps bug 2001914 needs revisiting though (22.04)

Daniel van Vugt (vanvugt) wrote :

Yeah, I can't reproduce this bug anymore on kinetic with kernel 5.19.0-29-generic, so this is closed and bug 2001914 is tracked separately.
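
For anyone checking whether a kinetic machine already runs a kernel at or beyond the first release reported as unaffected in this thread, a version comparison along these lines may help. It is a rough sketch under stated assumptions: the threshold `5.19.0-28` comes only from the comments above, the helper names are hypothetical, and the comparison is a plain numeric one that says nothing reliable about other kernel series or flavours.

```python
import re


def kernel_abi(release):
    """Split an Ubuntu kernel release such as '5.19.0-26-generic' into integers."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in m.groups())


# Assumption taken from this thread: the hang was no longer seen from 5.19.0-28.
FIRST_GOOD = (5, 19, 0, 28)


def at_least_first_good(release):
    """True if the release compares numerically >= the assumed threshold."""
    return kernel_abi(release) >= FIRST_GOOD


for release in ("5.19.0-26-generic", "5.19.0-28-generic", "5.19.0-29-generic"):
    print(release, at_least_first_good(release))
```

On a running system, the release string could be taken from `uname -r` or from Python's `platform.release()`.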

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
no longer affects: mesa (Ubuntu)