Comment 44 for bug 946899

Otto Kekäläinen (otto) wrote : Re: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung

I've been running Ubuntu 12.04 on my home computer for over a year without problems, but since I ran the update installer about a week ago, the graphical system has been crashing randomly as described in this bug report.

Symptoms (appear randomly, only one at a time):
- System freezes completely during late stages of startup. Reboot recovers.
- System might start and everything works, but Apport dialogs appear one after another. Even if you complete the Apport report, new dialogs about the exact same Intel GPU problem reappear. Reboot recovers from the insane Apport loop.
- During use, window decorations disappear. Window contents partially respond to the mouse, but as the window manager seems dead, the system must be rebooted to recover.

In the kernel log and syslog I've found lines like these:
Mar 31 12:04:39 htpc kernel: [ 9026.908141] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 31 12:04:39 htpc kernel: [ 9026.908150] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Mar 31 12:04:39 htpc kernel: [ 9026.910800] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 2211040 at 2211035, next 2211041)
Mar 31 12:04:39 htpc kernel: [ 9026.916260] HDMI hot plug event: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=1
Mar 31 12:04:39 htpc kernel: [ 9026.916309] HDMI status: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=0
Mar 31 12:04:40 htpc kernel: [ 9027.115958] atl1c 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.

The HDMI hot plug event is part of the bug. All cables are connected at all times and the driver should not get any hotplug event.
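
To check how often a spurious hotplug event follows a GPU hang, the logs could be scanned with something like the sketch below. It assumes syslog-style lines in the same format as the excerpt above. The function name find_hangs_with_hotplug and the 1-second window are my own illustrative choices, not part of any existing tool.

```python
import re

# Captures the kernel uptime timestamp, e.g. "[ 9026.908141]", and the message after it.
# Assumes lines look like the syslog excerpt above ("... kernel: [ <secs>] <message>").
TS_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(.*)")


def find_hangs_with_hotplug(lines, window=1.0):
    """Pair each GPU hang with the HDMI hotplug events seen shortly after it.

    Returns a list of (hang_timestamp, [hotplug_timestamps]) tuples, where a
    hotplug event counts if it occurs within `window` seconds after the hang.
    The window size is an arbitrary assumption.
    """
    hangs = []
    hotplugs = []
    for line in lines:
        match = TS_RE.search(line)
        if not match:
            continue  # not a kernel line in the expected format
        timestamp = float(match.group(1))
        message = match.group(2)
        if "Hangcheck timer elapsed" in message:
            hangs.append(timestamp)
        elif message.startswith("HDMI hot plug event"):
            hotplugs.append(timestamp)
    return [
        (hang, [hp for hp in hotplugs if 0 <= hp - hang <= window])
        for hang in hangs
    ]
```

Feeding it the lines of /var/log/syslog (for example via open("/var/log/syslog")) should list each hang together with any hotplug event logged right after it. A hang with an empty list would suggest the hotplug event is not always tied to the hang.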

The main problem is the GPU hang.

Kernel:
Linux htpc 3.2.0-39-generic #62-Ubuntu SMP Thu Feb 28 00:28:53 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Graphics card:
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
 Subsystem: ASUSTeK Computer Inc. Device 844d
 Flags: bus master, fast devsel, latency 0, IRQ 53
 Memory at fb400000 (64-bit, non-prefetchable) [size=4M]
 Memory at d0000000 (64-bit, prefetchable) [size=256M]
 I/O ports at f000 [size=64]
 Expansion ROM at <unassigned> [disabled]
 Capabilities: <access denied>
 Kernel driver in use: i915
 Kernel modules: i915

It is interesting that this bug seems to have appeared in the 3.2, 3.5 and 3.8 kernels alike. Maybe it is related to some security fix applied recently? Or perhaps the root cause is not in the kernel package, but in some compiz or X update?