[i915gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x7f9c002d)

Bug #720468 reported by Eakan Gopalakrishnan on 2011-02-17
This bug affects 39 people
Affects: xserver-xorg-video-intel (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

The system turns on, and when I am about to log on, the whole system freezes. Only my mouse pointer moves; everything else is stuck. I don't know what is wrong, so for now I boot into recovery mode and run in low-graphics mode.

[ 50.552156] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 50.554344] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 2197 at 2195, next 2198)

ACTHD: 0xffffffff
EIR: 0x00000000
EMR: 0xffffffed
ESR: 0x00000001
PGTBL_ER: 0x00000000
IPEHR: 0x7f9c002d
IPEIR: 0x00000000
INSTDONE: 0x0307f8c1
    busy: IDCT
    busy: IQ
    busy: PR
    busy: VLD
    busy: Instruction parser
    busy: Strips and fans
    busy: Setup engine
    busy: Windowizer
    busy: Intermediate Z
    busy: Perspective interpolation
    busy: Bypass FIFO
    busy: Pixel shader
    busy: Color calculator
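
For anyone comparing this report with others, the numbers in the second kernel message above can be extracted mechanically. The following is a minimal Python sketch, assuming the message format shown in this report. The function name `parse_wait_request` and the `behind` field are hypothetical, and reading the three numbers as request counters (the one being waited for, the one reached, and the next to be issued) is an interpretation, not something stated in the report.

```python
import re

# Matches e.g. "i915_do_wait_request returns -11 (awaiting 2197 at 2195, next 2198)"
WAIT_RE = re.compile(
    r"i915_do_wait_request returns (-?\d+) "
    r"\(awaiting (\d+) at (\d+), next (\d+)\)"
)


def parse_wait_request(line):
    """Return the numeric fields of a wait-request error line, or None."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    ret, awaiting, at, nxt = (int(g) for g in m.groups())
    return {
        "ret": ret,            # return code printed by the driver
        "awaiting": awaiting,  # value the driver was waiting for
        "at": at,              # value reached when the hang was detected
        "next": nxt,           # next value the driver would hand out
        "behind": awaiting - at,  # hypothetical helper: how far short it stopped
    }
```

Applied to the line in this description, the sketch reports that the GPU stopped 2 short of the awaited value.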

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-1ubuntu9
ProcVersionSignature: Ubuntu 2.6.38-2.29-generic 2.6.38-rc3
Uname: Linux 2.6.38-2-generic i686
Architecture: i386
Chipset: i915gm
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1024x768
 edid-base64: AP///////wAGr1EkAAAAAAEPAQOAHhd4Cof1lFdPjCcnUFQAAAABAQEBAQEBAQEBAQEBAQEBZBkAQEEAJjAYiDYAMOQQAAAYAAAADwAAAAAAAAAAAAAAAAAgAAAA/gBBVU8KICAgICAgICAgAAAA/gBCMTUwWEcwMiBWNCAKAAA=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: On
 modes:
 edid-base64:
Date: Wed Feb 16 23:49:25 2011
DistUpgraded: Yes, recently upgraded Log time: 2011-02-08 07:26:54.874339
DistroCodename: natty
DistroVariant: ubuntu
DumpSignature: 554e2b4b (ESR: 0x00000001 IPEHR: 0x7f9c002d)
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
GraphicsCard:
 Subsystem: Acer Incorporated [ALI] Device [1025:008f]
   Subsystem: Acer Incorporated [ALI] Device [1025:008f]
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
InterpreterPath: /usr/bin/python2.7
Lsusb:
 Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Acer, inc. TravelMate 4060
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-2-generic root=UUID=966ce8dc-e914-408c-bb96-a71a2e49559f ro quiet splash vt.handoff=7
ProcKernelCmdLine_: BOOT_IMAGE=/boot/vmlinuz-2.6.38-2-generic root=UUID=966ce8dc-e914-408c-bb96-a71a2e49559f ro single
RelatedPackageVersions:
 xserver-xorg 1:7.6~3ubuntu4
 libdrm2 2.4.23-1ubuntu3
 xserver-xorg-video-intel 2:2.14.0-1ubuntu9
SourcePackage: xserver-xorg-video-intel
Title: [i915gm] GPU lockup 554e2b4b (ESR: 0x00000001 IPEHR: 0x7f9c002d)
UserGroups:

dmi.bios.date: 09/03/05
dmi.bios.vendor: Acer
dmi.bios.version: 3A05
dmi.board.name: LuganoII
dmi.board.vendor: Acer, Inc.
dmi.board.version: Not Applicable
dmi.chassis.type: 1
dmi.chassis.vendor: Acer, Inc.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnAcer:bvr3A05:bd09/03/05:svnAcer,inc.:pnTravelMate4060:pvrNotApplicable:rvnAcer,Inc.:rnLuganoII:rvrNotApplicable:cvnAcer,Inc.:ct1:cvrN/A:
dmi.product.name: TravelMate 4060
dmi.product.version: Not Applicable
dmi.sys.vendor: Acer, inc.
version.compiz: compiz 1:0.9.2.1+glibmainloop4-0ubuntu11
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10-1ubuntu3
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu4
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.13.2+git20110124.fadee040-0ubuntu4
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-1ubuntu9
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu4

Eakan Gopalakrishnan (eakangk) wrote :
Bryce Harrington (bryce) on 2011-02-17
description: updated

This also happens to me on the Asus Eee PC 701, although it takes longer. The system freezes, except for the mouse pointer, after about 10 minutes of browsing with Chrome:

Feb 18 09:29:56 eee kernel: [ 681.040034] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Feb 18 09:29:56 eee kernel: [ 681.044561] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 79290 at 79288, next 79291)

Bryce Harrington (bryce) wrote :

Here is a kernel with a DEBUG patch added (note this is not a fix, just a mechanism to confirm the root cause). Could those of you who are affected please test these kernels (they should work on Maverick too) and report back here? The kernels are at the URL below:

    http://people.canonical.com/~apw/lp714719-natty/

After booting this, please do whatever you do to reproduce the freeze (if you can), then ssh into the box and collect the output of 'dmesg > dmesg.txt' and post it here. If you can't reproduce the crash within say a day or so, report that back here too.

If it happens that you haven't seen a freeze at all for a few days (esp. if you haven't seen it after doing a system update), then it may have been solved via a mystery kernel fix. In that case please note so here and the bug report can be set to 'Fix Released'.
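
When posting the dmesg output requested above, it can help to pick out the DRM error lines first so they are easy to spot. The following is a minimal Python sketch, assuming the output was saved as `dmesg.txt` and that the lines of interest carry the `[drm:` and `*ERROR*` markers seen elsewhere in this report. The function name `drm_error_lines` is hypothetical. The full file is still worth attaching, since the debug kernel may log other messages.

```python
def drm_error_lines(dmesg_text):
    """Return the lines of a dmesg dump that look like DRM driver errors."""
    return [
        line
        for line in dmesg_text.splitlines()
        if "[drm:" in line and "*ERROR*" in line
    ]


# Example use, assuming dmesg.txt exists in the current directory:
#
#     with open("dmesg.txt") as f:
#         for line in drm_error_lines(f.read()):
#             print(line)
```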

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete

Your debug kernel 2.6.38-3-generic #30lp714719v201102101355 did freeze once before I had ssh set up. It did not freeze again but dmesg indicates a hang:

[ 3023.928039] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 3023.930371] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 370145 at 370143, next 370146)
[ 3023.930571] [drm:i915_reset] *ERROR* Failed to reset chip.

There doesn't seem to be a lot of extra debug info in dmesg beyond what is in /var/log/kern.log?

When I run 2.6.38-4 the system freezes within a few minutes and I get a similar dmesg:

[ 1312.444041] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 1312.446392] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 108991 at 108989, next 108992)
[ 1312.446597] [drm:i915_reset] *ERROR* Failed to reset chip.

Bryce Harrington (bryce) on 2011-02-19
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Confirmed
Bryce Harrington (bryce) wrote :

Here is word from upstream on a different (but perhaps related) bug report:

"Bryce, for all the 915/945 bugs can you please have the reporters test the
latest kernel with the enlarged unfenced alignment. That's the most likely
cause of random writes, though I don't suspect it in this case."

summary: - [i915gm] GPU lockup 554e2b4b (ESR: 0x00000001 IPEHR: 0x7f9c002d)
+ [i915gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x7f9c002d)
Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
Bryce Harrington (bryce) wrote :

Please test with the following kernel:

  http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/

Chris says he has some fixes for similar issues in that version, and wants to rule those fixes out before investigating this lockup bug further.


I still get the GPU lockup error.


PMPope (pmpope) wrote :

I haven't had much chance to test the machine I applied the patch to. It
booted fine directly after applying the patch, and I haven't booted it since
then. I just wanted to send a response since you were willing to help so
quickly. If it crashes within the next three boots (a week to a week and a
half), I will follow the steps and resubmit the crash data.

Thanks for your help


The kernel 2.6.38-999-generic #201102231436 froze after half an hour of browsing on the Eee PC. I rebooted it and, within a minute, apport wanted to submit a GPU-related bug. However, the entries in dmesg/kern.log about the GPU hanging or resetting seem to be missing with the 999 kernel.

Bryce Harrington (bryce) on 2011-03-19
Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Confirmed
tikilou (tikilou) on 2011-03-20
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → Fix Released
Philip Taylor (scraliontis) wrote :

I still got a GPU lockup, even after that fix. The log from dmesg is below.

Mar 20 16:00:05 ptaylor-eeepc-laptop kernel: [15801.504038] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 20 16:00:05 ptaylor-eeepc-laptop kernel: [15801.505494] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 2621922 at 2621919, next 2621923)
Mar 20 16:00:05 ptaylor-eeepc-laptop kernel: [15801.505810] [drm:i915_reset] *ERROR* Failed to reset chip.
Mar 20 16:00:28 ptaylor-eeepc-laptop kernel: [15824.556441] mutter[1412]: segfault at 20c ip b6368e96 sp bfcc2990 error 4 in i915_dri.so[b6348000+64000]
Mar 20 16:00:40 ptaylor-eeepc-laptop kernel: [15836.211600] mutter[26870]: segfault at 20c ip b6359e96 sp bf8d2e60 error 4 in i915_dri.so[b6339000+64000]
Mar 20 16:00:49 ptaylor-eeepc-laptop kernel: [15845.140570] mutter[27197]: segfault at 20c ip b63e6e96 sp bfe9f310 error 4 in i915_dri.so[b63c6000+64000]
Mar 20 16:01:03 ptaylor-eeepc-laptop kernel: [15859.454364] mutter[27443]: segfault at 20c ip b64f7e96 sp bfb0a040 error 4 in i915_dri.so[b64d7000+64000]
Mar 20 16:01:12 ptaylor-eeepc-laptop kernel: [15869.090295] mutter[27886]: segfault at 20c ip b63d1e96 sp bf982dc0 error 4 in i915_dri.so[b63b1000+64000]
Mar 20 16:01:21 ptaylor-eeepc-laptop kernel: [15877.936871] mutter[28213]: segfault at 20c ip b63bde96 sp bf813b90 error 4 in i915_dri.so[b639d000+64000]
Mar 20 16:01:31 ptaylor-eeepc-laptop kernel: [15887.203279] mutter[28490]: segfault at 20c ip b641be96 sp bfd7cfb0 error 4 in i915_dri.so[b63fb000+64000]
Mar 20 16:01:42 ptaylor-eeepc-laptop kernel: [15898.719584] mutter[28822]: segfault at 20c ip b6417e96 sp bfc956d0 error 4 in i915_dri.so[b63f7000+64000]
Mar 20 16:01:51 ptaylor-eeepc-laptop kernel: [15907.619998] mutter[29150]: segfault at 20c ip b63aee96 sp bf993360 error 4 in i915_dri.so[b638e000+64000]
Mar 20 16:02:00 ptaylor-eeepc-laptop kernel: [15916.477114] mutter[29395]: segfault at 20c ip b63a4e96 sp bfc70da0 error 4 in i915_dri.so[b6384000+64000]
Mar 20 16:02:10 ptaylor-eeepc-laptop kernel: [15926.159103] mutter[29690]: segfault at 20c ip b63d7e96 sp bf992010 error 4 in i915_dri.so[b63b7000+64000]
Mar 20 16:02:19 ptaylor-eeepc-laptop kernel: [15935.145388] mutter[29993]: segfault at 20c ip b6494e96 sp bf998a20 error 4 in i915_dri.so[b6474000+64000]
Mar 20 16:02:28 ptaylor-eeepc-laptop kernel: [15944.181309] mutter[30245]: segfault at 20c ip b644ee96 sp bfbe5100 error 4 in i915_dri.so[b642e000+64000]
Mar 20 16:02:37 ptaylor-eeepc-laptop kernel: [15953.491747] mutter[30535]: segfault at 20c ip b6450e96 sp bfd8f0d0 error 4 in i915_dri.so[b6430000+64000]
Mar 20 16:02:50 ptaylor-eeepc-laptop kernel: [15967.052734] mutter[30859]: segfault at 20c ip b63c0e96 sp bfd97810 error 4 in i915_dri.so[b63a0000+64000]
Mar 20 16:03:02 ptaylor-eeepc-laptop kernel: [15978.710731] mutter[31252]: segfault at 20c ip b63cae96 sp bfd94590 error 4 in i915_dri.so[b63aa000+64000]
Mar 20 16:03:13 ptaylor-eeepc-laptop kernel: [15990.117122] mutter[31804]: segfault at 20c ip b6456e96 sp bfb12940 error 4 in i915_dri.so[b6436000+64000]
Mar 20 16:03:27 ptaylor-eeepc-laptop kernel: [16003.914579] mutter[32258]: segfault at 20c ip b64f0e96 ...


Bryce Harrington (bryce) wrote :

Philip, your log errors are about mutter crashing, not about a GPU lockup. Seems you have some unrelated bug.
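
The log pasted above contains both kinds of line: a few `[drm:i915_...] *ERROR*` messages followed by many `segfault` messages. A quick tally makes that mix visible. The following is a minimal Python sketch, assuming the kernel log format shown in this thread. The function name `classify` and the category labels are hypothetical, and the tally only counts lines; it says nothing about which problem caused the other.

```python
import re
from collections import Counter

# Kernel DRM error, e.g. "[drm:i915_hangcheck_elapsed] *ERROR* ..."
DRM_ERROR_RE = re.compile(r"\[drm:i915_\w+\] \*ERROR\*")
# Userspace crash, e.g. "mutter[1412]: segfault at 20c ... in i915_dri.so[...]"
SEGFAULT_RE = re.compile(r"(\w[\w-]*)\[\d+\]: segfault at .* in ([^\[\s]+)\[")


def classify(lines):
    """Count DRM error lines and segfault lines (grouped by process and library)."""
    counts = Counter()
    for line in lines:
        if DRM_ERROR_RE.search(line):
            counts["drm_error"] += 1
            continue
        m = SEGFAULT_RE.search(line)
        if m:
            counts["segfault: %s in %s" % (m.group(1), m.group(2))] += 1
    return counts
```

Feeding it the lines of a saved kern.log excerpt returns a `Counter` whose keys show at a glance whether the excerpt is dominated by driver errors or by a crashing client.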

Philip Taylor (scraliontis) wrote :

@Bryce, I have moved it to bug 715096. I talked to your colleague tjaalton in #ubuntu-x last night, and he suggested it fits better there, as it is a GPU hung error.
I made a mistake, and I am sorry.
Philip
