[i945gm] GPU lockup d0c69edf (PGTBL_ER: 0x00000001)

Bug #611951 reported by Benny Hult
This bug affects 6 people
Affects                            Status        Importance  Assigned to  Milestone
xf86-video-intel                   Fix Released  Medium
xserver-xorg-video-intel (Ubuntu)  Fix Released  Low         Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

Running Lubuntu 10.10, I resumed my Dell Latitude D420 from suspend and was notified about a crash.
This does not happen every time.

---
Time: 1280551135 s 109870 us
PCI ID: 0x27a2
EIR: 0x00000010
  PGTBL_ER: 0x00000001
    Host Invalid GTT PTE
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0x7fffffc0
  ACTHD: 0x00000000
---

ProblemType: Crash
DistroRelease: Ubuntu 10.10
Package: xserver-xorg-video-intel 2:2.12.0-1ubuntu2
ProcVersionSignature: Ubuntu 2.6.35-12.17-generic 2.6.35-rc6
Uname: Linux 2.6.35-12-generic i686
Architecture: i386
Chipset: i945gm
DRM.card0.DVI.D.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1280x800
 edid-base64: AP///////wAwZOJYMTQwNCsQAQOAGhB4CoeumVdPjCYiUFQAAAABAQEBAQEBAQEBAQEBAQEBKhwAqFAgHjAQMCIABaMQAAAYAAAAAAAAAAAAAAAAAAAAAAAAAAAA/gBHRjk1MaUxMjFFWEVEAAAA/gAnNkJKaYmp6wEBCiAgAOg=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
Date: Sat Jul 31 07:38:58 2010
DkmsStatus: Error: [Errno 2] No such file or directory
DumpSignature: d0c69edf
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
GdmLog: Error: command ['cat', '/var/log/gdm/:0.log'] failed with exit code 1: cat: /var/log/gdm/:0.log: No such file or directory
GdmLog1: Error: command ['cat', '/var/log/gdm/:0.log.1'] failed with exit code 1: cat: /var/log/gdm/:0.log.1: No such file or directory
GdmLog2: Error: command ['cat', '/var/log/gdm/:0.log.2'] failed with exit code 1: cat: /var/log/gdm/:0.log.2: No such file or directory
InstallationMedia: Lubuntu 10.10 "Maverick Meerkat" - i386 (20100702)
IntelGpuDump: Error: [Errno 2] No such file or directory
InterpreterPath: /usr/bin/python2.6
MachineType: Dell Inc. Latitude D420
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-12-generic root=UUID=1cf4549c-fcec-494d-8878-a1f00cc09b3f ro quiet splash
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

SourcePackage: xserver-xorg-video-intel
Title: [i945gm] GPU lockup d0c69edf
UserGroups:

dmi.bios.date: 02/02/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A06
dmi.board.name: 0TJ984
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA06:bd02/02/2008:svnDellInc.:pnLatitudeD420:pvr:rvnDellInc.:rn0TJ984:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude D420
dmi.sys.vendor: Dell Inc.
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 codename: maverick
 architecture: i686
 kernel: 2.6.35-12-generic

Benny Hult (bioterror) wrote :

Seems like it might happen after every resume from suspend.

Geir Ove Myhr (gomyhr)
tags: added: 945gm
Geir Ove Myhr (gomyhr) wrote :

Running intel_error_decode on the attached i915_error_state.txt shows that there is a page table error and that the ringbuffer is completely empty:

Time: 1280551135 s 109870 us
PCI ID: 0x27a2
EIR: 0x00000010
  PGTBL_ER: 0x00000001
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0x7fffffc0
    busy: Secondary ring 3
    busy: Secondary ring 2
    busy: Secondary ring 1
    busy: Secondary ring 0
    busy: Primary ring 1
    busy: Primary ring 0
  ACTHD: 0x00000000
seqno: 0x00000000
ringbuffer at 0x0d4b8000:
0x0d4b8000: 0x00000000: MI_NOOP
0x0d4b8004: 0x00000000: MI_NOOP
0x0d4b8008: 0x00000000: MI_NOOP
....
0x0d4d7ff4: 0x00000000: MI_NOOP
0x0d4d7ff8: 0x00000000: MI_NOOP
0x0d4d7ffc: 0x00000000: MI_NOOP
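
The decoded state above can be read with a small sketch. It assumes only what this report itself shows: the kernel log labels EIR 0x00000010 as "page table error", and the apport summary labels PGTBL_ER 0x00000001 as "Host Invalid GTT PTE". Every other bit is deliberately left undecoded, and the function names are illustrative, not part of intel_error_decode.

```python
# Bit meanings taken only from this bug report; all other bits are left undecoded.
EIR_BITS = {0x00000010: "page table error"}
PGTBL_ER_BITS = {0x00000001: "Host Invalid GTT PTE"}


def decode(value, names):
    """Return the known flag names set in value, plus any unknown bits as hex."""
    found = [name for mask, name in names.items() if value & mask]
    known = 0
    for mask in names:
        known |= mask
    unknown = value & ~known
    if unknown:
        found.append("unknown bits 0x%08x" % unknown)
    return found


def ring_size(first_addr, last_addr, dword=4):
    """Size in bytes of a ring dumped from first_addr to last_addr inclusive."""
    return last_addr - first_addr + dword


print(decode(0x00000010, EIR_BITS))
print(decode(0x00000001, PGTBL_ER_BITS))
print(hex(ring_size(0x0d4b8000, 0x0d4d7ffc)))
```

The last call gives the span of the dumped ringbuffer, 0x20000 bytes (128 KiB), based on the first and last addresses listed in the dump.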

summary: - [i945gm] GPU lockup d0c69edf
+ [i945gm] GPU lockup d0c69edf (PGTBL_ER: 0x00000001)
Geir Ove Myhr (gomyhr)
tags: added: resume suspend
In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

Originally reported for Ubuntu by Benny Hult at:
  https://bugs.launchpad.net/bugs/611951

Binary package hint: xserver-xorg-video-intel

Running Lubuntu 10.10, I resumed my Dell Latitude D420 from suspend and was notified about a GPU error.

Seems like it might happen after every resume from suspend.

DistroRelease: Ubuntu 10.10
Package: xserver-xorg-video-intel 2:2.12.0-1ubuntu2
xserver-xorg 1:7.5+6ubuntu2
libgl1-mesa-glx 7.8.2-2ubuntu1
libdrm2 2.4.21-1ubuntu1
xserver-xorg-video-intel 2:2.12.0-1ubuntu2
ProcVersionSignature: Ubuntu 2.6.35-12.17-generic 2.6.35-rc6
Uname: Linux 2.6.35-12-generic i686
Architecture: i386
Chipset: i945gm
DRM.card0.DVI.D.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1280x800
 edid-base64: AP///////wAwZOJYMTQwNCsQAQOAGhB4CoeumVdPjCYiUFQAAAABAQEBAQEBAQEBAQEBAQEBKhwAqFAgHjAQMCIABaMQAAAYAAAAAAAAAAAAAAAAAAAAAAAAAAAA/gBHRjk1MaUxMjFFWEVEAAAA/gAnNkJKaYmp6wEBCiAgAOg=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
Date: Sat Jul 31 07:38:58 2010
DumpSignature: d0c69edf
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
InstallationMedia: Lubuntu 10.10 "Maverick Meerkat" - i386 (20100702)
InterpreterPath: /usr/bin/python2.6
MachineType: Dell Inc. Latitude D420
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-12-generic root=UUID=1cf4549c-fcec-494d-8878-a1f00cc09b3f ro quiet splash
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

SourcePackage: xserver-xorg-video-intel
UserGroups:

dmi.bios.date: 02/02/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A06
dmi.board.name: 0TJ984
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA06:bd02/02/2008:svnDellInc.:pnLatitudeD420:pvr:rvnDellInc.:rn0TJ984:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude D420
dmi.sys.vendor: Dell Inc.
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 codename: maverick
 architecture: i686
 kernel: 2.6.35-12-generic

In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

Created an attachment (id=37505)
i915_error_state

Running intel_error_decode on the attached i915_error_state.txt shows that there is a page table error and that the ringbuffer is completely empty:

Time: 1280551135 s 109870 us
PCI ID: 0x27a2
EIR: 0x00000010
  PGTBL_ER: 0x00000001
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0x7fffffc0
    busy: Secondary ring 3
    busy: Secondary ring 2
    busy: Secondary ring 1
    busy: Secondary ring 0
    busy: Primary ring 1
    busy: Primary ring 0
  ACTHD: 0x00000000
seqno: 0x00000000
ringbuffer at 0x0d4b8000:
0x0d4b8000: 0x00000000: MI_NOOP
0x0d4b8004: 0x00000000: MI_NOOP
0x0d4b8008: 0x00000000: MI_NOOP
....
0x0d4d7ff4: 0x00000000: MI_NOOP
0x0d4d7ff8: 0x00000000: MI_NOOP
0x0d4d7ffc: 0x00000000: MI_NOOP

In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

Created an attachment (id=37506)
dmesg from boot (GPU error not shown here)

In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

Created an attachment (id=37507)
dmesg showing GPU error on suspend

In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

Created an attachment (id=37508)
Xorg.0.log

Geir Ove Myhr (gomyhr) wrote :

I have forwarded this bug to the Intel developers at https://bugs.freedesktop.org/show_bug.cgi?id=29345 . Please register at bugs.freedesktop.org, add yourself to the CC: field of that bug report, and respond to any requests made by the Intel developers. If you need help with fulfilling any of these requests, you may ask for help here.

There are a couple of things you may do to enhance the upstream bug report:

* Verify that the problem exists with the very latest and greatest xorg/libdrm etc. You can do this by using the xorg-edgers PPA at https://launchpad.net/~xorg-edgers/+archive/ppa .
* Verify that the problem exists with the very latest and greatest kernel. The current Maverick kernel is pretty up-to-date, but it is comforting for the developers to know that the problem exists in a non-distribution-specific kernel. You can test both the latest drm-intel-next kernel and the latest 2.6.35-rcX (now rc6, and the next is probably the 2.6.35 release itself). Look at the mainline builds page at https://wiki.ubuntu.com/Kernel/MainlineBuilds .
* Use the drm.debug=0x02 kernel parameter to get more verbose dmesg output. It may include more details about what went wrong during suspend. Attach the resulting dmesg output from before and after suspend.
* Grab the output of `sudo intel_reg_dumper` before suspend and after resume. There may be important differences there (but probably not).
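
Before collecting logs, it can help to confirm that the drm.debug parameter from the list above actually reached the kernel. The following is a minimal sketch that looks for it on a kernel command line; the sample strings are based on the ProcCmdLine field of this report, and the helper names are hypothetical.

```python
def drm_debug_value(cmdline):
    """Return the drm.debug value from a kernel command line as an int, or None."""
    for token in cmdline.split():
        if token.startswith("drm.debug="):
            value = token.split("=", 1)[1]
            try:
                return int(value, 0)   # base 0 accepts forms such as 0x02 and 2
            except ValueError:
                return int(value, 10)  # plain decimal with leading zeros, e.g. 02
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()


# Command line as reported in this bug (no drm.debug set), then with it added.
reported = "BOOT_IMAGE=/boot/vmlinuz-2.6.35-12-generic root=UUID=1cf4549c-fcec-494d-8878-a1f00cc09b3f ro quiet splash"
print(drm_debug_value(reported))
print(drm_debug_value(reported + " drm.debug=0x02"))
```

On a running system, `drm_debug_value(read_cmdline())` performs the same check against the live command line.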

Instead of relying on apport to tell you that there is a GPU error, you can see it in the dmesg output:

[17706.150645] render error detected, EIR: 0x00000010
[17706.150645] page table error
[17706.150645] PGTBL_ER: 0x00000001
[17706.150645] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
[17706.150703] render error detected, EIR: 0x00000010
[17706.150703] page table error
[17706.150703] PGTBL_ER: 0x00000001

Actually, it would be nice if you could let the computer be suspended for a few minutes, so that it is easier to distinguish what happens on suspend from what happens on the following resume.
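
As a rough illustration of spotting these errors without apport, the sketch below scans dmesg text for the "render error detected" lines in the format quoted above and pairs each with the PGTBL_ER value that follows it. The regular expressions are written against that excerpt only and are an assumption about the message format on other kernels; the function name is hypothetical.

```python
import re

LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
EIR = re.compile(r"render error detected, EIR: (0x[0-9a-fA-F]+)")
PGTBL = re.compile(r"PGTBL_ER: (0x[0-9a-fA-F]+)")

# Excerpt copied from the dmesg output quoted in this report.
SAMPLE = """\
[17706.150645] render error detected, EIR: 0x00000010
[17706.150645] page table error
[17706.150645] PGTBL_ER: 0x00000001
[17706.150645] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
[17706.150703] render error detected, EIR: 0x00000010
[17706.150703] page table error
[17706.150703] PGTBL_ER: 0x00000001
"""


def find_render_errors(dmesg_text):
    """Return one dict (time, eir, pgtbl_er) per 'render error detected' line."""
    errors = []
    for raw in dmesg_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        stamp, msg = float(m.group(1)), m.group(2)
        eir = EIR.search(msg)
        if eir:
            errors.append({"time": stamp, "eir": int(eir.group(1), 16), "pgtbl_er": None})
            continue
        pg = PGTBL.search(msg)
        # Attach the first PGTBL_ER line that follows the most recent error.
        if pg and errors and errors[-1]["pgtbl_er"] is None:
            errors[-1]["pgtbl_er"] = int(pg.group(1), 16)
    return errors


for err in find_render_errors(SAMPLE):
    print("%.6f EIR=0x%08x PGTBL_ER=0x%08x" % (err["time"], err["eir"], err["pgtbl_er"]))
```

Comparing the timestamps of the reported errors with the suspend and resume messages in the same log is what makes a suspend of a few minutes useful.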

Geir Ove Myhr (gomyhr)
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Triaged
importance: Undecided → Low
In the upstream bug report, Chris Wilson (ickle) wrote :

Page table error on resume, presuming fixed with the following commit unless the bug can be verified on the current kernel.

commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27
Author: Chris Wilson <email address hidden>
Date: Thu May 27 13:18:18 2010 +0100

    drm/i915: Rebind bo if currently bound with incorrect alignment.

    Whilst pinning the buffer, check that that its current alignment
    matches the requested alignment. If it does not, rebind.

    This should clear up any final render errors whilst resuming,
    for reference:

      Bug 27070 - [i915] Page table errors with empty ringbuffer
      https://bugs.freedesktop.org/show_bug.cgi?id=27070

      Bug 15502 - render error detected, EIR: 0x00000010
      https://bugzilla.kernel.org/show_bug.cgi?id=15502

      Bug 13844 - i915 error: "render error detected"
      https://bugzilla.kernel.org/show_bug.cgi?id=13844

In the upstream bug report, Geir Ove Myhr (gomyhr) wrote :

(In reply to comment #5)
> Page table error on resume, presuming fixed with the following commit unless
> the bug can be verified on the current kernel.
>
> commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27

This bug was reported with a 2.6.35-rc6 based kernel which included that commit. The original reporter should show up here soon, and I hereby ask him to verify with the 2.6.35 mainline build to confirm this is not Ubuntu specific and reopen the bug once verified.

jerrylamos (jerrylamos) wrote :

Looks like same error on i845G,
[i845gm] GPU lockup d0c69edf
with
Linux version 2.6.35-14-generic (buildd@rothera) (gcc version 4.4.5 20100728 (prerelease) (Ubuntu/Linaro 4.4.4-8ubuntu1) ) #20-Ubuntu SMP Fri Aug 6 21:49:44 UTC 2010
IntelDriver Version: 2:2.12.0-1ubuntu2

Some dump info attached.

Jerry

jerrylamos (jerrylamos) wrote :

Regarding post #5: KMS is up and running. The Linux boot line just says "quiet", and the system defaults to the Maverick A3 Intel driver.

Jerry

Robert Hooker (sarvatt)
description: updated
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Fix Released
Rolf Leggewie (r0lf) wrote :
Changed in xserver-xorg-video-intel:
importance: Medium → Unknown
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
bugbot (bugbot) wrote :

Hey Benny,

Thanks for testing maverick during its development period. Unfortunately it looks like this bug report didn't get attention during the maverick development period. But I see there's not been more comments on the bug since the release, which makes me wonder if this is still an issue for you?

If you've not seen this issue since maverick's release yourself, it may have been solved by kernel or X or other updates that occurred late in the release; if so, would you mind please closing the bug for us? Go to the URL mentioned in this bug report, click the yellow icon(s) in the status column and set to 'Fix Released'.

If you no longer have the hardware needed to reproduce the problem, or otherwise feel the bug no longer needs to be tracked in Launchpad, you can set the status to 'Invalid'.

If you are the original reporter and still have this issue, just reply to this email saying so. (Or set the bug status to Confirmed.) If you are able to re-test this against 11.04 Natty Narwhal (our current development focus) and find the issue still affects Natty, please also run 'apport-collect <bug-number>' while running natty, which will add fresh logs and debug data, and flag it for the Ubuntu-X development team to look at.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Incomplete
Bryce Harrington (bryce) wrote :

Upstream believes this was fixed with commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27

No response from original reporter, so assuming it to be resolved now.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Fix Released