[i965gm] GPU lockup - Invalid GTT entry during Display B Fetch

Bug #686388 reported by pschonmann on 2010-12-07
This bug affects 16 people
Affects Status Importance Assigned to Milestone
xf86-video-intel
Fix Released
Medium
linux (Ubuntu)
Medium
Unassigned
Natty
Medium
Unassigned
xserver-xorg-video-intel (Ubuntu)
Medium
Unassigned
Natty
Medium
Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

Crashed while I was reporting https://bugs.launchpad.net/ubuntu/+source/apt-xapian-index/+bug/686386

---
Time: 1291713055 s 678993 us
PCI ID: 0x2a02
EIR: 0x00000010
  PGTBL_ER: 0x00000100
    Invalid GTT entry during Display B Fetch
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0xffe5fafe
    busy: Projection and LOD
    busy: Bypass FIFO
    busy: Color calculator
  ACTHD: 0x00000000
  INSTPS: 0x00100000
  INSTDONE1: 0x000fffff
---
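
For anyone comparing several of these dumps, here is a minimal Python sketch that reads the register lines into a dictionary and converts the `Time:` field to a UTC timestamp. The function name `parse_dump` and the excerpt constant are illustrative only (they are not part of apport or the intel driver), and the sketch assumes the `NAME: 0xVALUE` layout shown in the dump above.

```python
import re
from datetime import datetime, timedelta, timezone

# Excerpt copied from the dump above (the "busy:" lines are omitted).
DUMP = """\
Time: 1291713055 s 678993 us
PCI ID: 0x2a02
EIR: 0x00000010
  PGTBL_ER: 0x00000100
    Invalid GTT entry during Display B Fetch
  INSTDONE: 0xffe5fafe
  ACTHD: 0x00000000
"""

TIME_RE = re.compile(r"Time: (\d+) s (\d+) us")
REG_RE = re.compile(r"([A-Z_0-9 ]+): (0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Return (timestamp, registers) for a GPU dump excerpt (hypothetical helper)."""
    timestamp = None
    registers = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = TIME_RE.fullmatch(line)
        if m:
            # Seconds since the epoch plus a microsecond remainder.
            timestamp = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
            timestamp += timedelta(microseconds=int(m.group(2)))
            continue
        m = REG_RE.fullmatch(line)
        if m:
            registers[m.group(1)] = int(m.group(2), 16)
    return timestamp, registers


timestamp, registers = parse_dump(DUMP)
print(timestamp.isoformat(), hex(registers["PGTBL_ER"]))
```

The decoded time is 09:10:55 UTC on 2010-12-07, about three seconds before the `Date:` field in the report, assuming the reporter's clock was on Central European Time (UTC+1).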

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.13.901-2ubuntu1
ProcVersionSignature: Ubuntu 2.6.37-7.19-generic 2.6.37-rc3
Uname: Linux 2.6.37-7-generic x86_64
Architecture: amd64
Chipset: i965gm
DRM.card0.DVI.D.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1440x900 1440x900 1024x768 800x600 640x480
 edid-base64: AP///////wAwrjNAAAAAAAAPAQOAHhN46s11kVVPiyYhUFQhCAABAQEBAQEBAQEBAQEBAQEBMiagQFGEGjAwIDYAL74QAAAZ1R+gQFGEGjAwIDYAL74QAAAZAAAADwCQCjKQCigUAQBMo1dEAAAA/gBMVE4xNDFXRC1MMDUKAKY=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
Date: Tue Dec 7 10:10:58 2010
DumpSignature: 94ada031 (EIR: 0x00000010)
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
InterpreterPath: /usr/bin/python2.6
MachineType: LENOVO 7661CH8
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.37-7-generic root=UUID=a3d37e1f-bbf7-4766-b8cb-22b445b6e7e0 ro quiet splash
ProcKernelCmdLine_: BOOT_IMAGE=/boot/vmlinuz-2.6.37-7-generic root=UUID=a3d37e1f-bbf7-4766-b8cb-22b445b6e7e0 ro quiet splash
SourcePackage: xserver-xorg-video-intel
Title: [i965gm] GPU lockup 94ada031 (EIR: 0x00000010)
UserGroups:

dmi.bios.date: 04/08/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETC7WW (2.27 )
dmi.board.name: 7661CH8
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: BB70301820
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETC7WW(2.27):bd04/08/2010:svnLENOVO:pn7661CH8:pvrThinkPadT61:rvnLENOVO:rn7661CH8:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 7661CH8
dmi.product.version: ThinkPad T61
dmi.sys.vendor: LENOVO
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 codename: natty
 architecture: x86_64
 kernel: 2.6.37-7-generic
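
The `edid-base64` value reported for LVDS.1 can be checked by hand. The sketch below decodes only the first 16 characters of that string (enough for the first 12 bytes) and assumes the standard EDID layout: an 8-byte fixed header followed by a packed three-letter manufacturer ID in bytes 8 and 9. The helper name `manufacturer_id` is made up for this illustration.

```python
import base64

# First 16 characters of the LVDS.1 edid-base64 value from the report;
# they decode to the first 12 bytes of the EDID block.
EDID_PREFIX_B64 = "AP///////wAwrjNA"

# Fixed 8-byte pattern that starts every EDID block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def manufacturer_id(edid):
    """Decode the three 5-bit letters packed big-endian into bytes 8 and 9."""
    word = (edid[8] << 8) | edid[9]
    letters = [(word >> shift) & 0x1F for shift in (10, 5, 0)]
    # Letter codes are 1-based: 1 -> 'A', 2 -> 'B', ...
    return "".join(chr(ord("A") + n - 1) for n in letters)


edid = base64.b64decode(EDID_PREFIX_B64)
print(edid[:8] == EDID_HEADER, manufacturer_id(edid))
```

A valid header plus a sensible manufacturer ID is a quick sanity check that the panel's EDID was read correctly before looking at the mode list.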

---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ps 1611 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfe020000 irq 49'
   Mixer name : 'Analog Devices AD1984'
   Components : 'HDA:11d41984,17aa20d7,00100400'
   Controls : 31
   Simple ctrls : 19
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 7KHT24WW-1.08'
   Mixer name : 'ThinkPad EC 7KHT24WW-1.08'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
CompisitorRunning: None
CompizPlugins: No value set for `/apps/compiz-1/general/allscreens/options/active_plugins'
DRM.card0.DVI.D.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1440x900 1440x900 1024x768 800x600 640x480
 edid-base64: AP///////wAwrjNAAAAAAAAPAQOAHhN46s11kVVPiyYhUFQhCAABAQEBAQEBAQEBAQEBAQEBMiagQFGEGjAwIDYAL74QAAAZ1R+gQFGEGjAwIDYAL74QAAAZAAAADwCQCjKQCigUAQBMo1dEAAAA/gBMVE4xNDFXRC1MMDUKAKY=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DistUpgraded: Yes, recently upgraded Log time: 2010-11-15 10:52:54.887470
DistroCodename: natty
DistroRelease: Ubuntu 11.04
DistroVariant: ubuntu
Frequency: Once every few days.
GraphicsCard:
 Subsystem: Lenovo T61 [17aa:20b5]
   Subsystem: Lenovo T61 [17aa:20b5]
HibernationDevice: RESUME=UUID=a59c9b7d-512f-4dbb-a174-204fc568786d
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
InstallationMedia_: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MachineType: LENOVO 7661CH8
Package: xserver-xorg-video-intel 2:2.14.0+git20110124.5baa63c6-0ubuntu0sarvatt
PackageArchitecture: amd64
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 LANGUAGE=en
 LANG=cs_CZ.utf8
 LC_MESSAGES=en_AG.utf8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.37-12-generic root=UUID=a3d37e1f-bbf7-4766-b8cb-22b445b6e7e0 ro quiet splash vt.handoff=7
ProcKernelCmdLine_: BOOT_IMAGE=/boot/vmlinuz-2.6.37-12-generic root=UUID=a3d37e1f-bbf7-4766-b8cb-22b445b6e7e0 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 2.6.37-12.26-generic 2.6.37
ProcVersionSignature_: Ubuntu 2.6.37-12.26-generic 2.6.37
Regression: No
RelatedPackageVersions:
 linux-restricted-modules-2.6.37-12-generic N/A
 linux-backports-modules-2.6.37-12-generic N/A
 linux-firmware 1.46
Renderer: Hardware acceleration
Reproducible: No
Tags: natty kernel-power suspend resume needs-upstream-testing natty natty ubuntu
Uname: Linux 2.6.37-12-generic x86_64
UnitySupportTest:

UnreportableReason: This is not a genuine Ubuntu package
UserAsoundrc:
 pcm."BT sluchátka" {
         type bluetooth
         device 00:1E:55:23:81:E9
         profile hifi
 }
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 04/08/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETC7WW (2.27 )
dmi.board.name: 7661CH8
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: BB70301820
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETC7WW(2.27):bd04/08/2010:svnLENOVO:pn7661CH8:pvrThinkPadT61:rvnLENOVO:rn7661CH8:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 7661CH8
dmi.product.version: ThinkPad T61
dmi.sys.vendor: LENOVO
version.libdrm2: libdrm2 2.4.23+git20110119.550fe2ca-0ubuntu0sarvatt
version.libgl1-mesa-glx: libgl1-mesa-glx 7.11.0+git20110119.a5da4acb-0ubuntu0sarvatt
version.xserver-xorg: xserver-xorg 1:7.5+6ubuntu8
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.13.99+git20110124.fadee040-0ubuntu0sarvatt
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0+git20110124.5baa63c6-0ubuntu0sarvatt
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110124.38e8809b-0ubuntu0sarvatt

pschonmann (pschonmann) wrote :
Robert Hooker (sarvatt) on 2010-12-07
description: updated
Bryce Harrington (bryce) wrote :

Sarvatt says the error here is "Invalid GTT entry during Display B Fetch" but I'm not seeing how/where he found it.

Anyway, this looks like a unique new bug.

@pschonmann, can you elaborate more about the bug? Did this appear only the one time or have you had other lockups? Any recent changes to your system hardware/software-wise? Anything unusual you were doing in the 30 min or so prior to seeing the lockup? Do you have reason to believe the bug might be related in some fashion to xapian or to the failed update? Any other info that might be helpful in diagnosing the issue?

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
pschonmann (pschonmann) wrote :

This bug has appeared only once... so far.
No changes in HW / SW, just updating to the latest Natty packages.
I did not do anything unusual, I just ran Update Manager and updated packages.
-
Do you have reason to believe the bug might be related in some fashion to xapian or to the failed update? : Maybe, because this bug appeared right after a bug that was related to Update Manager. But ALL packages were updated successfully and no other updates were waiting after a recheck.

Bryce Harrington (bryce) wrote :

Hrm, that may make this too difficult to debug. Well, please follow up if it occurs again or if you have ideas on what might have triggered it. Meanwhile we'll see if anyone else sees it.

summary: - [i965gm] GPU lockup 94ada031 (EIR: 0x00000010)
+ [i965gm] GPU lockup - Invalid GTT entry during Display B Fetch

Forwarding this bug from Ubuntu reporter pschonmann:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/686388

[Problem]
One-time crash occurred while using firefox. Steps to reproduce are unknown.

[i965gm] GPU lockup - Invalid GTT entry during Display B Fetch

[Original Description]
Crashed while I was reporting https://bugs.launchpad.net/ubuntu/+source/apt-xapian-index/+bug/686386

This bug has appeared only once... so far.
No changes in HW / SW, just updating to the latest Natty packages.
I did not do anything unusual, I just ran Update Manager and updated packages.

---
Time: 1291713055 s 678993 us
PCI ID: 0x2a02
EIR: 0x00000010
  PGTBL_ER: 0x00000100
    Invalid GTT entry during Display B Fetch
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0xffe5fafe
    busy: Projection and LOD
    busy: Bypass FIFO
    busy: Color calculator
  ACTHD: 0x00000000
  INSTPS: 0x00100000
  INSTDONE1: 0x000fffff
---


Created attachment 41117
BootDmesg.txt

Created attachment 41118
CurrentDmesg.txt

Created attachment 41119
i915_error_state.txt

Bryce Harrington (bryce) wrote :

pschonmann - I've forwarded this bug upstream to http://bugs.freedesktop.org/show_bug.cgi?id=32396 - please subscribe yourself to this bug, in case they need further information or wish you to test something. Thanks ahead of time!

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium

That PGTBL_ER only makes sense in conjunction with a mode change. I can't see the actual crash dmesg to confirm that it was the only error detected along with the crash running firefox.

Bryce Harrington (bryce) wrote :

Hi pschonmann, have you seen this gpu lockup issue any other times?

Upstream indicates the error message is likely related to resolution mode changes. Does that remind you of anything you were doing or that happened when you encountered the bug originally? Can you also try changing your resolution or other monitor settings a bunch, and see if you can trigger the lockup again?

pschonmann (pschonmann) wrote :

Hi Bryce,

I've never seen the error from this bug report again. I tried changing the resolution with the latest packages and it seems to work correctly. No crash occurred.
When the error occurred I wasn't changing the resolution. I'm happy with my 1440x900.

Same user saw similar gpu lockup (same PGTBL error code at least) during a resume from suspend.

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/694244

Bryce Harrington (bryce) wrote :

pschonmann, ok thanks. Given this is the same hardware as for bug #694244 I'll dupe them together. Even though the GPU dumps differ, it's the same hardware and the error code itself is the same.

Bryce Harrington (bryce) wrote :

pschonmann, one of the problems with the bug report is that apport is collecting your dmesg *after* you've rebooted, which is not very useful; upstream wants the dmesg from when the system crashed. (Actually it collects both before and after, but it overwrites the data on the second call.)

I've spoken with pitti about this just now, and he's updating apport to preserve the original data. So, please update to current natty tomorrow, and then the next time you see this bug, let apport report it, and hopefully then it'll include the proper data upstream wants.

Thanks ahead of time!

pschonmann (pschonmann) wrote :

It's a pity that the old dmesg logs are gone (rotate 4 and weekly). I found that option today and edited it to keep more than 4 weeks.
I think that would be a useful setting when running an unreleased distro.

Bryce Harrington (bryce) wrote :

It's a good idea to retain dmesg log files longer on the development version. That could be filed as a wishlist bug if you'd like. On the other hand, in the development version of Ubuntu we get a new kernel more often than once a month, and *usually* will care only about bugs and dmesgs for the latest kernel.

Created attachment 42174
IntelGpuDump.txt

Here's the GPU dump from the last freeze. This includes several calls to XY_COLOR_BLT() - we've had several other bug reports with this in their batchbuffers, although the gpu dumps differ from bug to bug.

Bugzilla won't allow the raw file to be attached, so I've attached it gzipped and here's a link:

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/694244/+attachment/1775717/+files/IntelGpuDump.txt

Bryce Harrington (bryce) wrote :

Intel has released a Q4 drop of the -intel driver. I think the next step for this bug is to re-test against that package, and then if it still occurs, bring this issue to the attention of the Intel devs.

(In reply to comment #7)
> Here's the GPU dump from the last freeze. This includes several calls to
> XY_COLOR_BLT() - we've had several other bug reports with this in their
> batchbuffers, although the gpu dumps differ from bug to bug.

The contents of the batch buffers are more or less irrelevant to the reported error, since we only manipulate the display surfaces directly from the kernel. The latest i915_error_state is more interesting with regards to capturing these errors, since it includes the display settings and the pinned buffers.

Pinning down the timing to that of a mode change and confirming that on 2.6.38-rc1, which has extra paranoia with regards the timing of the display surface removal and the improved error state, would be most useful.

Created attachment 42241
i915_error_state.txt

No chance, kernel team says we have 2.6.38-rc1 but it's too horridly borked to foist on users; the kernel guys say it doesn't even boot.

But here's the i915_error_state.txt you asked for.

pschonmann (pschonmann) wrote :

Hi, I have the xorg-edgers PPA but it still occurs, maybe more often. But I discovered something new: when I press some keys I hear sound. The system does not seem to be frozen; only the screen is blank.

Here are the kernel logs from around when I opened the lid [the time is only an estimate]

Jan 21 06:59:48 localhost kernel: [ 9217.496011] Freezing user space processes ... (elapsed 0.01 seconds) done.
Jan 21 06:59:48 localhost kernel: [ 9217.515976] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Jan 21 06:59:48 localhost kernel: [ 9217.530097] PM: Entering mem sleep
Jan 21 06:59:48 localhost kernel: [ 9217.530129] Suspending console(s) (use no_console_suspend to debug)
Jan 21 06:59:48 localhost kernel: [ 9217.685918] PM: suspend of drv:psmouse dev:serio2 complete after 155.335 msecs
Jan 21 06:59:48 localhost kernel: [ 9217.690304] sd 2:0:0:0: [sda] Synchronizing SCSI cache
Jan 21 06:59:48 localhost kernel: [ 9217.690375] sd 2:0:0:0: [sda] Stopping disk
Jan 21 06:59:48 localhost kernel: [ 9217.800771] ata_piix 0000:00:1f.1: PCI INT C disabled
Jan 21 06:59:48 localhost kernel: [ 9217.800802] ehci_hcd 0000:00:1d.7: PCI INT D disabled
Jan 21 06:59:48 localhost kernel: [ 9217.800822] uhci_hcd 0000:00:1d.2: PCI INT C disabled
Jan 21 06:59:48 localhost kernel: [ 9217.800837] uhci_hcd 0000:00:1d.1: PCI INT B disabled
Jan 21 06:59:48 localhost kernel: [ 9217.800858] uhci_hcd 0000:00:1d.0: PCI INT A disabled
Jan 21 06:59:48 localhost kernel: [ 9217.800915] pciehp 0000:00:1c.3:pcie04: pciehp_suspend ENTRY
Jan 21 06:59:48 localhost kernel: [ 9217.801071] ehci_hcd 0000:00:1a.7: PCI INT C disabled
Jan 21 06:59:48 localhost kernel: [ 9217.801094] uhci_hcd 0000:00:1a.1: PCI INT B disabled
Jan 21 06:59:48 localhost kernel: [ 9217.801112] uhci_hcd 0000:00:1a.0: PCI INT A disabled
Jan 21 06:59:48 localhost kernel: [ 9217.802712] e1000e 0000:00:19.0: PCI INT A disabled
Jan 21 06:59:48 localhost kernel: [ 9217.802724] e1000e 0000:00:19.0: PME# enabled
Jan 21 06:59:48 localhost kernel: [ 9217.802734] e1000e 0000:00:19.0: wake-up capability enabled by ACPI
Jan 21 06:59:48 localhost kernel: [ 9217.820056] i915 0000:00:02.0: power state changed by ACPI to D3
Jan 21 06:59:48 localhost kernel: [ 9217.820175] ACPI handle has no context!
Jan 21 06:59:48 localhost kernel: [ 9217.910475] HDA Intel 0000:00:1b.0: PCI INT B disabled
Jan 21 06:59:48 localhost kernel: [ 9217.930043] PM: suspend of drv:HDA Intel dev:0000:00:1b.0 complete after 129.017 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.898038] PM: suspend of drv:sd dev:2:0:0:0 complete after 2207.739 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.898057] PM: suspend of drv:scsi dev:target2:0:0 complete after 2207.757 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.898081] PM: suspend of drv:scsi dev:host2 complete after 2207.666 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.910044] PM: suspend of drv:ahci dev:0000:00:1f.2 complete after 2109.442 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.910066] PM: suspend of drv: dev:pci0000:00 complete after 2107.209 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.910091] PM: suspend of devices complete after 2379.545 msecs
Jan 21 06:59:48 localhost kernel: [ 9219.910095] PM: suspend devices took 2.380 s...
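
To see at a glance which devices dominate the suspend time in a log like the one above, the `PM: suspend of drv:... complete after N msecs` lines can be extracted and sorted. This is only a rough sketch: the sample lines are copied from the log above, and the regular expression and the name `suspend_times` are assumptions made for the illustration, not part of any kernel tooling.

```python
import re

# Sample lines copied from the kernel log above.
LOG_LINES = [
    "Jan 21 06:59:48 localhost kernel: [ 9217.685918] PM: suspend of drv:psmouse dev:serio2 complete after 155.335 msecs",
    "Jan 21 06:59:48 localhost kernel: [ 9217.820056] i915 0000:00:02.0: power state changed by ACPI to D3",
    "Jan 21 06:59:48 localhost kernel: [ 9217.930043] PM: suspend of drv:HDA Intel dev:0000:00:1b.0 complete after 129.017 msecs",
    "Jan 21 06:59:48 localhost kernel: [ 9219.898038] PM: suspend of drv:sd dev:2:0:0:0 complete after 2207.739 msecs",
    "Jan 21 06:59:48 localhost kernel: [ 9219.910066] PM: suspend of drv: dev:pci0000:00 complete after 2107.209 msecs",
]

# The driver name may contain spaces ("HDA Intel") or be empty.
SUSPEND_RE = re.compile(
    r"PM: suspend of drv:(.*?) dev:(\S+) complete after ([\d.]+) msecs"
)


def suspend_times(lines):
    """Return (driver, device, milliseconds) tuples, slowest first."""
    found = []
    for line in lines:
        m = SUSPEND_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2), float(m.group(3))))
    return sorted(found, key=lambda item: item[2], reverse=True)


for driver, device, msecs in suspend_times(LOG_LINES):
    print(f"{msecs:10.3f} ms  drv={driver or '(none)'} dev={device}")
```

Note that this excerpt only covers the suspend side; the blank screen was seen after opening the lid, so the matching resume messages would be the more interesting part of the log.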

Changed in xserver-xorg-video-intel:
status: Unknown → Confirmed
Bryce Harrington (bryce) wrote :

Thanks for testing xorg-edgers pschonmann, was just going to suggest doing that as the next step. :-)

Could you also run the command 'apport-collect 686388' while running xorg-edgers? That'll ensure all the logs are showing the right versions and so on, so we can forward the bug upstream. Thanks ahead of time.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Incomplete

apport information

tags: added: apport-collected kernel-power needs-upstream-testing resume suspend ubuntu
description: updated

Bryce Harrington (bryce) on 2011-01-25
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Triaged
tags: added: kj-triage
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
Bryce Harrington (bryce) wrote :

Hi pschonmann,

I have a suspicion this might be the same as bug #702090, which seems to be an interaction with the vesafb.

Could you please update to latest natty, verify you can still reproduce the freeze, and then after that test again with vesafb disabled? Do this by using this kernel command line parameter:

  vesafb.anything=1

Bryce Harrington (bryce) on 2011-02-07
Changed in xserver-xorg-video-intel (Ubuntu):
importance: Medium → High
status: Triaged → Incomplete
pschonmann (pschonmann) wrote :

Hi Bryce,

I'm updating to the latest packages regularly. I purged the xorg-edgers PPA a few days ago and now it seems to be OK. But this bug is hard to catch; it occurs irregularly. I'm still keeping an eye on it.

You wrote about vesafb.anything=1, but I have absolutely no idea how to do that. Can you explain it, or link to instructions on how to do that? Thanks.

Bryce Harrington (bryce) wrote :

Go into grub by holding down left shift key during boot, then append it to the kernel command line.

If you need more guidance on setting kernel parameters, http://askubuntu.com might be able to give better directions.

Brian Murray (brian-murray) wrote :

This wiki page has details about how to add a kernel boot parameter for testing.

https://wiki.ubuntu.com/Kernel/KernelBootParameters
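
After rebooting with the extra parameter, it is worth confirming that the kernel actually received it. A minimal sketch, assuming the running kernel's command line is readable from `/proc/cmdline` (the function names here are illustrative only):

```python
def has_kernel_param(cmdline, name):
    """Return True if `name` appears on the command line, with or without =value."""
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


# Command line taken from the ProcKernelCmdLine field in the description,
# with the test parameter appended as it would be typed in grub.
example = (
    "BOOT_IMAGE=/boot/vmlinuz-2.6.37-12-generic "
    "root=UUID=a3d37e1f-bbf7-4766-b8cb-22b445b6e7e0 ro quiet splash "
    "vt.handoff=7 vesafb.anything=1"
)
print(has_kernel_param(example, "vesafb.anything"))
```

On the affected machine, `has_kernel_param(running_cmdline(), "vesafb.anything")` should return True once the parameter has been added at boot.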

Bryce Harrington (bryce) on 2011-02-11
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
status: New → Incomplete

Created attachment 44991
Disable outputs before KMS takeover

If the reported hang was occurring early in the boot process, then the attached might be an answer. But afaics, this hang is much later...

Hmm, there is a second patch required to fix an oops. In drm-intel-staging:

commit ea1167d6601f370f5d7e425eb0b3c7577edd02cd
Author: Chris Wilson <email address hidden>
Date: Tue Mar 29 13:19:09 2011 +0100

    drm/i915: Move the irq wait queue initialisation into the ring init

    Required so that we don't obliterate the queue if initialising the
    rings after the global IRQ handler is installed.

    Signed-off-by: Chris Wilson <email address hidden>

commit f8acdf5aa142926961e1f7ddb9e86490c50f8e6a
Author: Chris Wilson <email address hidden>
Date: Tue Mar 29 10:40:27 2011 +0100

    drm/i915: Disable all outputs early, before KMS takeover

    If the outputs are active and continuing to access the GATT when we
    teardown the PTEs, then there is a potential for us to hang the GPU.
    The hang tends to be a PGTBL_ER with either an invalid host access or
    an invalid display plane fetch.

    Reported-by: Pekka Enberg <email address hidden>
    Signed-off-by: Chris Wilson <email address hidden>

*** Bug 35976 has been marked as a duplicate of this bug. ***

*** Bug 35975 has been marked as a duplicate of this bug. ***

*** Bug 35974 has been marked as a duplicate of this bug. ***

Those two patches have been reported by one user to have fixed the issue for him, but I need a few more testers since they seem to foul up a MacBook (but then there is more than one issue at play with MacBooks...)

Bryce Harrington (bryce) on 2011-04-05
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
Bryce Harrington (bryce) wrote :

Upstream thinks the following two kernel patches (currently on drm-intel-staging) might resolve the issue, however they require testing before they'll commit to them:

commit e6793fa5504ac5c09a8f22f907c2b5f4543af7d9
Author: Chris Wilson <email address hidden>
Date: Tue Mar 29 10:40:27 2011 +0100

    drm/i915: Disable all outputs early, before KMS takeover

    If the outputs are active and continuing to access the GATT when we
    teardown the PTEs, then there is a potential for us to hang the GPU.
    The hang tends to be a PGTBL_ER with either an invalid host access or
    an invalid display plane fetch.

    Reported-by: Pekka Enberg <email address hidden>
    Signed-off-by: Chris Wilson <email address hidden>
    Tested-by: Daniel Vetter <email address hidden> (855GM)

commit b023d74ad16336ea07fb237b52899df6df63e4b2
Author: Chris Wilson <email address hidden>
Date: Tue Mar 29 13:19:09 2011 +0100

    drm/i915: Move the irq wait queue initialisation into the ring init

    Required so that we don't obliterate the queue if initialising the
    rings after the global IRQ handler is installed.

    Signed-off-by: Chris Wilson <email address hidden>

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete

Thanks Chris, the early output disablement especially sounds promising.

Our kernel team does daily builds of drm-intel-next and drm-next, but not drm-intel-staging, so may take a while before we can produce something for the reporters to test (and I doubt the reporters would be patching their kernel manually although who knows).

I know, catch 22. They can't be accepted into -fixes unless we know they fix the bug. And whilst they are in -staging, only the foolhardy will try them.

Tim Gardner (timg-tpi) on 2011-04-05
Changed in linux (Ubuntu Natty):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Tim Gardner (timg-tpi) wrote :

Please install the Natty test kernel 2.6.32-32.61~lp719446.1 from https://launchpad.net/~timg-tpi/+archive/ppa

echo "deb http://ppa.launchpad.net/timg-tpi/ppa/ubuntu natty main"|sudo tee /etc/apt/sources.list.d/timg-ppa.list
sudo apt-get update
sudo apt-get -u dist-upgrade

One of our kernel engineers was kind enough to do a quick package of the patches for user testing. This is a cherrypick of the natty kernel with these two patches, not a package of drm-intel-staging:

Please install the Natty test kernel 2.6.32-32.61~lp719446.1 from https://launchpad.net/~timg-tpi/+archive/ppa

echo "deb http://ppa.launchpad.net/timg-tpi/ppa/ubuntu natty main"|sudo tee /etc/apt/sources.list.d/timg-ppa.list
sudo apt-get update
sudo apt-get -u dist-upgrade

Hopefully one of the reporters of this bug will test the kernel and give feedback.

Careful. Above commands caused my display to not work and I had to reboot into an older kernel selected in grub to get things working again. It does NOT fix the issue for me. I'm on a ~2007 Macbook. Also, the kernel you mention in your post does not exist in your repository!

I should be more descriptive - the display was off and showed no graphics whatsoever. I can't tell if the boot process succeeded or failed and had to do a hard-reboot.

Ah I should have mentioned I did install 2.6.38-8-generic_2.6.38-8.42~lp686388 for i386 and the associated linux-image-generic, linux-generic, linux-libc-dev. That's the one that caused the issue. I tried booting both with an external monitor attached and without an external monitor. Bug #749784 is apparently my dupe of this one so you can see my system information in that one. Let me know what else I can do to help.

There has been one tester of this patched kernel so far, Daniel G. Taylor, who writes:

"""
Above commands caused my display to not work and I had to reboot into an older kernel selected in grub to get things working again. It does NOT fix the issue for me. I'm on a ~2007 Macbook. The display was off and showed no graphics whatsoever. I can't tell if the boot process succeeded or failed and had to do a hard-reboot.

I installed 2.6.38-8-generic_2.6.38-8.42~lp686388 for i386 and the associated linux-image-generic, linux-generic, linux-libc-dev. That's the one that caused the issue.

I tried booting both with an external monitor attached and without an external monitor. Bug LP #749784 is apparently my dupe of this one so you can see my system information in that one. Let me know what else I can do to help.
"""

Bryce Harrington (bryce) wrote :

Thanks for testing. It sounds like upstream is aware of there being issues on macbooks associated with these two patches.

Has anyone on non-macbook hardware had a chance to try the test kernels? Would be nice to see if the patches have a positive effect on any hardware...

MacBooks seem to enter the kernel with a PGTBL_ER already pending. You need the v2 patch to survive, but as stated it looks like MacBooks have a separate issue.

Bryce Harrington (bryce) wrote :

Anyone still having this lockup? If not guess we can just close it.

If anyone does still experience it, please test the kernel posted in the comments above.

Changed in xserver-xorg-video-intel (Ubuntu Natty):
status: Incomplete → New
status: New → Incomplete

Has the Macbook issue been resolved? If so I can test again later tonight.

Bryce Harrington (bryce) wrote :

@Daniel, no the macbook issue remains afaik. We need testing on a non-macbook system that experiences this issue, in order to move forward with the fix. Upstream believes there's other things at play that affect macbook, beyond this issue, so they want to handle that case separately. *Shrug*

Changed in xserver-xorg-video-intel (Ubuntu Natty):
status: Incomplete → New
status: New → Incomplete
Jeremy Foshee (jeremyfoshee) wrote :

Marked Natty task incomplete and milestone to natty-updates. We are still waiting on someone to test on non-Macbook hardware so we can verify the patch for that use case.

~JFo

Changed in linux (Ubuntu Natty):
importance: Undecided → Medium
status: In Progress → Incomplete
milestone: none → natty-updates
Martin Pitt (pitti) wrote :

Seems this is a bug on the kernel side, closing the userspace driver task then.

Changed in xserver-xorg-video-intel (Ubuntu Natty):
status: Incomplete → Invalid
importance: High → Medium
Changed in linux (Ubuntu):
status: Incomplete → Opinion
Bryce Harrington (bryce) wrote :

(Why 'Opinion' status in particular?)

Tim Gardner (timg-tpi) on 2011-10-06
Changed in linux (Ubuntu Natty):
assignee: Tim Gardner (timg-tpi) → nobody
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
assignee: Tim Gardner (timg-tpi) → nobody
status: Opinion → Invalid
milestone: natty-updates → none

commit c7bd4c25650704d4d065eb4ce2a122d2a80ce804
Author: Chris Wilson <email address hidden>
Date: Tue Apr 24 16:36:50 2012 +0100

    drm/i915: Remove too early plane enable on pre-PCH hardware

    Enabling the plane before we have assigned valid address means that it
    will access random PTE (often with conflicting memory types) and cause
    GPU lockups. However, enabling the plane too early appears to workaround
    a number of bugs in our modesetting code.

    Cc: Franz Melchior <email address hidden>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=39947
    References: https://bugs.freedesktop.org/show_bug.cgi?id=41091
    References: https://bugs.freedesktop.org/show_bug.cgi?id=49041
    Signed-off-by: Chris Wilson <email address hidden>
    Acked-by: Jesse Barnes <email address hidden>
    Signed-off-by: Daniel Vetter <email address hidden>

Changed in xserver-xorg-video-intel:
status: Confirmed → Fix Released
Bryce Harrington (bryce) wrote :

The patch "drm/i915: Remove too early plane enable on pre-PCH hardware" looks ready for review by the kernel team for backporting.

This bug is open only against natty, however the upstream patch is recent, which makes me wonder if this would be applicable to oneiric/precise as well?

tags: added: kernel-handoff-graphics
dino99 (9d9) on 2013-05-18
Changed in linux (Ubuntu Natty):
status: Confirmed → Invalid