[i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020) - Black screen (dpms?)

Bug #768184 reported by Stuart Langridge
This bug affects 54 people
Affects                             Status         Importance   Assigned to   Milestone
xf86-video-intel                    Confirmed      High
linux (Ubuntu)                      Fix Released   Medium       Unassigned
xserver-xorg-video-intel (Ubuntu)   Fix Released   High         Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

Crash which required reboot. The crash itself is described in https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/768176 and this is after I persuaded apport-gpu-error-intel.py to run.

My screen went entirely black (both laptop screen and second monitor). Switching to a VC did not show anything on screen. At first I could still hear sounds from running applications, but eventually (after ~10 seconds) they stopped. I had to powercycle the machine to get control back. The "system problem detected" apport dialog offered to let me file a bug, but then I got another crash dialog saying "apport-gpu-error-intel.py closed unexpectedly".

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-4ubuntu7
ProcVersionSignature: Ubuntu 2.6.38-8.42-generic-pae 2.6.38.2
Uname: Linux 2.6.38-8-generic-pae i686
Architecture: i386
Chipset: i965gm
CompositorRunning: compiz
DRM.card0.HDMI.A.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1280x800
 edid-base64: AP///////wAwZAYjMjQ5NTISAQOAHRJ4Cof1lFdPjCcnUFQAAAABAQEBAQEBAQEBAQEBAQEBKhwAqFAgHjAQMCIAH7QQAAAYAAAAAAAAAAAAAAAAAAAAAAAAAAAA/gBSUDc3NKMxMzNFV0REAAAA/gAIDBAUKFB/2AEBCiAgAL4=
DRM.card0.VGA.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1680x1050 1280x1024 1280x1024 1280x960 1152x864 1024x768 1024x768 1024x768 832x624 800x600 800x600 800x600 800x600 640x480 640x480 640x480 640x480 720x400
 edid-base64: AP///////wBMLdIDMjJBSCMTAQMOMB54KtxVo1lIniQRUFS/74CzAIGAgUBxTwEBAQEBAQEBITmQMGIaJ0BosDYA2igRAAAcAAAA/QA4Sx5REAAKICAgICAgAAAA/ABTeW5jTWFzdGVyCiAgAAAA/wBIOUZTODM5NDg1CiAgAAI=
Date: Thu Apr 21 10:25:20 2011
DistUpgraded: Log time: 2011-01-18 17:25:59.814253
DistroCodename: natty
DistroVariant: ubuntu
DuplicateSignature: (ESR: 0x00000001 IPEHR: 0x01800020)
ExecutablePath: /home/aquarius/apport-gpu-error-intel.py
GraphicsCard:
 Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) [8086:2a02] (rev 0c) (prog-if 00 [VGA controller])
   Subsystem: Dell Device [1028:0209]
InterpreterPath: /usr/bin/python2.7
MachineType: Dell Inc. XPS M1330
ProcCmdline: python apport-gpu-error-intel.py
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, user)
 LC_MESSAGES=en_GB.utf8
 LANG=en_US.UTF-8
 LANGUAGE=en_GB:en
ProcKernelCmdLine: root=UUID=b572742c-deea-43ec-92d3-b1d1e6b6802f ro quiet splash
ProcKernelCmdLine_: root=UUID=b572742c-deea-43ec-92d3-b1d1e6b6802f ro quiet splash
RelatedPackageVersions:
 xserver-xorg 1:7.6+4ubuntu3
 libdrm2 2.4.23-1ubuntu6
 xserver-xorg-video-intel 2:2.14.0-4ubuntu7
SourcePackage: xserver-xorg-video-intel
Title: [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020)
UpgradeStatus: Upgraded to natty on 2011-01-18 (92 days ago)
UserGroups: adm admin cdrom couchdb dialout dip floppy fuse lpadmin plugdev video
dmi.bios.date: 12/26/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A15
dmi.board.name: 0N6705
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA15:bd12/26/2008:svnDellInc.:pnXPSM1330:pvr:rvnDellInc.:rn0N6705:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: XPS M1330
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.4+bzr20110415-0ubuntu2
version.libdrm2: libdrm2 2.4.23-1ubuntu6
version.libgl1-mesa-dri: libgl1-mesa-dri 7.10.2-0ubuntu2
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.2-0ubuntu2
version.xserver-xorg: xserver-xorg 1:7.6+4ubuntu3
version.xserver-xorg-video-ati: xserver-xorg-video-ati N/A
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu7
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu7

[lspci]
00:00.0 Host bridge [0600]: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub [8086:2a00] (rev 0c)
     Subsystem: Dell Device [1028:0209]
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) [8086:2a02] (rev 0c) (prog-if 00 [VGA controller])
     Subsystem: Dell Device [1028:0209]
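
The `edid-base64` fields in the report above can be decoded to check which panel or monitor was attached. What follows is a minimal sketch, not part of apport: the function name `edid_vendor` is made up for illustration, and it reads only the fixed 8-byte EDID header, the manufacturer ID (bytes 8-9) and the product code (bytes 10-11).

```python
import base64

# Every EDID base block starts with this fixed 8-byte pattern.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_vendor(edid_b64):
    """Return (manufacturer_id, product_code) from a base64 EDID blob.

    Hypothetical helper for illustration; only the first 12 bytes are read.
    """
    raw = base64.b64decode(edid_b64)
    if len(raw) < 12 or raw[:8] != EDID_HEADER:
        raise ValueError("not an EDID block")
    # Bytes 8-9 form a big-endian word holding three 5-bit letters ('A' == 1).
    word = (raw[8] << 8) | raw[9]
    letters = [(word >> shift) & 0x1F for shift in (10, 5, 0)]
    vendor = "".join(chr(ord("A") + n - 1) for n in letters)
    # Bytes 10-11 hold the product code, little-endian.
    product = raw[10] | (raw[11] << 8)
    return vendor, product
```

Because only the first 12 bytes matter, the first 16 base64 characters of a field are enough: `edid_vendor("AP///////wBMLdID")`, the start of the VGA blob, gives the vendor ID `SAM`.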

Revision history for this message
Bryce Harrington (bryce) wrote :

looks similar to 765416 or 757399; slightly different chip though

description: updated
Revision history for this message
Bryce Harrington (bryce) wrote :

<bryceh> aquarius, can you tell me is this a recently upgraded natty from maverick or have you been running natty for a while?
<aquarius> I've been running natty for a while
<bryceh> have you had other gpu freezes since upgrading or is this the first?
<aquarius> since about, um, November.
 This isn't the first, but it's about the third, and they've all happened in the last week or so.
<bryceh> do you know if you can reproduce this freeze at will?
<aquarius> I think. (I've been having compiz-locking-up-but-I-can-switch-to-a-VC hangs for a long time; this is different.)
 I can't reproduce it, sorry

Revision history for this message
Bryce Harrington (bryce) wrote :

I have a suspicion this might be the same as or similar to bugs 767511 and 767425 - different error codes, but same chip and all are regressions that started within the past couple weeks it appears.

Let's let it freeze a couple more times, so you get a definite feel for the frequency and conditions of the freeze. Then, I want you to try booting to an earlier kernel from a week or two ago - prior to when you started noticing the freezes.

Hopefully you still have an old kernel installed; just hold down the left shift key during boot and it should let you select whatever kernels are already there. If not, you can install the older kernels from https://launchpad.net/ubuntu/+source/linux/+publishinghistory. For instance, to install the -8.41 kernel, go here:

https://launchpad.net/ubuntu/+source/linux/2.6.38-8.41/+buildjob/2429621

You'll want the linux-image-2.6.38-8-generic-pae deb; I don't think you will need anything else but if the installer complains you can snag other .debs as needed.

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
Revision history for this message
In , Bryce Harrington (bryce) wrote :

Forwarding this bug from Ubuntu reporter Stuart Langridge:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/768184

[Problem]
Infrequent gpu lockup on i965.

We've had a handful of reports in the last couple weeks of a gpu lockup on i965 systems which had not had freeze troubles for a long while (>6 months). Most reporters have experienced the freeze only once or twice; they don't know how to reproduce it, nor really have a way to definitively tell whether it is fixed or just occurs rarely.

I'm forwarding this report on the chance that upstream recognizes the bug; I don't think users are going to be able to pin this down any further.

Bugs I believe to be dupes, all on i965 systems:

768184 IPEHR: 0x01800020
767511 IPEHR: 0x60020100
767425 IPEHR: 0x08000000
757968 IPEHR: 0x14000000
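
When comparing candidate duplicates like the ones listed above, the ESR/IPEHR pair can be pulled out of a bug title or a `DuplicateSignature` field mechanically. This is a rough sketch only; `parse_signature` is a hypothetical helper, not something apport or Launchpad provides, and it assumes the `(ESR: 0x... IPEHR: 0x...)` layout used in this report.

```python
import re

# Matches the "(ESR: 0x... IPEHR: 0x...)" signature used in the bug titles.
SIGNATURE_RE = re.compile(
    r"ESR:\s*(0x[0-9a-fA-F]+)\s+IPEHR:\s*(0x[0-9a-fA-F]+)"
)


def parse_signature(text):
    """Return (esr, ipehr) as integers, or None if no signature is present."""
    match = SIGNATURE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def same_signature(title_a, title_b):
    """True only when both titles carry a signature and the values match."""
    sig_a = parse_signature(title_a)
    sig_b = parse_signature(title_b)
    return sig_a is not None and sig_a == sig_b
```

Matching signatures are only a hint. As the list above shows, reports believed to be the same bug can carry different IPEHR values.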

These i965 reports started coming in shortly after we updated Ubuntu from xserver 1.10.0 to 1.10.1 and mesa from 7.10.1 to 7.10.2, and added patch 25521900d to -intel (bug #35808). (Due to the intermittency of the bug I haven't had people try downgrading those packages.)

[Original Description]
Crash which required reboot. The crash itself is described in https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/768176 and this is after I persuaded apport-gpu-error-intel.py to run.

My screen went entirely black (both laptop screen and second monitor). Switching to a VC did not show anything on screen. At first I could still hear sounds from running applications, but eventually (after ~10 seconds) they stopped. I had to powercycle the machine to get control back. The "system problem detected" apport dialog offered to let me file a bug, but then I got another crash dialog saying "apport-gpu-error-intel.py closed unexpectedly".

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-4ubuntu7
ProcVersionSignature: Ubuntu 2.6.38-8.42-generic-pae 2.6.38.2
Uname: Linux 2.6.38-8-generic-pae i686
Architecture: i386
Chipset: i965gm
CompositorRunning: compiz
DRM.card0.HDMI.A.1:
status: disconnected
enabled: disabled
dpms: Off
modes:
edid-base64:
DRM.card0.LVDS.1:
status: connected
enabled: enabled
dpms: On
modes: 1280x800
edid-base64: AP///////wAwZAYjMjQ5NTISAQOAHRJ4Cof1lFdPjCcnUFQAAAABAQEBAQEBAQEBAQEBAQEBKhwAqFAgHjAQMCIAH7QQAAAYAAAAAAAAAAAAAAAAAAAAAAAAAAAA/gBSUDc3NKMxMzNFV0REAAAA/gAIDBAUKFB/2AEBCiAgAL4=
DRM.card0.VGA.1:
status: connected
enabled: enabled
dpms: On
modes: 1680x1050 1280x1024 1280x1024 1280x960 1152x864 1024x768 1024x768 1024x768 832x624 800x600 800x600 800x600 800x600 640x480 640x480 640x480 640x480 720x400
edid-base64: AP///////wBMLdIDMjJBSCMTAQMOMB54KtxVo1lIniQRUFS/74CzAIGAgUBxTwEBAQEBAQEBITmQMGIaJ0BosDYA2igRAAAcAAAA/QA4Sx5REAAKICAgICAgAAAA/ABTeW5jTWFzdGVyCiAgAAAA/wBIOUZTODM5NDg1CiAgAAI=
Date: Thu Apr 21 10:25:20 2011
DistUpgraded: Log time: 2011-01-18 17:25:59.814253
DistroCodename: natty
DistroVariant: ubuntu
DuplicateSignature: (ESR: 0x00000001 IPEHR: 0x01800020)
ExecutablePath: /home/aquarius/apport-gpu-error-intel.py
GraphicsCard:
Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) [8086:2a02] (rev 0c) (prog-if 00 [VGA controller])
Subsystem: De...


Revision history for this message
In , Bryce Harrington (bryce) wrote :

Created attachment 45980
BootDmesg.txt

Revision history for this message
In , Bryce Harrington (bryce) wrote :

Created attachment 45981
CurrentDmesg.txt

Revision history for this message
In , Bryce Harrington (bryce) wrote :

Created attachment 45982
CurrentDmesg.txt

Revision history for this message
Bryce Harrington (bryce) wrote :

Stuart Langridge - I've forwarded this bug upstream to http://bugs.freedesktop.org/show_bug.cgi?id=36515 - please subscribe yourself to this bug, in case they need further information or wish you to test something. Thanks ahead of time!

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
In , Bryce Harrington (bryce) wrote :
Changed in xserver-xorg-video-intel:
importance: Unknown → High
status: Unknown → Confirmed
Bryce Harrington (bryce)
Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → High
Revision history for this message
In , Chris Wilson (ickle) wrote :

Bryce, one aspect that we are wary of with 965G[M] is that the early chipsets had severe issues with memory above 4G. Is the memory configuration captured in the LP reports? The attached dmesg has 4G + PAE; is that common?

Revision history for this message
In , Timo Jyrinki (timo-jyrinki) wrote :

One affected 965gm user here (bug report with attachments https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/771655): 4GB of memory but no PAE, i.e. 64-bit. On the other hand, my problem is simply X.org crashing/segfaulting; apport is not triggered for a GPU lockup bug report. So sorry for the (possible) noise, even though my problem is clearly coming from the same bunch of changes and is similarly random/rare.

To make up for that, I went through the mentioned lockup bug reports to answer the question: only two have PAE and four don't, but all of the i965gm GPU lockup reports so far seem to be i686, unlike mine.
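
The PAE question can be answered from the `Uname:` line that apport attaches to each report (for example `Linux 2.6.38-8-generic-pae i686`). Below is a small sketch that assumes Ubuntu's convention of a `-pae` suffix on the kernel flavour; `classify_kernel` is a hypothetical name used only for illustration.

```python
def classify_kernel(uname):
    """Classify an apport 'Uname' value such as 'Linux 2.6.38-8-generic-pae i686'.

    Assumes the three-field "<sysname> <release> <machine>" layout and that
    PAE kernels carry a '-pae' flavour suffix, as in the reports in this thread.
    """
    fields = uname.split()
    if len(fields) < 3:
        raise ValueError("unexpected Uname value: %r" % uname)
    release, arch = fields[1], fields[-1]
    return {
        "release": release,
        "arch": arch,
        "pae": release.endswith("-pae"),
        "is_64bit": arch == "x86_64",
    }
```

Running this over the `Uname:` values of the suspected duplicates would give the PAE/non-PAE split without reading each dmesg by hand.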

bugbot (bugbot)
description: updated
Bryce Harrington (bryce)
tags: added: oneiric
Revision history for this message
Bryce Harrington (bryce) wrote :

[I've marked this bug for inclusion in our oneiric bug queue. While technically this bug has not been re-confirmed against oneiric, I feel it is worth continued development attention. We will need to ask that it be re-confirmed once oneiric is further along, perhaps once we get closer to alpha.]

Revision history for this message
PresuntoRJ (fabio-tleitao) wrote : Re: [Bug 768184] Re: [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020)

I think I can post a dmesg and i915_error_state files the next time my
machine freezes, but I am not sure how it will react with a recent kernel
upgrade to 2.6.38-9

2011/5/3 Bryce Harrington <email address hidden>

> [I've marked this bug for inclusion in our oneiric bug queue. While
> technically this bug has not been re-confirmed against oneiric, I feel
> it is worth continued development attention. We will need to ask that
> it be re-confirmed once oneiric is further along, perhaps once we get
> closer to alpha.]
>

--
Fábio Leitão
..-. .- -... .. --- .-.. . .. - .- --- ...-.-

Revision history for this message
PresuntoRJ (fabio-tleitao) wrote : Re: [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020)

I have been trying to reproduce the bug to capture dmesg, i915_error_state and even backtraces from my gdm session, but I am not sure what triggers it yet.

So far, it seems that it freezes when the screensaver was running for some time BECAUSE IT HAD TURNED MY SCREEN OFF due to energy policy...

I have since changed the power management setting so that my screen is never turned off, and so far the system has not frozen again. I suspect that this could be the root cause, but I am not sure how to prove it, or which packages could be involved, to further the investigation.

It also seems to happen only on my systems with intel video boards (Mobile GM965/GL960 Lenovo Laptop and Mobile 945GME Express Asus netbook).

Revision history for this message
gmilby (gmilby-r) wrote :

This is happening to me about 3 times per week with natty 11.04 (completely updated). i have a dual core 1.8 foxconn micro pc with intel video (vga + hdmi).
if you let me know where/how to get you information - i will gladly supply it.
tia,

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 48039
Apply the big hammer to finish the fb before disabling it.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 48043
Apply the big hammer to finish the fb before disabling it.

When flushing before disabling, it helps to do it before and not after the disable.

Revision history for this message
In , Bryce Harrington (bryce) wrote :

Created attachment 48066
dmesg

I think I may have reproduced this same bug on my own i965 finally. Not sure exactly how I did it, but it showed up after a lid open event (resume from sleep I guess). The machine has been plugged into its docking station with external monitor continuously.

Revision history for this message
In , Bryce Harrington (bryce) wrote :

Created attachment 48067
i915_error_state

IPEHR=0x01820000

Revision history for this message
In , Chris Wilson (ickle) wrote :

I was hoping to see the contents of the display registers in the error state to confirm the theory about the WAIT_FOR_EVENT being on a disabled pipe. Alas, that feature isn't part of that kernel.

Revision history for this message
In , Chris Wilson (ickle) wrote :

May I also make a polite request that you enable pageflipping once more ;-)

I wonder if we should just be waiting for the VBLANK on a full screen blit rather than a range that is impossible. Hmm.

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 35576 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 37450 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

A bug I reported (Bug 37450) has been marked as a duplicate of this bug, and this bug is marked as NEEDINFO.

Since I can reproduce the bug I reported 100% of the time, please let me know if you would like me to provide any additional info.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Kamil, can you try applying the patch https://bugs.freedesktop.org/attachment.cgi?id=48043 to your kernel and seeing if that is sufficient.

I'm confident that's the fix, just waiting for testing.

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

I applied the patch to the 2.6.39.3 kernel, but it did *not* help. I'm seeing the same problem as before (enabling an output after suspend/resume hangs the server). Do I need to be running a newer kernel perhaps?

xf86-video-intel: 2.15.0
xorg-server: 1.10.2
mesa: 7.10.3
libdrm: 2.4.26
kernel: 2.6.39.3

Revision history for this message
In , Chris Wilson (ickle) wrote :

Sigh. After applying the patch can you post an i915_error_state.

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

$ cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected

That's after a restart of the X server (Ctrl-Alt-Bcksp) so that I can access the machine again; I assume that would not reset i915_error_state?

The only indication in the logs I can see is in /var/log/Xorg.0.log:

[ 259.306] (WW) intel(0): flip queue failed: Invalid argument
[ 259.306] (WW) intel(0): Page flip failed: Invalid argument
[ 260.299] (WW) intel(0): flip queue failed: Device or resource busy
[ 260.299] (WW) intel(0): Page flip failed: Device or resource busy
[last two lines repeating]

These start occurring after I enable an output using xrandr (after a suspend/resume cycle); Xorg works for a while, but hangs immediately after I switch to a text console and back to X (a required action to actually see something via the new output, as per https://bugzilla.kernel.org/show_bug.cgi?id=24982).

A workaround that works for me is to modify xf86-video-intel to force intel->use_pageflipping to FALSE. I believe there used to be a user-accessible option to turn it off, but it's been removed? That is rather unfortunate, I must say.
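
The page-flip warnings quoted above are easy to miss in a long Xorg.0.log. Here is a rough sketch of a filter that finds when they start and how often each variant repeats. It assumes the `[ time] (WW) intel(0): ...` line layout shown in this comment, and the function name is invented for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 259.306] (WW) intel(0): flip queue failed: Invalid argument
FLIP_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\(WW\)\s+intel\(\d+\):\s+"
    r"(flip queue failed|Page flip failed):\s+(.*)$"
)


def summarize_flip_failures(log_text):
    """Return (first_timestamp, Counter keyed by (warning, reason))."""
    counts = Counter()
    first_seen = None
    for line in log_text.splitlines():
        match = FLIP_RE.match(line.strip())
        if match is None:
            continue
        if first_seen is None:
            first_seen = float(match.group(1))
        counts[(match.group(2), match.group(3).strip())] += 1
    return first_seen, counts
```

The first timestamp can then be compared with the time the output was enabled via xrandr.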

Revision history for this message
In , Chris Wilson (ickle) wrote :

I was just about to add that you hit kernel bug # 24982...

So we can't tell if the GPU lockup itself has been fixed if the second prevents you from testing.

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

Are you saying that *this* bug is probably fixed, but X still hangs because of the (unrelated) DPMS bug in the kernel? That could be, as I no longer see the GPU hung messages.

Well, I guess all I can do at this point is sit and wait for that kernel bug to be fixed, hopefully some time soon; it's been open since last year... I'd be happy to try any patches you guys might have.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Ok, to be really complicated, can you please retest this patch on top of keithp/drm-intel-fixes [ git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6.git]. Hopefully we have the modeswitching bug fixed and so we can then successfully test the WAIT_FOR_EVENT fix...

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

Chris, drm-intel-fixes (last commit
cda2bb78c24de7674eafa3210314dc75bed344a6) does *not* fix the modeswitching bug for me. I guess no point in retesting your patch then?

Revision history for this message
In , Chris Wilson (ickle) wrote :

The patch should prevent the GPU hang upon turning off a pipe, but it is a nuisance: if the machine is dying for another reason, we can't be sure that the patch is sufficient.

Bryce Harrington (bryce)
summary: - [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020)
+ [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020) - Black screen
+ (dpms?)
bugbot (bugbot)
tags: added: black-screen
Revision history for this message
Bryce Harrington (bryce) wrote :

Potentially, either or both of these options in your xorg.conf might produce more detailed information in your /var/log/Xorg.0.log when these freezes occur. Also capture dmesg and /sys/kernel/debug/dri/0/i915_error_state from while the machine is locked (i.e., ssh into it over ethernet). All three attachments must be collected together at the same time, to ensure consistent data.

  Option "DebugFlushCaches" "True"

and/or

  Option "DebugFlushBatches" "True"
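
Since the three files need to be grabbed at the same moment, a small script run over ssh may be easier than copying them by hand. This is a minimal sketch under a few assumptions: it is run as root (debugfs is normally root-only), `dmesg` is on the PATH, and the paths are the ones named in this comment. The function name and the output directory layout are invented for illustration.

```python
import subprocess
import time
from pathlib import Path

# Paths taken from the comment above; adjust the DRI index if it is not 0.
DEFAULT_SOURCES = {
    "Xorg.0.log": "/var/log/Xorg.0.log",
    "i915_error_state": "/sys/kernel/debug/dri/0/i915_error_state",
}


def collect_hang_evidence(dest_root, sources=None, run_dmesg=True):
    """Copy the log files and dmesg output into one timestamped directory.

    Returns (directory, list of file names that were collected).
    """
    if sources is None:
        sources = DEFAULT_SOURCES
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / ("gpu-hang-" + stamp)
    dest.mkdir(parents=True, exist_ok=True)
    collected = []
    for name, src in sources.items():
        try:
            (dest / name).write_bytes(Path(src).read_bytes())
            collected.append(name)
        except OSError as exc:
            # Record why a file is missing rather than aborting the whole run.
            (dest / (name + ".error")).write_text(str(exc))
    if run_dmesg:
        try:
            result = subprocess.run(
                ["dmesg"], capture_output=True, text=True, check=False
            )
            (dest / "dmesg.txt").write_text(result.stdout)
            collected.append("dmesg.txt")
        except OSError as exc:
            (dest / "dmesg.txt.error").write_text(str(exc))
    return dest, collected
```

Called as `collect_hang_evidence("/tmp")` while the machine is locked up, it leaves everything in one directory that can be attached to the bug.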

Revision history for this message
In , Eugeni Dodonov (eugeni) wrote :

Hi,

Does this still happen with the latest versions of the drivers, or is it no longer an issue?

Revision history for this message
In , Chris Wilson (ickle) wrote :

Yes, the patch is still required, just no one has volunteered to test it.

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

Well, I would've loved to test it, but I just tried kernel 3.1-rc10 and with vanilla xf86-video-intel 2.16.0 the kernel still crashes for me on enabling an output via xrandr. I assume it's due to the infamous kernel bug 24982, which has probably been open for a year now with no resolution in sight, though with kernel bugzilla apparently still being down (pathetic), it's hard to tell.

For what it's worth, with your patch applied, the kernel seems to crash less easily for me than without it.

Revision history for this message
In , Chris Wilson (ickle) wrote :

(In reply to comment #27)
> Well, I would've loved to test it, but I just tried kernel 3.1-rc10 and with
> vanilla xf86-video-intel 2.16.0 the kernel still crashes for me on enabling an
> output via xrandr. I assume it's due to the infamous kernel bug 24982, which
> has probably been open for a year now with no resolution in sight, though with
> kernel bugzilla apparently still being down (pathetic), it's hard to tell.

bugzilla.kernel.org and that I'm currently unaware of any crash inside i915.ko, so you're going to have to remind me...

Revision history for this message
In , Kamil-42920 (kamil-42920) wrote :

(In reply to comment #28)
> I'm currently unaware of any crash inside i915.ko,
> so you're going to have to remind me...

Chris, please see comment #19 in this bugzilla entry, or, for a complete description, see bug #37450. In essence, it seems that stale DPMS properties (kernel bug 24982), which normally just result in a blank screen, can in some situations result in a crash/hang. When I originally reported it I could only trigger it after suspend/resume; nowadays I can reproduce it just by repeatedly enabling and disabling an output a few times. The only workaround that works for me is modifying the xf86-video-intel driver to force page flipping off.

Revision history for this message
Bryce Harrington (bryce) wrote :

Hi, just checking back in for status. Have you seen more of these lockups since the release or within the last few weeks?

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
PresuntoRJ (fabio-tleitao) wrote : Re: [Bug 768184] Re: [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020) - Black screen (dpms?)

Not really, my intel devices have finally been working smoothly for a while
now... but the nVidia ones went berserk with the proprietary drivers... a
whole other story, but a very similar issue

2011/10/25 Bryce Harrington <email address hidden>

> Hi, just checking back in for status. Have you seen more of these
> lockups since the release or within the last few weeks?
>
> ** Changed in: xserver-xorg-video-intel (Ubuntu)
> Status: Triaged => Incomplete
>

--
Fábio Leitão
..-. .- -... .. --- .-.. . .. - .- --- ...-.-

Revision history for this message
In , Chris Wilson (ickle) wrote :

(In reply to comment #29)
> (In reply to comment #28)
> > I'm currently unaware of any crash inside i915.ko,
> > so you're going to have to remind me...
>
> Chris, please see comment #19 in this bugzilla entry, or, for a complete
> description, see bug #37450. In essence, it seems that stale DPMS properties
> (kernel bug 24982), which normally just result in a blank screen, can in some
> situations result in a crash/hang. When I originally reported it I could only
> trigger it after suspend/resume; nowadays I can reproduce it just by repeatedly
> enabling and disabling an output a few times. The only workaround that works
> for me is modifying the xf86-video-intel driver to force page flipping off.

Ok, I think we know that bug and had a fix for the races inside the page-flipping code, but I think Keith dropped them on the floor...

Revision history for this message
Joe (joe-ehrensberger) wrote :

Since upgrading to Ubuntu 11.10 everything seems to work smoothly.

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 40526 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 40527 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Paulo Zanoni (pzanoni) wrote :

All our 4 duplicates were high/major. Adjusting.

Revision history for this message
Bryce Harrington (bryce) wrote :

Issue is still marked as open upstream, and we've had one instance on precise, so evidently the bug is still out there.

  https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/914131

From the upstream bug report: "In essence, it seems that stale DPMS properties (kernel bug 24982), which normally just result in a blank screen, can in some situations result in a crash/hang."

tags: added: precise
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

For anyone still experiencing issues, please refer to comment #36 of the upstream bug report:

https://bugs.freedesktop.org/show_bug.cgi?id=36515#c36

Revision history for this message
Miika Laaksonen (miika) wrote :

I can confirm that this bug is valid for 11.10:
[ 6710.048030] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 6710.048037] [drm:kick_ring] *ERROR* Kicking stuck wait on render ring

Board info:

Handle 0x0200, DMI type 2, 8 bytes
Base Board Information
 Manufacturer: Dell Inc.
 Product Name: 0GX297
 Version:
 Serial Number: ..CN6986173R0E69.

Handle 0x0A00, DMI type 10, 6 bytes
On Board Device Information
 Type: Video
 Status: Enabled
 Description: Intel Graphics Media Accelerator 950

I am not able to experiment with the test kernel because this is our family's only machine.
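
The hangcheck lines quoted in this comment are the usual sign of this class of lockup in dmesg. Here is a short sketch for spotting them in a saved dmesg file. It assumes the `[time] [drm:function] *ERROR* message` layout shown above, and the helper names are made up for illustration.

```python
import re

# Matches lines such as:
#   [ 6710.048030] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
DRM_ERROR_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\[drm:(\w+)\]\s+\*ERROR\*\s+(.*)$"
)


def find_drm_errors(dmesg_text):
    """Return a list of (timestamp, function, message) for each drm *ERROR* line."""
    events = []
    for line in dmesg_text.splitlines():
        match = DRM_ERROR_RE.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2), match.group(3)))
    return events


def gpu_hung(dmesg_text):
    """True if the hangcheck timer reported a hung GPU."""
    return any(
        func == "i915_hangcheck_elapsed" and "GPU hung" in message
        for _, func, message in find_drm_errors(dmesg_text)
    )
```

A positive result only says that the hangcheck fired; the IPEHR value still has to come from i915_error_state.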

Revision history for this message
Bryce Harrington (bryce) wrote :

From the upstream bug, someone with this issue should install this test kernel and let us know if it makes the issues go away.

"""
Per a recent request, I've built an Ubuntu test kernel based on the latest
drm-intel-fixes branch (eg
git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux.git drm-intel-fixes)
and applied the patch noted in comment 16. If you could please test and post
your results that would be great. Thanks in advance.

http://people.canonical.com/~ogasawara/fdo36515/
"""

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Bryce Harrington (bryce) wrote :

Anyone still seeing this? I notice there hasn't been a dupe set in quite a while, and there's not been activity either here or on the upstream bug report in about a month. It does not look like the upstream proposed patch has reached the mainline kernel yet, although it's possible the bug got fixed some other way.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
status: New → Incomplete
Revision history for this message
Colin Dean (colindean) wrote :

I just tried to reproduce the error described in #745438, marked as a duplicate of this one, and was not able to reproduce the error.

Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks Colin. There have been several fixes to game-related lockups in the Intel mesa tree recently; I'll bet that did it for you.

Who else can re-test? I'd especially like to hear from anyone that had encountered the IPEHR 0x01800020 error.

Revision history for this message
PresuntoRJ (fabio-tleitao) wrote : Re: [Bug 768184] Re: [i965gm] GPU lockup (ESR: 0x00000001 IPEHR: 0x01800020) - Black screen (dpms?)

Not recently... How could I go on testing it again?
On Mar 14, 2012 2:03 PM, "Bryce Harrington" <email address hidden>
wrote:

> Thanks Colin. There's been several fixes to game-related lockups in the
> Intel mesa tree recently; I'll bet that did it for you.
>
> Who else can re-test? I'd especially like to hear from anyone that had
> encountered the IPEHR 0x01800020 error.
>

Revision history for this message
Bryce Harrington (bryce) wrote :

"Not recently... How could I go on testing it again?"

Update to latest Ubuntu, and then play whatever 3D games you'd been playing previously. Play them for several hours at least, in case it's a rare issue on your hardware.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
status: New → Incomplete
Bryce Harrington (bryce)
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Triaged
tags: added: kernel-handoff-graphics
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Bryce Harrington (bryce) wrote :

From upstream bug report.

We're still getting these bugs as of Precise. E.g. bug #981860
---
The first of the fixes has landed:

commit 14667a4bde4361b7ac420d68a2e9e9b9b2df5231
Author: Chris Wilson <email address hidden>
Date: Tue Apr 3 17:58:35 2012 +0100

    drm/i915: Finish any pending operations on the framebuffer before disabling

    Similar to the case where we are changing from one framebuffer to
    another, we need to be sure that there are no pending WAIT_FOR_EVENTs on
    the pipe for the current framebuffer before switching. If we disable the
    pipe, and then try to execute a WAIT_FOR_EVENT it will block
    indefinitely and cause a GPU hang.

Revision history for this message
Andy Whitcroft (apw) wrote :

I have pulled the patch mentioned in comment #58 back to precise (for another bug) and built some test kernels. Could those affected test them and let us know whether they fix the issue? The kernels are at the URL below:

    http://people.canonical.com/~apw/lp839976-precise/

Please report any testing here.

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Revision history for this message
Michael Marineau (mike-marineau) wrote :

On my laptop (Lenovo X61s) the kernel provided in comment #59 did not make a difference. X still intermittently failed to resume with the page flip error in the log. I noticed that there had been a number of BIOS updates for my machine including one with a new Intel Video BIOS in version 2.18 (released 2008). After updating to the latest version, 2.22, I have not had a problem. New drivers are disagreeing with the old video BIOS on some behavior during resume I guess.

http://support.lenovo.com/en_US/downloads/detail.page?DocID=DS013746

Revision history for this message
Michael Marineau (mike-marineau) wrote :

Boo, never mind on #60, back to X breaking on resume with page flip errors.

Revision history for this message
Chris Wilson (ickle) wrote :

Try raring, both kernel and ddx.

Chris Wilson (ickle)
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Fix Released