[i965] random crash with vertical stripes

Bug #309927 reported by Vu Ngoc San
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Invalid | Undecided | Unassigned |
xserver-xorg-video-intel (Ubuntu) | Invalid | Undecided | Unassigned |

Bug Description

Binary package hint: xserver-xorg-video-intel

I have had this bug for a year with Kubuntu 7.10, 8.04 and 8.10 on a Dell Latitude D630 with an Intel GM965 video card.

Randomly, the system crashes with some kind of vertical stripes (or 'ladders') across the screen.
The keyboard and mouse are not responsive. Even Alt+SysRq doesn't work. I have to do a hard reboot using the power button.
See screenshot (from hardy).

It can happen when I'm working, but it seems that most of the time it happens when I do nothing.
Sometimes it happens a few seconds after login dialog (like in screenshot).

It is very difficult to debug because it occurs about only once every two weeks.

I see no timestamped trace in the log files. The only thing that could be related is
(EE) intel(0): underrun on pipe B!
in Xorg.0.log.
That's why I filed it under xserver-xorg-video-intel (I'm now using version 2:2.4.1-1ubuntu10).
I use a dual-screen setup in xorg.conf (even when I'm not connected to an external monitor), but I think it also happened with a plain setup.
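
For anyone checking their own log for messages like the underrun above, here is a minimal sketch that lists the error lines in an Xorg log. The function name `find_problems` is made up for illustration, and the optional `[ 12.345]` timestamp prefix is only an allowance for newer log layouts; it is not something taken from the attached files.

```python
import re

# Xorg log lines begin with a marker: (II) info, (WW) warning, (EE) error.
# Newer servers may put a "[   12.345]" timestamp first, so allow for it.
MARKER = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?\((EE|WW)\)\s*(.*)$")


def find_problems(log_text, markers=("EE",)):
    """Return (line_number, marker, message) for each matching log line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = MARKER.match(line)
        if match and match.group(1) in markers:
            hits.append((number, match.group(1), match.group(2)))
    return hits
```

Pass it the contents of /var/log/Xorg.0.log (or Xorg.0.log.old for the previous session), and use `markers=("EE", "WW")` to include warnings as well.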

Interestingly, it has happened a couple of times when I was listening to music, and after the crash I could still hear some distorted sound...

First I thought it was hardware problem, but all tests passed successfully and I even had Dell change the motherboard.

I also ran kubuntu from an external HD, and it crashed again anyway.

So you guys are my last hope :)

[lspci]
00:00.0 Host bridge [0600]: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub [8086:2a00] (rev 0c)
     Subsystem: Dell Device [1028:01f9]
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller [8086:2a02] (rev 0c)
     Subsystem: Dell Device [1028:01f9]

Vu Ngoc San (san-vu-ngoc) wrote :
Geir Ove Myhr (gomyhr) wrote :

Thank you for reporting this bug, and for providing an illustrative screenshot. Could you also upload the following:
- The output of `lspci -vvnn`
- /var/log/Xorg.0.log
- /etc/X11/xorg.conf
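
The three items above can be gathered in one go. This is only a rough sketch: the `collect` helper, the output file name `lspci.txt`, and the output directory are assumptions for illustration, not part of any Ubuntu tool.

```python
import shutil
import subprocess
from pathlib import Path

# The files and the command requested in this comment.
FILES = ["/var/log/Xorg.0.log", "/etc/X11/xorg.conf"]
COMMANDS = {"lspci.txt": ["lspci", "-vvnn"]}  # output file name is arbitrary


def collect(out_dir, files=FILES, commands=COMMANDS):
    """Copy the files and save the command output into out_dir.

    Returns (saved, missing) so it is clear what could not be gathered.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, missing = [], []
    for name in files:
        source = Path(name)
        if source.is_file():
            shutil.copy(source, out / source.name)
            saved.append(source.name)
        else:
            missing.append(name)
    for target, argv in commands.items():
        try:
            result = subprocess.run(argv, capture_output=True, text=True, check=False)
        except OSError:
            # The command is not installed or cannot be executed.
            missing.append(" ".join(argv))
            continue
        (out / target).write_text(result.stdout)
        saved.append(target)
    return saved, missing
```

Calling `collect("bug-309927")` would leave the copies in a directory of that name, ready to attach to the report.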

You say that the keyboard is not responsive. Does that also mean that CapsLock does not turn on and off the light on the keyboard? If the light is unresponsive, it probably means that the bug is in the kernel. Does anything show up in /var/log/kern.log at the time of a crash?

Changed in xserver-xorg-video-intel:
status: New → Incomplete
Vu Ngoc San (san-vu-ngoc) wrote :

I had another crash tonight at 18:37. As often, it occurred at the login dialog, when I left it for a couple of minutes after booting without doing anything. I took a screenshot and can upload it if you need it, but it is basically the same as the one I already put here.

I forgot to test CapsLock, but my guess is that it was not responsive either.

Here are the requested files, obtained immediately after I did a hard reboot.

Vu Ngoc San (san-vu-ngoc) wrote :

- /var/log/Xorg.0.log
(after reboot after crash)

Vu Ngoc San (san-vu-ngoc) wrote :

- /var/log/Xorg.0.log.old
(just before 18:37 crash)

Vu Ngoc San (san-vu-ngoc) wrote :

xorg.conf

Vu Ngoc San (san-vu-ngoc) wrote :

I see nothing special in kern.log at 18:37.

Here is the file. I hope I may send it publicly.
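
For readers doing the same check, here is a small sketch that pulls out the kern.log lines stamped within a few minutes of a crash time such as 18:37. It assumes the traditional syslog layout ("Mon DD HH:MM:SS host ..."); the function name `lines_near` and the example date are hypothetical. It compares the time of day only and does not handle a window that crosses midnight, so pass `date` to restrict it to a single day.

```python
from datetime import datetime, timedelta


def lines_near(log_text, crash_time, minutes=5, date=None):
    """Return log lines stamped within `minutes` of crash_time ("HH:MM").

    `date` is an optional "Mon DD" string (for example "Dec 20") that
    limits the match to one day.
    """
    target = datetime.strptime(crash_time, "%H:%M")
    window = timedelta(minutes=minutes)
    nearby = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a syslog-style line
        if date is not None and " ".join(fields[:2]) != date:
            continue
        try:
            stamp = datetime.strptime(fields[2], "%H:%M:%S")
        except ValueError:
            continue  # third field is not a time stamp
        if abs(stamp - target) <= window:
            nearby.append(line)
    return nearby
```

An empty result is itself informative: it suggests the kernel wrote nothing to disk around the time of the lockup.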

Vu Ngoc San (san-vu-ngoc) wrote : Re: [Bug 309927] Re: random crash with vertical stripes

> You say that the keyboard is not responsive. Does that also mean that
> CapsLock does not turn on and off the light on the keyboard? If the
> light is unresponsive, it probably means that the bug is in the kernel.
> Does anything show up in /var/log/kern.log at the time of a crash?

Indeed, I confirm that CapsLock does not turn on and off the light on the
keyboard.

Also, last time I checked, I did a hard reboot, and after the reboot the
screen had bad pixels all over, but the system was running OK. The bad pixels
disappeared after an X restart.

Geir Ove Myhr (gomyhr) wrote :

The fact that CapsLock didn't work suggests that this is a kernel issue. Since I'm not an expert on either, I will set the status to confirmed for xserver-xorg-video-intel (it will be set to invalid if it turns out to be a kernel issue) and open it for linux (the kernel) as well.

I would guess that this will be a difficult kernel bug, since it occurs so rarely. The main thing that can lead the kernel developers to the cause is what the kernel dumps to the console when it crashes. This is hard to get, since you need either to configure an external console (see https://wiki.ubuntu.com/KernelTeam/Netconsole) or to get a picture of the text console when the kernel has crashed. Since the crash usually happens when you're not working on the computer, maybe you can make it a habit to press Ctrl+Alt+F1 to bring the console up (Ctrl+Alt+F7 takes you back to the graphical interface). When it crashes, you can then take a photo of the monitor.

General information about kernel bugs is available at https://wiki.ubuntu.com/KernelTeamBugPolicies . You can already include the minimal information noted there (you already have the output of `lspci -vvnn`).

PS: Anyone with more kernel knowledge than me, feel free to correct me if anything I wrote was inaccurate.

Changed in xserver-xorg-video-intel:
status: Incomplete → Confirmed
Bryce Harrington (bryce)
description: updated
Bryce Harrington (bryce) wrote :

This sort of sounds like another GM45 lockup error that was recently fixed:
https://bugs.freedesktop.org/show_bug.cgi?id=17292

Please test with latest updates to Jaunty and see if the bug still exists.

Changed in xserver-xorg-video-intel:
status: Confirmed → Incomplete
Vu Ngoc San (san-vu-ngoc) wrote :

Thanks for the suggestion.

Is it possible to test the new intel module without using Jaunty?
Since the bug appears randomly about once every other week, I would need to work every day on Jaunty for a while, and I fear I cannot afford this (Jaunty is only an alpha version).

Vu Ngoc San (san-vu-ngoc) wrote :
Vu Ngoc San (san-vu-ngoc) wrote :
Vu Ngoc San (san-vu-ngoc) wrote :
Bryce Harrington (bryce) wrote :

No, this one really does need to be tested against Jaunty.

It's too bad you're not able to test it; I'm fairly sure it's fixed though so will close it at this time. Feel free to reopen after you have time to test on jaunty if it still occurs, in which case we'll need you to attach a backtrace - see http://wiki.ubuntu.com/X/Backtracing for guidance.

Changed in xserver-xorg-video-intel:
status: Incomplete → Fix Released
Vu Ngoc San (san-vu-ngoc) wrote :

Well, I have finally installed Jaunty (alpha 5) on an external hard drive...
and it crashed after about 10 hours! :(
See screenshot

This is so annoying and I really have no clue !

I'll try backtracing. Also maybe I should contact Dell again.

Vu Ngoc San (san-vu-ngoc) wrote :

In the end, it seems that it was a hardware problem.
Although memtest86 ran for 40 hours without any error, RAM was my last hope and I convinced Dell to change it.
It has now been 2 months without a crash, so I guess that was it!
(Phew... almost 2 years of bug searching...)

Sorry for this; apparently this was not an Ubuntu bug!

Geir Ove Myhr (gomyhr) wrote :

Thank you for letting us know. Closing for linux as well.

Changed in linux (Ubuntu):
status: New → Invalid
Changed in xserver-xorg-video-intel (Ubuntu):
status: Fix Released → Invalid