drm crash, possibly because of savage video-driver

Bug #37218 reported by Eric Feliksik
Affects                       Status        Importance  Assigned to  Milestone
Linux                         Fix Released  Medium
X.Org X server                Confirmed     High
linux-source-2.6.15 (Ubuntu)  Invalid       Medium      Unassigned
linux-source-2.6.20 (Ubuntu)  Invalid       Medium      Unassigned

Bug Description

I didn't use the computer for a few hours. When coming back:
- screen wouldn't come on when moving the mouse (IIRC, the LED *was* green; the screen was just black)
- num-lock on keyboard didn't respond
- no response to ping-requests

The system had crashed. This has happened more often over the last few weeks. The same situation often occurs when my system resumes from hibernation: going down goes fine and resuming seems to be OK, but at the moment I'd expect X to come up, the system stops responding. (Sometimes coming up from hibernation works, however.)

syslog:
Mar 29 20:48:24 dapperdrake kernel: [4326409.757000] [drm:savage_bci_wait_event_shadow] *ERROR* failed!
Mar 29 20:48:24 dapperdrake kernel: [4326409.758000] [drm] status=0x000013ff, e=0x1407

kern.log:
kern.log:Mar 29 20:48:24 dapperdrake kernel: [4326409.757000] [drm:savage_bci_wait_event_shadow] *ERROR* failed!
kern.log:Mar 29 20:48:24 dapperdrake kernel: [4326409.758000] [drm] status=0x000013ff, e=0x1407
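
One way to read the status/e pair in these messages, as a rough sketch: assuming (the report itself does not say this) that the low 16 bits of status are a wrapping event counter and e is the event number the driver was waiting for, the difference shows how far the hardware was behind when the wait timed out. The function name event_lag is made up for illustration; the sample lines are the three status lines quoted in this report.

```python
import re

# Matches the "[drm] status=0x..., e=0x..." line that follows the
# savage_bci_wait_event_shadow error in the logs quoted in this report.
STATUS_RE = re.compile(r"status=0x([0-9a-fA-F]+), e=0x([0-9a-fA-F]+)")


def event_lag(line):
    """Return how many events the reported status trails the awaited event e.

    Assumption: the low 16 bits of status are a 16-bit wrapping counter.
    Returns None if the line contains no status/e pair.
    """
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    awaited = int(match.group(2), 16)
    # Modular subtraction so the result stays correct across counter wrap.
    return (awaited - (status & 0xFFFF)) & 0xFFFF


samples = [
    "[drm] status=0x000013ff, e=0x1407",
    "[drm] status=0x00008b1f, e=0x8b2b",
    "[drm] status=0x00003d7e, e=0x3d88",
]
lags = [event_lag(line) for line in samples]
for line, lag in zip(samples, lags):
    print(line, "-> lag", lag)
```

Under that assumption, each failure left the counter a handful of events short of the awaited value, which would fit the hardware having stopped processing commands rather than a corrupted status word.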

Note that I use the savage driver because I have a savage8 chip:
eric@dapperdrake:~$ lspci | grep 00:01: ; lspci -n | grep 00:01:
0000:01:00.0 VGA compatible controller: S3 Inc. VT8375 [ProSavage8 KM266/KL266]
0000:01:00.0 0300: 5333:8d04

This driver isn't used by default (vesa is), but that's a bug:
https://launchpad.net/distros/ubuntu/+source/discover1-data/+bug/32377

Let me know if you need more info.

Matt Zimmerman (mdz) wrote :

Please follow the instructions at https://wiki.ubuntu.com/DebuggingSystemCrash if you are able to reproduce the bug

Changed in linux-source-2.6.15:
status: Unconfirmed → Needs Info
Eric Feliksik (milouny) wrote : kern.log directly after crash

I tried it several times after the system came back from hibernation (so it is not just a screensaver/sleep issue), without success:
I had just booted, and at the moment when the crash often (but not always) happens, when X should kick in, I quickly moved the mouse. Then X actually *did* kick in (maybe it was just time, or maybe the screensaver was deactivated?).
The screen was black, but then came on and displayed the desktop background with a black rounded rectangle in the middle (maybe the remains of the log-out window). The mouse pointer responded for a second, then the system did not respond anymore.

I followed the DebuggingSystemCrash instructions:
1) The attached kern.log does not, I'm afraid, contain the output of the dump. It is all there is, however (I booted from a live CD after the crash).
2) My SysRq is on my PrintScrn key; should I use some extra modifier key on top of Alt+SysRq+1?

I'm afraid the system was not 'sufficiently alive', so I also couldn't get output from tty1 or something.

On kern.log: see the crashes at "Mar 31 14:09:43" and "Apr 5 10:48:59".

Eric Feliksik (milouny) wrote : Re: system-crash, possibly because of bad video-driver

Note: I also followed the DebuggingSystemCrash instruction to load the BIOS fail-safe defaults, because of AGP video issues. After that I only enabled my USB devices.
The savage8 chip is on a Shuttle SK41G motherboard.

In , Slavag (slavag) wrote :

After resuming from software suspend (which involves switching to a text VT and
back), DRI works only partially. Simple applications (like glxgears or gl-117, a
3D game with rather simple 3D graphics) work well, but more complex
applications don't. For example, Blender doesn't draw labels on buttons, draws
some garbage instead of menus and outputs some garbage in the top left corner of
the screen (above all other windows). More complex 3D games (like tuxracer)
cause an immediate lockup. Before suspending (and after a restart of the X server)
everything works well.

In , Erik Andrén (erik-andren) wrote :

Please post your xorg.conf and your Xorg.log after resuming from suspend.
A backtrace of the hang would also be nice.

In , Slavag (slavag) wrote :

Created attachment 5364
My xorg.conf

In , Slavag (slavag) wrote :

Created attachment 5365
Xorg.log after resuming from suspend

In , Slavag (slavag) wrote :

Created attachment 5366
Screenshot of Blender's window after resuming from suspend

Blender was started _after_ resuming from suspend (it renders all of its GUI
through OpenGL). Notice the absence of any text in its window and the garbage in
the center of the screen (where the pop-up menu should be) and in the top-left
corner of the screen above the window frame. Before suspending, everything rendered
correctly.

In , Slavag (slavag) wrote :

My xorg version is 6.8.99.901 (6.9.0 RC 1) (Minimal DRI build from the X.org tree)
from dri.freedesktop.org, using the savage driver from the savage-20060115 snapshot.

Tormod Volden (tormodvolden) wrote :

Please see my duplicate bug for more debug drm output. I am not able to get a system trace using sysrq-t, because the system freezes so quickly and then doesn't react any longer.

0000:01:00.0 VGA compatible controller: S3 Inc. VT8636A [ProSavage KN133] AGP4X VGA Controller (TwisterK) (rev 01)
0000:01:00.0 0300: 5333:8d02 (rev 01)

Changed in linux-source-2.6.15:
status: Needs Info → Confirmed
Tormod Volden (tormodvolden) wrote : sysrq-t output

I managed to get this (through a running ssh connection) just as the system hung. This was after resuming from sleep, everything seemed normal, until I started "trigger" which now crashed even before the real game started. (386 kernel)

The trace output is probably not complete, but limited by the buffer size I had in the remote ssh terminal window.
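
For anyone trying to collect the same trace over ssh: one common way to request a SysRq action from a remote shell instead of the keyboard is the kernel's /proc/sysrq-trigger file. Whether that is how the trace above was obtained is an assumption; the comment does not say. A minimal sketch (needs root on a real system; the function name send_sysrq is made up for illustration):

```python
from pathlib import Path


def send_sysrq(key, trigger="/proc/sysrq-trigger"):
    """Write one SysRq command character (e.g. "t" for a task dump) to the trigger file.

    The resulting dump is written to the kernel log, not returned to the caller.
    The trigger path is a parameter so the function can be tried on a plain file.
    """
    if len(key) != 1 or not key.isalnum():
        raise ValueError("SysRq command must be a single letter or digit")
    Path(trigger).write_text(key)


# Example (as root): send_sysrq("t") to dump task states, send_sysrq("s") to sync.
```

Because the output lands in the kernel log, keeping a second ssh session open that follows the log is likely to capture more of the trace than a terminal scrollback buffer.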

Tormod Volden (tormodvolden) wrote : kern.log

This kern.log includes the following:
Cold boot, playing trigger for a long time.
Crashed - but was able to sysrq-t it seems.
Was able to reboot with sysrq-b.

After this warm boot, the old trigger screen came up when X started. The gdm spinning wait cursor came on top. And it was hung. Used sysrq-t and sysrq-s, I could see the hard drive LED flash a little in reaction to syncing. But after some time, no reaction to anything and I powered off with the power button.

This was using the 2.6.15-23-386 kernel. I had (just after the cold boot) disabled the savage ShadowStatus option in xorg.conf (and restarted X) to see if that could help. But obviously it didn't make much difference.

In , Slavag (slavag) wrote :

I have found the following in my syslog after system (or only X11?) lockup.
Lockup was caused by running screensaver 'starwars' after resuming system from
suspend-to-disk. I have also noticed small amount of garbage in the top left
corner of the screen (like on screenshot with Blender's window).

May 28 00:21:59 nout kernel: [drm:savage_bci_wait_event_shadow] *ERROR* failed!
May 28 00:21:59 nout kernel: [drm] status=0x00008b1f, e=0x8b2b

There were no other messages related to drm in syslog or Xorg.log.

Tormod Volden (tormodvolden) wrote :

I still get this with edgy beta:
kernel 2.6.17-10.28
drm 1.0.1 20051102
savage 2.4.1 20050313
xserver-xorg-video-savage 1:2.1.1-0ubuntu3

If I try to run glxgears after hibernation, the machine locks up hard with this last message:

Oct 10 02:42:14 viki kernel: [17192405.404000] [drm:savage_bci_wait_event_shadow] *ERROR* failed!
Oct 10 02:42:14 viki kernel: [17192405.404000] [drm] status=0x00003d7e, e=0x3d88

Tormod Volden (tormodvolden) wrote :

Just for information, since so many bugs are intertwined here:
In conjunction with bug #38500, I verified that the glxgears crash is due to real hibernation, and not to all the video mode juggling done before/after. The video mode save/restore etc. borks the consoles, but I get no crashes running glxgears afterwards.

Tormod Volden (tormodvolden) wrote :

I cannot reproduce this if I put this in xorg.conf: Option "DmaMode" "None"

Tormod Volden (tormodvolden) wrote :

I also tried the patch from https://bugs.freedesktop.org/show_bug.cgi?id=8662 but it didn't help.

Tormod Volden (tormodvolden) wrote :

Option "DmaType" "PCI" also helps, I could run glxgears and many GL screensavers. But I sometimes had small garbage squares in the top left corner, maybe like those mentioned in the upstream bug, and eventually it crashed (full freeze).

Changed in xorg-server:
status: Unknown → Confirmed
In , Bugzi11-fdo-tormod (bugzi11-fdo-tormod) wrote :

I have this problem on Ubuntu 6.10 RC: https://launchpad.net/bugs/37218
Option "DmaMode" "None" seems to help, Option "DmaType" "PCI" helps a little
less. Can we provide more information that would help you to track this down?
Like register dumps before/after hibernation or more debug output from drm?

robepisc (robepisc) wrote :

Sadly "Option DmaMode None" did not help here.

I'm using Xubuntu 6.10.
My video card is a "S3 Inc. VT8636A [ProSavage KN133] AGP4X VGA Controller (TwisterK) (rev 01)"

Tormod Volden (tormodvolden) wrote :

Robepisc, did you put the quotes in the right places? Do you also get the drm:savage_bci_wait_event_shadow error message? Can you get a stack trace? Please attach xorg.conf and Xorg.0.log if you can.

Tormod Volden (tormodvolden) wrote :

Even with Option "DmaMode" "None" I can reproduce the crash fairly reliably by running the mirrorblob screensaver after hibernation. But in the debug output (see attached) I cannot see the drm:savage_bci_wait_event_shadow error message.

Tormod Volden (tormodvolden) wrote :

If I use Option "BusType" "PCI" I can run mirrorblob after hibernation, even if I set DmaMode back to Any (the default). I still see some garbage in the upper left corner when running some screensavers.

To summarize my experiences (in chronological order), default settings in parentheses:

BusType + DmaType + DmaMode -> result
(AGP)   + (AGP)   + (Any)   -> crashes after hibernation with glxgears
(AGP)   + (AGP)   + None    -> crashes after hibernation with mirrorblob
(AGP)   + PCI     + (Any)   -> works a bit, but eventually crashes (with mirrorblob?) (garbage in corner)
PCI     + (PCI)   + None    -> good
PCI     + (PCI)   + (Any)   -> good (garbage in corner)

The table is not complete, and I might need to update it over time. All these reboots take time...
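
Since the quoting of these options has already tripped someone up in this thread, here is a small sketch that prints a Device section with each option name and value in its own pair of double quotes. The option names and values (BusType, DmaType, DmaMode) are the ones discussed above; the Identifier string and the helper name render_device_section are placeholders, so adapt them to the existing Device section in your xorg.conf.

```python
def render_device_section(options, identifier="Savage card", driver="savage"):
    """Return an xorg.conf Device section as text.

    Each option name and each value is wrapped in its own double quotes.
    The identifier is a placeholder, not taken from any real configuration.
    """
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        f'    Driver     "{driver}"',
    ]
    for name, value in options.items():
        lines.append(f'    Option     "{name}" "{value}"')
    lines.append("EndSection")
    return "\n".join(lines)


# The combination reported as "good" in the table: BusType PCI with DmaMode None.
section = render_device_section({"BusType": "PCI", "DmaMode": "None"})
print(section)
```

Restart X after editing xorg.conf so the driver picks up the new options.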

Cyberlion (rodrigoleao) wrote :

I have the same problem... please post your xorg.conf.

Tormod Volden (tormodvolden) wrote :

Cyberlion, see also bug #60288 which I think is a separate issue. I still get crashes after hibernation. What version are you using?

Ben Collins (ben-collins) wrote :

I see no confirmation that this was tested and is still broken in 2.6.17, much less 2.6.20.

Changed in linux-source-2.6.20:
status: Unconfirmed → Rejected
Tormod Volden (tormodvolden) wrote :

When I reported on 2007-01-24 that I still get the crashes, it was with the newest kernel at that time. I am currently running 2.6.20-7-generic. I will try Herd-4 (also with xorg 7.2) soon.

Changed in linux-source-2.6.20:
status: Rejected → Unconfirmed
Ben Collins (ben-collins) wrote :

These aren't crashes, these are error messages because mesa and drm are out of sync. Most likely the mesa driver needs to be updated, or someone needs to point out what changes the kernel needs.

Changed in linux-source-2.6.20:
status: Unconfirmed → Rejected
Changed in linux-source-2.6.15:
status: Confirmed → Rejected
Ben Collins (ben-collins) wrote :

Woops, misread this initially.

Changed in linux-source-2.6.20:
status: Rejected → Confirmed
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
Changed in linux-source-2.6.15:
assignee: nobody → ubuntu-kernel-team
status: Rejected → Confirmed
In , Daniel Stone (daniels) wrote :

Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.

Tormod Volden (tormodvolden) wrote :

Still crashing with Feisty Beta and kernel 2.6.20-13-generic. After hibernation things look OK until I run for instance glxgears and it freezes immediately before I can see any window appear.

Changed in linux:
status: Unknown → Confirmed
Ben Collins (ben-collins) wrote :

Removing milestone. Won't be fixed for release.

Changed in linux:
status: Confirmed → In Progress
In , Slava-f (slava-f) wrote :

Now, with xf86-video-savage-2.1.2 and savage drm modules from git.freedesktop.org (this version of the driver doesn't work with the drm shipped with the kernel), behavior similar to that described above (Blender doesn't render any text, garbage on the screen) is observed even without software suspend (e.g. right after a clean system boot). Simple OpenGL applications (like glxgears) work fine; more complex ones (like Blender) don't. The garbage pattern is different from the one shown above; I'll post the screenshot later.

In , Slava-f (slava-f) wrote :

Created attachment 10431
Blender window, just after system boot

Here is the screenshot of Blender's window just after system boot (without any attempt to suspend/resume) with xf86-video-savage-2.1.2 and drm modules from git.freedesktop.org. You can see that the problems that arose only after suspend+resume with savage-2.0 and drm from the kernel (such as incorrect rendering of complex objects, like fonts in Blender) now (with savage-2.1 and drm from freedesktop.org) emerge right after a clean boot, without suspend/resume.

In , Slava-f (slava-f) wrote :

Created attachment 10432
Xorg.log (xf86-video-savage-2.1.2, drm from git.freedesktop.org)

Xorg.log. No suspend/resume, just normal boot. The problem described above exists.

Sergio Zanchetta (primes2h) wrote :

The 18-month support period for Feisty Fawn 7.04 has reached its end of life. As a result, we are closing the linux-source-2.6.20 Feisty Fawn kernel task. However, please note that this report will remain open against the actively developed kernel. Thank you for your continued support and help as we debug this issue.

Changed in linux-source-2.6.20:
status: Confirmed → Invalid
Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Changed in linux-source-2.6.15 (Ubuntu):
status: Confirmed → Invalid
Changed in xorg-server:
importance: Unknown → High
Changed in linux:
status: In Progress → Fix Released
Changed in xorg-server:
importance: High → Unknown
Changed in linux:
importance: Unknown → Medium
Changed in xorg-server:
importance: Unknown → High
In , Jeremy Sequoia (jeremyhu) wrote :

Is this a kernel bug? Should this be closed out?

In , Ajax-a (ajax-a) wrote :

There's no longer a savage DRI driver.
