X server (S3 Savage) freezing with DRI

Bug #41340 reported by Akkana Peck on 2006-04-25
Affects: xserver-xorg-video-savage (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

I've been having problems with the X server freezing on my laptop (a Vaio SR17 with an S3 Savage video card). Symptoms sound somewhat similar to bug 30447, but that bug is for ATI cards.

In Breezy: X starts up just fine, either from gdm or from the console. But if I try to exit X, either by exiting back to the console (if I'm not running gdm) or by something like ctl-alt-Backspace from gdm, the machine will lock up about three times out of four: the screen fills with colorful garbage, the keyboard is unresponsive even to keystrokes like ctl-alt-F2, and I can't connect from another machine via the network. I have to pull the power plug. This also sometimes happens when shutting down from within X (e.g. a normal shutdown with the gdm gnome desktop running).

In Dapper Flight 6, it's worse: most of the time I can't even start up the default gnome desktop; the machine displays garbage instead of the normal X startup screen when booting, then freezes (often but not always with the CPU pegged, judging by the sound of the CPU fan). If I turn off gdm I can boot into a console, but startx will cause one of these lockups about half to 2/3 of the time.

I know it's not easy to debug something that leaves the machine unresponsive and only happens on specific hardware. But I'd love to find out what's at fault here if anyone is willing to suggest methods of debugging (feel free to contact me on irc if a real-time session might be helpful).

Here's the machine's xorg.conf from breezy, in case that helps.

Can you attach /var/log/Xorg.0.log also please?

This Xorg.0.log is from Dapper flight 6. The screen went dark, an underscore cursor (sized as for the text console) appeared in the upper left, the ubuntu/gnome start music played, but the screen never changed from the all-dark-except-underscore-cursor. Keyboard is unresponsive.

Can you try disabling DRI? (Just a shot in the dark, but this helps in many cases.) Comment out the Load "dri" line in xorg.conf.
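
A minimal sketch of that edit, for anyone who prefers to script it. The sample Module section is made up for illustration (it is not the reporter's actual xorg.conf), and `disable_dri` is a hypothetical helper name. It only transforms text; writing the result back to /etc/X11/xorg.conf (after making a backup) is left to the reader.

```python
import re

# Matches an active Load "dri" line; already-commented lines start with '#'
# and therefore do not match. xorg.conf keywords are case-insensitive.
DRI_LINE = re.compile(r'^([ \t]*)(Load[ \t]+"dri")', re.MULTILINE | re.IGNORECASE)


def disable_dri(conf_text: str) -> str:
    """Return conf_text with any active Load "dri" line commented out."""
    return DRI_LINE.sub(r"\1#\2", conf_text)


# Illustrative input only, not taken from the attached xorg.conf.
SAMPLE = '''Section "Module"
    Load "dbe"
    Load "dri"
    Load "glx"
EndSection
'''

if __name__ == "__main__":
    print(disable_dri(SAMPLE))
```

Running the function twice leaves the text unchanged, so it should be safe to apply to a file that has already been edited by hand.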

Akkana Peck (akkzilla) wrote :

Good idea! I should have thought of that, since I know a lot of the opengl X screensavers have locked up this machine in the past.

Indeed, I've been able to start and quit X three times in a row since commenting out that line, and startup seems a lot faster, too.

I'll keep an eye on it (and I'll try that on the breezy partition too, and see if it cures the crash-at-shutdown) and will comment again if I see any more lockups, but I suspect you've found the problem.

Tormod Volden (tormodvolden) wrote :

I actually had hard freezes on my laptop with a savage TwisterK, while running the Dapper Beta 2 live installer. The crashes came during partitioning or "loading hardware modules".

This happened three times. I could use SysRq-B to reboot, but then it would hang again during the reboot. I had to power off for a while.

I commented out Load "dri" from xorg.conf and restarted X (and unloaded the savage and drm modules) and now I have successfully installed Beta 2, which runs fine without DRI. I will try to reenable DRI and see how it goes.

This was not happening in flight 6, even after having updated to the 2.6.15-21-386 kernel.

Tormod Volden (tormodvolden) wrote :

> This was not happening in flight 6, even after having updated to the 2.6.15-21-386 kernel.

To correct myself, I also had freezes back then when resuming from hibernation (bug #38500).

I reenabled DRI in my Beta 2 system, and first trials of sleep and hibernation (for a short time) work fine. But when I let the machine sit idle for an hour, it froze with a black screen (LCD backlight still on), and I had to power it off.

I've heard that the Savage DRI problems are due to agpgart problems. I no longer have a laptop with a Savage chip, but if you're interested in getting to the bottom of this problem you may want to reopen http://bugzilla.kernel.org/show_bug.cgi?id=4607

Tormod Volden (tormodvolden) wrote :

Thanks Johan. See also https://bugs.freedesktop.org/show_bug.cgi?id=3835 where they say some savage dri crashes might have been fixed in drm 1.0.1.

Akkana Peck (akkzilla) wrote :

This is still a problem on edgy (at least under xubuntu): it hangs the machine when you reboot into the newly installed OS. I had to boot from another partition and edit xorg.conf to remove the dri line before I could use edgy.

Tormod Volden (tormodvolden) wrote :

Does the machine hang completely, or just the screen and keyboard? Can you try to ssh in from another machine, before starting X with dri enabled? You might try running "sudo cat /proc/kmsg" in the ssh session to see if any error messages appear. You can also (before starting X) unload the savage and drm modules, and reload the drm module with "modprobe drm debug=1" which gives a lot of debug messages.

Akkana Peck (akkzilla) wrote :

It's just the screen and keyboard: if I ssh in first, I still have a working shell after X has been started (and hung).

Nothing shows up in /proc/kmsg without debug messages, but with debug=1, I see this:

<7>[17179854.036000] [drm:drm_stub_open]
<7>[17179854.036000] [drm:drm_stub_open]
<7>[17179854.368000] [drm:drm_init]
<7>[17179854.368000] [drm:drm_get_dev]
1:00.0[A] -> Link [LNKA] -> GSI 9 (level, low) -> IRQ 9
<7>[17179854.372000] [drm:drm_ctxbitmap_next] drm_ctxbitmap_next bit : 0
<7>[17179854.372000] [drm:drm_ctxbitmap_init] drm_ctxbitmap_init : 0
<7>[17179854.372000] [drm:drm_get_head]
<7>[17179854.372000] [drm:drm_get_head] new minor assigned 0
<6>[17179854.372000] [drm] Initialized savage 2.4.1 20050313 on minor 0
<4>[17179854.376000] mtrr: 0xf0000000,0x1000000 overlaps existing 0xf0000000,0x800000
<7>[17179854.376000] [drm:drm_addmap_core] offset = 0xf1000000, size = 0x00080000, type = 1
<4>[17179854.376000] mtrr: 0xf0000000,0x1000000 overlaps existing 0xf0000000,0x800000
<7>[17179854.380000] [drm:drm_addmap_core] offset = 0x00000000, size = 0x00002000, type = 2
<7>[17179854.380000] [drm:drm_setup]
<7>[17179854.380000] [drm:drm_ioctl] pid=4324, cmd=0xc0086401, nr=0x01, dev 0xe200, auth=1
<7>[17179854.380000] [drm:drm_ioctl] pid=4324, cmd=0xc0106407, nr=0x07, dev 0xe200, auth=1
<7>[17179854.380000] [drm:drm_addmap_core] offset = 0x00000000, size = 0x00002000, type = 2
<7>[17179854.384000] [drm:drm_mmap] start = 0xb21f1000, end = 0xb21f3000, offset = 0xccbe3000
<7>[17179854.384000] [drm:drm_vm_open] 0xb21f1000,0x00002000
<7>[17179854.384000] [drm:drm_do_vm_shm_nopage] shm_nopage 0xb21f2000
<7>[17179854.384000] [drm:drm_addmap_core] offset = 0xf0000000, size = 0x00800000, type = 0
<7>[17179854.384000] [drm:drm_addmap_core] Matching maps of type 0 with mismatched sizes, (8388608 vs 16777216)
<7>[17179854.384000] [drm:drm_ioctl] pid=4324, cmd=0xc0086426, nr=0x26, dev 0xe200, auth=1
<7>[17179854.384000] [drm:drm_ioctl] pid=4324, cmd=0xc0246400, nr=0x00, dev 0xe200, auth=1
<7>[17179854.388000] [drm:drm_ioctl] pid=4324, cmd=0x80206433, nr=0x33, dev 0xe200, auth=1
<7>[17179854.388000] [drm:drm_ioctl] pid=4324, cmd=0x80206433, nr=0x33, dev 0xe200, auth=1
<6>[17179854.388000] agpgart: Found an AGP 1.0 compliant device at 0000:00:00.0.
<6>[17179854.388000] agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode
<6>[17179854.388000] agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode
<7>[17179854.392000] [drm:drm_ioctl] pid=4324, cmd=0x40086436, nr=0x36, dev 0xe200, auth=1
<7>[17179854.396000] [drm:drm_ioctl] pid=4324, cmd=0xc0186415, nr=0x15, dev 0xe200, auth=1
<7>[17179854.396000] [drm:drm_ioctl] pid=4324, cmd=0xc0186415, nr=0x15, dev 0xe200, auth=1
<7>[17179854.396000] [drm:drm_ioctl] pid=4324, cmd=0xc0186415, nr=0x15, dev 0xe200, auth=1
<7>[17179854.400000] [drm:drm_ioctl] pid=4324, cmd=0xc0186415, nr=0x15, dev 0xe200, auth=1
<7>[17179854.400000] [drm:drm_ioctl] pid=4324, cmd=0xc0186415, nr=0x15, dev 0xe200, auth=1
<7>[17179854.400000] [drm:drm_mmap] start = 0xb21f0000, end = 0xb21f1000, offset = 0x5b74000
<7>[17179854.400000] [drm:drm_vm_open] 0xb21f0000,0x00001000
<7>[17179854.400000] [drm:drm_addbufs_agp] count: 3...
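
The cmd= values in the drm_ioctl lines above can be unpacked by hand. The following is a minimal sketch, assuming the generic Linux ioctl bit layout (8 bits nr, 8 bits type, 14 bits size, 2 bits direction) that x86 uses; `decode_ioctl` and `Ioctl` are names invented for this illustration.

```python
from typing import NamedTuple

# Field widths of the generic Linux ioctl encoding (assumed; this is the
# layout used on x86). Fields are packed from the low bits upward.
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
TYPE_SHIFT = NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

# Direction bits: 1 = _IOC_WRITE, 2 = _IOC_READ, 3 = both.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read+write"}


class Ioctl(NamedTuple):
    direction: str
    size: int       # size of the argument structure in bytes
    type_char: str  # driver "magic" character
    nr: int         # command number within that driver


def decode_ioctl(cmd: int) -> Ioctl:
    """Split a 32-bit ioctl command number into its four fields."""
    nr = cmd & ((1 << NR_BITS) - 1)
    type_code = (cmd >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1)
    size = (cmd >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1)
    direction = (cmd >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)
    return Ioctl(DIRECTIONS[direction], size, chr(type_code), nr)


if __name__ == "__main__":
    # Values copied from the log above.
    for cmd in (0xC0086401, 0x80206433, 0x40086436):
        print(hex(cmd), decode_ioctl(cmd))
```

For cmd=0xc0086401 this yields direction read+write, size 8, type 'd' and nr 1, which agrees with the nr=0x01 that the kernel printed on the same line.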


Tormod Volden (tormodvolden) wrote :

Akkana, I don't see anything particularly bad in that debug output. OTOH, since your kernel is not crashing anyway, there is not necessarily anything wrong with the drm kernel module in your case. Can you try "top" and see what's busy? I guess the Xorg server is stuck on something, in which case it would be nice if you could get a trace or stack dump: https://wiki.ubuntu.com/DebuggingProgramCrash
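
One way to skim a dump like this is to filter on the priority prefix in angle brackets, where lower numbers are more severe (4 = warning, 6 = info, 7 = debug). A rough sketch follows; `at_or_above` is a hypothetical helper, and the sample lines are copied from the log earlier in this report.

```python
import re

# Kernel log lines here look like "<7>[17179854.036000] [drm:drm_init]".
KMSG_LINE = re.compile(r"^<(\d)>\[\s*(\d+\.\d+)\]\s*(.*)$")


def at_or_above(lines, max_level=4):
    """Return (level, message) pairs at priority max_level or more severe."""
    hits = []
    for line in lines:
        match = KMSG_LINE.match(line.strip())
        if match and int(match.group(1)) <= max_level:
            hits.append((int(match.group(1)), match.group(3)))
    return hits


# Lines taken verbatim from the debug output in this bug report.
SAMPLE = [
    "<6>[17179854.372000] [drm] Initialized savage 2.4.1 20050313 on minor 0",
    "<4>[17179854.376000] mtrr: 0xf0000000,0x1000000 overlaps existing 0xf0000000,0x800000",
    "<7>[17179854.376000] [drm:drm_addmap_core] offset = 0xf1000000, size = 0x00080000, type = 1",
    "<4>[17179854.376000] mtrr: 0xf0000000,0x1000000 overlaps existing 0xf0000000,0x800000",
    "<6>[17179854.388000] agpgart: Found an AGP 1.0 compliant device at 0000:00:00.0.",
]

if __name__ == "__main__":
    for level, message in at_or_above(SAMPLE):
        print(level, message)
```

On the sample lines, the only entries at warning level are the two mtrr overlap messages; everything else is info or debug.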

My bug is maybe a different one, because my machine hangs for real (kernel freeze).

Akkana Peck (akkzilla) wrote :

There's no additional process that shows up, under top or pstree. strace on the X process once it's hung shows:

10:00:47.970272 select(256, [1 3 4 10 11 12], NULL, NULL, {447, 556000}

Typing keys on the keyboard does break that select() and cause things to happen in strace -- so I guess X is at least semi-alive and getting keyboard events, just not reacting to them. (ctl-alt-backspace causes strace activity but doesn't cause the X server to quit.)
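
For readers less familiar with select(): the last argument in the strace line is the timeout (447 seconds and 556000 microseconds), and the call returns early as soon as any listed descriptor becomes readable. A small sketch of that behaviour, using a pipe as a stand-in for the keyboard device (an assumption for illustration only, not how the X server opens its inputs):

```python
import os
import select

# The pipe's read end plays the role of an input device descriptor.
read_fd, write_fd = os.pipe()

# Nothing has been written yet, so select() waits out the short timeout
# and reports no readable descriptors.
idle, _, _ = select.select([read_fd], [], [], 0.1)

# Simulate a key press by writing one byte to the other end.
os.write(write_fd, b"k")

# Now select() returns immediately with the descriptor marked readable.
woken, _, _ = select.select([read_fd], [], [], 0.1)

print("before write:", idle)
print("after write:", woken)

os.close(read_fd)
os.close(write_fd)
```

This matches what the strace shows: the server is blocked in select() and wakes on input, so the hang is in how it handles (or fails to act on) the events afterwards.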

Just in case it helps anybody, I'll attach a log of what strace shows when I type ctl-alt-Backspace. I'm not seeing anything obvious here, but maybe someone more familiar with X might see something.

Did I mention that killing X doesn't bring the machine back? The X process dies but the screen doesn't come back, and sequences like ctl-alt-F2 don't help. Worse, I can't reboot cleanly either: typing reboot from the remote shell prints a message that it's going to reboot, breaks the network connection, then the machine goes into a CPU-spinning loop and pulling the plug is the only solution I've found.

Ashley Hooper (ash-hooper) wrote :

S3 Savage DRI has now been fixed with Alex Deucher's patch for the Xorg 7.1 Savage module - see here: https://bugs.freedesktop.org/show_bug.cgi?id=6357

Can we possibly get this included for Feisty?

Tormod Volden (tormodvolden) wrote :

This was already fixed in Edgy, see bug #46314. The Ubuntu changelog entry is wrong: it says fd.o bug 7041, but it should have said fd.o bug 6357 (the patch was attachment number 7041...).

Timo Aaltonen (tjaalton) wrote :

Feisty has a newer driver which has this fix.

Changed in xserver-xorg-video-savage:
status: Unconfirmed → Fix Released