[X700] server looping and system hang (DRILock+0xfa)

Bug #295904 reported by Tormod Volden
Affects | Status | Importance | Assigned to | Milestone
xserver-xorg-driver-ati | Fix Released | Medium | |
xserver-xorg-video-ati (Ubuntu) | Fix Released | High | Unassigned |

Bug Description

Binary package hint: xserver-xorg-video-ati

Quite often (on the order of once per hour) X freezes, and within a few seconds the whole system becomes unresponsive and does not react to alt-sysrq. It happens mostly with compiz enabled.

(I have been running Intrepid off a USB drive for quite some time with few problems, but after installing it to the internal laptop drive, these sudden hangs have appeared, so I wonder if there's a correlation, like heavy PCI bus traffic screwing up. It's pretty much always when using firefox, and for instance typing in the URL bar (which causes sqlite activity). My internal drive does not work faster than the USB drive though (probably because it runs 4200rpm and the USB one 5400*), but maybe the bus traffic has more intensive peaks when doing IDE than when going over USB? Or these observations are just coincidences...)

*) The internal drive had some trouble with the new drivers in Hardy, and I had to add ata_piix to /etc/initramfs-tools/modules (bug #223235). This was solved, but maybe there's a performance regression.

Backtrace:
0: /usr/X11R6/bin/X(xorg_backtrace+0x3b) [0x813164b]
1: /usr/X11R6/bin/X(mieqEnqueue+0x237) [0x8110ba7]
2: /usr/X11R6/bin/X(xf86PostMotionEventP+0xc2) [0x80d7d72]
3: /usr/X11R6/bin/X(xf86PostMotionEvent+0x68) [0x80d7ed8]
4: /usr/lib/xorg/modules/input//synaptics_drv.so [0xa35fb426]
5: /usr/lib/xorg/modules/input//synaptics_drv.so [0xa35fdae9]
6: /usr/X11R6/bin/X [0x80cc467]
7: /usr/X11R6/bin/X [0x80b089c]
8: [0xb7f60400]
9: /usr/lib/xorg/modules/extensions//libdri.so(DRILock+0xfa) [0xb7a3fe1a]
10: /usr/lib/xorg/modules/extensions//libdri.so(DRIDoWakeupHandler+0x51) [0xb7a3fe81]
11: /usr/lib/xorg/modules/extensions//libdri.so(DRIWakeupHandler+0x6b) [0xb7a3ef2b]
12: /usr/X11R6/bin/X(WakeupHandler+0x52) [0x8090662]
13: /usr/X11R6/bin/X(WaitForSomething+0x1bb) [0x812e87b]
14: /usr/X11R6/bin/X(Dispatch+0x7e) [0x808c61e]
15: /usr/X11R6/bin/X(main+0x47d) [0x8071d6d]
16: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb7b67685]
17: /usr/X11R6/bin/X [0x8071151]
ERROR: Server Lockup! Stuck in an infinite loop. See backtrace above.
<<repeated 1317 times>>
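(Editor's note: a minimal sketch, not part of the original report, of how frames like the ones above could be pulled apart to see which module each one belongs to. The names FRAME_RE, parse_backtrace and frames_in_module are hypothetical helpers, and the frame format is assumed to match the lines shown in this backtrace.)

```python
import posixpath
import re

# Assumed frame formats, taken from the backtrace above:
#   "9: /usr/lib/xorg/modules/extensions//libdri.so(DRILock+0xfa) [0xb7a3fe1a]"
#   "6: /usr/X11R6/bin/X [0x80cc467]"      (no symbol)
#   "8: [0xb7f60400]"                      (no module at all)
FRAME_RE = re.compile(
    r"^(?P<index>\d+):\s+"
    r"(?:(?P<module>/[^\s(]+))?"
    r"(?:\((?P<symbol>[^+)]+)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_backtrace(text):
    """Return one dict per recognizable backtrace frame in `text`."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # not a frame line, e.g. the "ERROR: Server Lockup!" line
        frame = match.groupdict()
        frame["index"] = int(frame["index"])
        frames.append(frame)
    return frames


def frames_in_module(frames, basename):
    """Keep only the frames whose module file name equals `basename`."""
    return [
        frame
        for frame in frames
        if frame["module"] and posixpath.basename(frame["module"]) == basename
    ]


if __name__ == "__main__":
    sample = """\
8: [0xb7f60400]
9: /usr/lib/xorg/modules/extensions//libdri.so(DRILock+0xfa) [0xb7a3fe1a]
10: /usr/lib/xorg/modules/extensions//libdri.so(DRIDoWakeupHandler+0x51) [0xb7a3fe81]
12: /usr/X11R6/bin/X(WakeupHandler+0x52) [0x8090662]
"""
    for frame in frames_in_module(parse_backtrace(sample), "libdri.so"):
        print(frame["index"], frame["symbol"], frame["offset"])
```

Filtering on libdri.so is only an illustration; the same helper can be pointed at synaptics_drv.so or the X binary to compare backtraces from different reports.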

Revision history for this message
Tormod Volden (tormodvolden) wrote :

I tried running without compiz, and it seems to help. But then it got hung running googleearth, so maybe it's some 3D acceleration that triggers it, and that compiz does 3D when it displays the drop-down URL list in firefox.

Checked my USB installation again, and there I had compiz disabled, so that would explain it not hanging. So the different disks and buses have probably nothing to do with it.

When it's hung, I can move the cursor around, although alt-sysrq doesn't seem to react. There seems to be some reaction to the power button.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

After installing xorg-server 1.5.3 from ~xorg-edgers PPA, I was able to get this backtrace, after playing with googleearth and compiz for a while. (Hint: go to Placemark is a nice stresstest)

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati:
status: New → Confirmed
Revision history for this message
In , Bugzi09-fdo-tormod (bugzi09-fdo-tormod) wrote :

Created an attachment (id=20512)
log with "infinite loop" backtrace

(I am not sure where to file this, but I guess it is radeon specific)

Xorg locks up from time to time: typically once an hour when using compiz and firefox, and rather quickly when using googleearth, for instance when zooming in to a "placemark" by double-clicking on it in the left pane list. Mouse movements just afterwards may also help trigger it.

I am using Ubuntu 8.10 with these upgrades:
- drm modules and libdrm from git master
- mesa from git mesa_7_2_branch
- xserver from git server-1.5-branch
- radeon from git master

Default xorg.conf (no options), XAA.

01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Radeon Mobility X700 (PCIE) [1002:5653]
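
(Editor's note: a small sketch, not part of the original report, showing how the vendor and device IDs can be read out of a line like the one above, which is handy when comparing hardware across reports. LSPCI_RE and parse_lspci_line are hypothetical names, and the line is assumed to be in the `lspci -nn` style shown here.)

```python
import re

# Assumed layout: "<slot> <class name> [<class id>]: <device name> [<vendor>:<device>]"
LSPCI_RE = re.compile(
    r"^(?P<slot>[0-9a-fA-F:.]+)\s+"
    r"(?P<class_name>.+?)\s+\[(?P<class_id>[0-9a-fA-F]{4})\]:\s+"
    r"(?P<name>.+)\s+\[(?P<vendor>[0-9a-fA-F]{4}):(?P<device>[0-9a-fA-F]{4})\]"
)


def parse_lspci_line(line):
    """Return the fields of one lspci -nn style line, or None if it does not match."""
    match = LSPCI_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = (
        "01:00.0 VGA compatible controller [0300]: "
        "ATI Technologies Inc Radeon Mobility X700 (PCIE) [1002:5653]"
    )
    info = parse_lspci_line(line)
    print(info["vendor"], info["device"], info["name"])
```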

Changed in xserver-xorg-driver-ati:
status: Unknown → Confirmed
Revision history for this message
Javier Noval (javiernoval) wrote :

This same bug affects two Dell computers here, both with Intel graphics (865G and 945G), so I think that not only the driver is at fault here, but the X server too. Both computers are running Intrepid; one was upgraded from Hardy and the other is a new install. The bug is usually triggered after a weekend of inactivity.

I'm attaching the log of the latest hang in one of those computers, in case it may help. The fatal "infinite loop" message appears first at line 673350, before that the X server insisted on probing the monitor every half a minute or so (I think).

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Javier, so why would that be the same bug? Please file new bugs and attach your logs there.

Revision history for this message
Khashayar Naderehvandi (khashayar) wrote :

Tormod: This is really similar to LP #305979, which concerns intel chipsets. For that bug, there's an upstream fix. You might want to take a look at that, it might reveal some useful information.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Khashayar, the backtrace in bug #305979 looks quite different to mine.

description: updated
Revision history for this message
Khashayar Naderehvandi (khashayar) wrote :

You're probably right, Tormod, I didn't look too much into it. It's just that the triggers for the two bugs were identical (normal usage of compiz and/or opengl apps), and that your first log - the one with no backtrace - had an eerie resemblance to a lot of my logs (i.e. "[mi] EQ overflowing. The server is probably stuck in an infinite loop." ad nauseam). Also, your comment 2 describes exactly how my system behaved.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Yes, the "[mi] EQ overflowing" message is very generic. It basically just says that the graphics card got hung, for which there can be a whole lot of different reasons, or bugs that trigger it. And this typically happens with heavy 3D usage, when lots of card programming is being done.

But for each bug there will usually be a specific fix needed, so it is better to track them independently.
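
(Editor's note: a minimal sketch, not part of the original comment, of how a log could be scanned for the two generic messages quoted in this thread before deciding whether two reports really match. The function name summarize_lockup_markers and the log path in the usage example are assumptions; the marker strings are the ones quoted above.)

```python
# Generic hang markers quoted in this thread. Their presence alone does not
# identify a specific bug; the backtrace around them does.
MARKERS = {
    "eq_overflow": "[mi] EQ overflowing",
    "server_lockup": "Server Lockup! Stuck in an infinite loop",
}


def summarize_lockup_markers(lines):
    """Count each marker and record the 1-based line where it first appears."""
    summary = {name: {"count": 0, "first_line": None} for name in MARKERS}
    for lineno, line in enumerate(lines, start=1):
        for name, needle in MARKERS.items():
            if needle in line:
                entry = summary[name]
                entry["count"] += 1
                if entry["first_line"] is None:
                    entry["first_line"] = lineno
    return summary


if __name__ == "__main__":
    # The path is an assumption; adjust it to wherever the X log lives.
    with open("/var/log/Xorg.0.log", errors="replace") as log:
        for name, entry in summarize_lockup_markers(log).items():
            print(name, entry["count"], entry["first_line"])
```

Two logs with the same counts can still come from different bugs, so this is only a first filter before comparing backtraces.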

Revision history for this message
Bryce Harrington (bryce) wrote :

Hi Tormod, still seeing this problem in Jaunty?

Changed in xserver-xorg-video-ati:
status: Confirmed → Incomplete
Revision history for this message
André Pirard (a.pirard) wrote :

Same problem on an ASUS Z9200 laptop.
Display is ATI Mobility X700 (PCIE).
Extra fresh plain-vanilla Gnome 8.10 started to freeze.
Getting it up to date didn't help.

Seems to happen in the middle of anything.
E. g. dropping user switching (top right) menu down.
Also, caught it in the act of dimming a window.
So, I'm trying running with no Appearance Visual Effects.
Me thinks that's what your 'compiz' secret word means :-)
Most regretfully because of Bug #202456.

Also, Xorg may happen to do bad initialization.
The screen is barely readable.
Full of interference-like, jerking horizontal lines.
Only thing to do is leaving the session or GDM screens.

There seems to be no hardware fault, Windows seems fine.
But this laptop is a gift from France to a Belgian relative of mine.
I'm just installing & configuring, (& _trying_ to promote) Ubuntu.
So, I'll try to call them and find out and _try to_ follow up.
This not being e-mail, I can't delay sending.

Not being able to run for 1 hour deserves Importance High.

Plz have a look at Bug #306444 (X700 too & neat data)

Revision history for this message
Tormod Volden (tormodvolden) wrote :

André,
> Full of interference-like, jerking horizontal lines.
I sometimes see this when I boot, or after logging out (X is restarted). It is always fixed by switching to a virtual console and back with ctrl-alt-F1 then alt-F7.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

> Hi Tormod, still seeing this problem in Jaunty?

Bryce, sorry, I have been running without compiz since then, to get things done :) I have tried compiz (and googleearth) today and haven't had any trouble. I will close the bug in a few days if it keeps working.

Revision history for this message
André Pirard (a.pirard) wrote :

Turning Visual Effects off (running metacity) worked around the problem indeed.
Trying to disable some effects did not.
My problem is that Jaunty is not out yet, is it.
And that I must return her PC to its owner.
Only here for a small jaunt, you know.
Not going to leave with a beta, should it.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Just had a solid lock-up now, while compiz and googleearth were running. It did not even react to sysrq. Nothing in syslog or Xorg.0.log.

Revision history for this message
André Pirard (a.pirard) wrote : Re: [Bug 295904] Re: [x700] looping solved on this PC

I apparently solved my (8.10) problem Tuesday by switching to ATI's
proprietary drivers. Should be xserver-xorg-video-radeon (lsmod showed
radeon) -> xserver-driver-fglrx (was installed), what I did is just
enable Proprietary in System|Administration|Hardware Drivers.
The PC ran Compiz for more than 24 hours until now without a hitch, and
that was testing all sorts of graphic games as a selection for Snow
White's 7 grand-children as I call her ;-)
Except that the interference-like problem I mentioned remains, as well
as, what I didn't mention before, messages saying that X could not
initialize; but just retry, it hasn't got a strong mind about that ;-)
And, more Wine windows management problems I thought were gone when
using Compiz.
So, somewhat unpolished working, but working.

Revision history for this message
André Pirard (a.pirard) wrote :

By "solved my problem", I meant lockups (still up to date 8.10 Gnome).
I continue to have weird problems I suspect being x700 and driver specific.
- the aforementioned interference-like Xorg initializations.
- for example when clicking "switch user" at a GDM user password prompt : it does nothing and, after typing the password, as many messages as clicks appear, saying Xorg could not initialize
- other similar cases like not switching to another user
- Celestia would stop initializing for any user with a message saying it issued an invalid X request; all users' Appearances were showing as if 'no effects' had been configured. The system was rebooted, all users' Appearances went back to Normal Effects and Celestia was on orbit again.

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Confirmed
Bryce Harrington (bryce)
summary: - [x700] server looping and system hang (DRILock+0xfa)
+ [X700] server looping and system hang (DRILock+0xfa)
Bryce Harrington (bryce)
tags: added: freeze
Bryce Harrington (bryce)
tags: added: intrepid
Revision history for this message
Bryce Harrington (bryce) wrote :

Hi Tormod, do you see the system hang when using newer drivers? (Esp. the KMS PPA drivers) If so, might want to mention that on the upstream bug to help it get some attention.

Changed in xserver-xorg-video-ati (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
Revision history for this message
Bryce Harrington (bryce) wrote :

Tormod, also you should gather a full backtrace on the issue if you're still seeing it (http://wiki.ubuntu.com/X/Backtracing) - my guess is that this is why the upstream task has stagnated.

Revision history for this message
In , Bugzi09-fdo-tormod (bugzi09-fdo-tormod) wrote :

I have not seen these lock-ups for a while.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

I think I haven't had these crashes for a while, maybe since radeon-rewrite. Especially DRI2 seems solid. There is already a backtrace in the upstream report, but I guess/hope I can close it.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → Fix Released
Changed in xserver-xorg-driver-ati:
status: Confirmed → Fix Released
Changed in xserver-xorg-driver-ati:
importance: Unknown → Medium
Changed in xserver-xorg-driver-ati:
importance: Medium → Unknown
Changed in xserver-xorg-driver-ati:
importance: Unknown → Medium