Comment 163 for bug 108527

Bryce Harrington (bryce) wrote : Re: X freezes when compiz is enabled on ATI cards

@luis, thanks for the backtrace. Interestingly, it does not match the backtrace Matt posted (comment #52), although both end with "#0 0xffffe410 in __kernel_vsyscall ()".

For everyone else, the single most useful thing you can post for this specific bug is a backtrace (see https://wiki.ubuntu.com/DebuggingXorg). lspci, Xorg.0.log, and xorg.conf don't give the info we need here (although it doesn't hurt to post them).

@luis, the -dbg and -dbgsym packages aren't the same thing, but either will do for our purposes, and -dbg is typically fine for Xorg. Your backtrace shows all the function calls, so you've done the correct thing. It is not unknown for an issue to change when a debug package is installed: sometimes this affects timing and delays things long enough for a deadlock to break or a race condition to disappear. However, that is rare, and I doubt it will affect our debugging.

This particular backtrace describes a fault during output flushing. I suspect this is a secondary issue (i.e., X failed on our unknown issue, then you attached and told it to continue, and it did for a bit, but the original issue had left the system in a bad state, so it crashed again later). So what we need is a backtrace of that original lockup.

@luis, please see if you can capture a few more backtraces, perhaps taking them in different ways (such as immediately after attaching, before doing cont).
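For anyone who wants to script the "backtrace immediately after attaching, before cont" variant, here is a minimal sketch in Python. It assumes gdb is installed and that you run it with enough privileges to attach to the X server (typically root, from an ssh session or a virtual terminal). The helper names and the output file name are made up for illustration; only the gdb options (-batch, -ex, -p) are gdb's own.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints backtraces, and exits."""
    return [
        "gdb",
        "-batch",                           # run the -ex commands, then exit (gdb detaches on exit)
        "-ex", "thread apply all bt full",  # full backtrace for every thread
        "-p", str(pid),                     # attach to the already-running process
    ]


def capture_backtrace(pid, out_path):
    """Run gdb against `pid` and save everything it prints to `out_path` (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
    return result.returncode
```

Usage would be something like capture_backtrace(1234, "xorg-bt-1.txt"), where 1234 stands in for whatever PID "pidof Xorg" reports on your system. Running it a few times while X is locked up gives several snapshots to compare.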

I was partly able to reproduce a lockup on an ATI R350 9800 using Firefox 3, without Compiz enabled. In this case, I could switch to a virtual terminal (Ctrl-Alt-F1) and pkill firefox, and everything came back. I could reproduce that situation at will (it seems to have something to do with ff3's URL autocomplete dropdown). After re-reading every comment on this bug, I suspect that a client application may be triggering it (screensavers, games, office applications, proprietary software, compiz, etc. are all potential suspects). So a secondary troubleshooting approach, once the problem occurs, is to go to a vt and kill apps one at a time, and see if that unlocks the system. If it does, then reproducing the lockup, attaching to that application's process (just like in the DebuggingXorg procedure), and getting a backtrace before killing it may be of use.

Ultimately, what we're looking for is a common function call and/or invalid function parameters that jam up the system. Once we have that, we can explore what in the driver is failing under those circumstances.
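To make the "common function call" hunt less tedious once several backtraces have been collected, something like the following Python sketch could help. It assumes the usual gdb frame layout ("#N 0xADDR in func (args) ..." or "#N func (args) at file:line"); the function names are hypothetical, and the regular expression is a rough guess that will miss unusual frames.

```python
import re
from collections import Counter

# Matches frames such as "#0  0xffffe410 in __kernel_vsyscall ()"
# and "#3  main (argc=1) at main.c:10"; group 2 is the function name.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)\s*\(")


def frame_functions(backtrace_text):
    """Return the function names in one backtrace, innermost frame first."""
    funcs = []
    for line in backtrace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            funcs.append(m.group(2))
    return funcs


def common_functions(backtraces):
    """Return the sorted names of functions that appear in every backtrace given."""
    counts = Counter()
    for text in backtraces:
        counts.update(set(frame_functions(text)))  # count each name once per backtrace
    return sorted(name for name, n in counts.items() if n == len(backtraces))
```

Feed common_functions() a list of backtrace texts (one string per posted backtrace). Keep in mind that a frame like __kernel_vsyscall shows up in nearly any backtrace taken while a process is inside a system call, so the interesting overlap is usually further up the stack.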

For those of you who haven't given up and gone back to 'doze and who are willing to help with this deep-down troubleshooting work, thanks ahead of time!