Bug #968265 “Resuming from sleep leaves Xorg using 100% CPU and ...” : Bugs : fglrx-installer package : Ubuntu

Simon Strandman (nejsimon) wrote on 2012-03-29:

#1

Dependencies.txt (3.6 KiB, text/plain; charset="utf-8")

Simon Strandman (nejsimon) wrote on 2012-03-29:

#2

Xorg.0.log.old (50.8 KiB, application/x-trash)

The Xorg log from when this happens doesn't seem to contain anything interesting.

Bryce Harrington (bryce) wrote on 2012-04-04:

#3

Often with these 100% X cpu bugs, the problem is caused by a client application, which is stuck in a loop making X requests. In such cases, the debugging procedure is to examine your process table (e.g. `ps aux`) and start killing processes one by one until the system unfreezes.

However, the fact that this occurs on resume makes this bug sound a bit different. It can't hurt to try the above, and it might turn up something (I'd probably test killing gnome-settings-daemon, compiz/unity and some of the other gnome infrastructural bits first).
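As a rough sketch of the first step described above, something like the following lists the busiest processes so that likely culprits can be picked out before anything is killed. The function name `top_cpu_procs` is made up for illustration, and the `--sort` option assumes the procps version of `ps` that Ubuntu ships.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): show the N processes
# using the most CPU, highest first. Assumes procps `ps` (for --sort).
top_cpu_procs() {
    count="${1:-10}"
    # +1 so the header line does not eat into the requested count
    ps -eo pid,pcpu,user,comm --sort=-pcpu | head -n "$((count + 1))"
}

# Example: top_cpu_procs 5
# A suspect client can then be stopped with e.g. `kill <pid>`; if Xorg's
# CPU usage drops afterwards, that client was probably the one looping.
```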

If all that fails, you can connect to the running X process using gdb and gather a series of backtraces to see what series of routines it is hitting. strace can be used here too, although with the X server it produces so much output it's often unusable for diagnostics.
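A minimal sketch of the gdb approach, assuming gdb is installed and the commands are run as root (ideally over ssh, since the display is unusable). The function name `sample_backtraces` and its defaults are illustrative only; without debug symbols for the X server the backtraces will be much less readable.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): attach gdb to a running
# process several times and print a backtrace of all threads each time.
# Usage: sample_backtraces <pid> [samples] [seconds-between-samples]
sample_backtraces() {
    pid="$1"; samples="${2:-5}"; pause="${3:-1}"
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: sample_backtraces <pid> [samples] [pause]" >&2
            return 2 ;;
    esac
    i=1
    while [ "$i" -le "$samples" ]; do
        echo "=== sample $i ==="
        # -batch makes gdb exit after running the -ex commands, which
        # detaches from the process and lets it continue.
        gdb -p "$pid" -batch -ex "thread apply all bt" 2>&1
        sleep "$pause"
        i=$((i + 1))
    done
}

# Example: sample_backtraces "$(pidof Xorg)" 5 2
```

Functions that show up in most of the samples are where the server is spending its time.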

Changed in fglrx-installer (Ubuntu):
status:	New → Incomplete

Simon Strandman (nejsimon) wrote on 2012-04-11:

#4

Hello!

I tried killing some processes but it didn't help. I also attached gdb to the Xorg process but gdb just hung and wouldn't produce a backtrace, even if I ran "killall Xorg" or "killall -9 Xorg" from another ssh session. It seems impossible to stop the Xorg process once it gets into this state.

Anyway, I just discovered an error in kern.log. It looks similar to bug #881526. That bug isn't about suspend/resume issues but I guess it could be the same problem. I also guess there isn't much to do about this then except wait for AMD to fix it. :(

Apr 11 18:40:38 simon-305U1A kernel: [  416.884015] [fglrx] ASIC hang happened
Apr 11 18:40:38 simon-305U1A kernel: [  416.884029] Pid: 1049, comm: Xorg Tainted: P           O 3.2.0-23-generic #36-Ubuntu
Apr 11 18:40:38 simon-305U1A kernel: [  416.884033] Call Trace:
Apr 11 18:40:38 simon-305U1A kernel: [  416.884147]  [<ffffffffa00d4ffe>] KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884196]  [<ffffffffa00e258c>] firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884281]  [<ffffffffa017e119>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884360]  [<ffffffffa017e0bc>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x9c/0xf0 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884440]  [<ffffffffa01813cc>] ? _ZN8AsicR60012IO_QuietdownEv+0x2c/0x40 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884520]  [<ffffffffa0178bd4>] ? _ZN15ExecutableUnits10CPRingIdleE15idle_WaitMethod12_QS_CP_RING_+0x134/0x1e0 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884600]  [<ffffffffa0178a4c>] ? _ZN15ExecutableUnits7PM4idleE15idle_WaitMethod+0x4c/0x90 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884678]  [<ffffffffa01785b6>] ? _ZN15ExecutableUnits9assertPM4Eb+0x56/0x70 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884756]  [<ffffffffa0182959>] ? _ZN8AsicR6009assertPM4Eb+0x39/0x80 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884809]  [<ffffffffa0101550>] ? firegl_cmmqs_disabledriver+0xf0/0xf0 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884876]  [<ffffffffa0150d65>] ? CMMQS_ReinitializeHardware+0x75/0xd0 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884929]  [<ffffffffa010266b>] ? firegl_cmmqs_Enable_QS+0xbb/0x160 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.884938]  [<ffffffff81073537>] ? capable+0x17/0x20
Apr 11 18:40:38 simon-305U1A kernel: [  416.884990]  [<ffffffffa0101562>] ? firegl_cmmqs_enableqs+0x12/0x70 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.885041]  [<ffffffffa0101550>] ? firegl_cmmqs_disabledriver+0xf0/0xf0 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.885088]  [<ffffffffa00de12d>] ? firegl_ioctl+0x1ed/0x250 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.885130]  [<ffffffffa00ce9be>] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Apr 11 18:40:38 simon-305U1A kernel: [  416.885137]  [<ffffffff81189cfa>] ? do_vfs_ioctl+0x8a/0x340
Apr 11 18:40:38 simon-305U1A kernel: [  416.885143]  [<ffffffff81659f0c>] ? __schedule+0x3cc/0x6f0
Apr 11 18:40:38 simon-305U1A kernel: [  416.885149]  [<ffffffff8118a041>] ? sys_ioctl+0x91/0xa0
Apr 11 18:40:38 simon-305U1A kernel: [  416.885156]  [<ffffffff81664a82>] ? system_call_fastpath+0x16/0x1b
Apr 11 18:40:38 simon-305U1A kernel: [  416.885164] pubdev:0xffffffffa037c320, num of device:1 , name:fglrx, major 8, minor 96. 
Apr 11 18:40:38 simon-305U1A kernel: [  416.885170] device 0 : 0xffff880135f90000 .
Apr 11 18:40:38 simon-305U1A kernel: [  416.885176] Asic ID:0x9802, revision:0x23, MMIOReg:0xffffc90012480000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885182] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x10000000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885188] gart table MC:0xf0fc55000, Physical:0xcfc55000, size:0x39e000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885194] mc_node :FB, total 1 zones
Apr 11 18:40:38 simon-305U1A kernel: [  416.885198]     MC start:0xf00000000, Physical:0xc0000000, size:0x10000000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885204]     Mapped heap -- Offset:0x0, size:0xfc55000, reference count:27, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885210]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885216]     Mapped heap -- Offset:0xfc55000, size:0x39f000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885222]     Mapped heap -- Offset:0xfff4000, size:0xc000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885226] mc_node :GART_USWC, total 3 zones
Apr 11 18:40:38 simon-305U1A kernel: [  416.885230]     MC start:0x3b690000, Physical:0x0, size:0x48400000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885235]     Mapped heap -- Offset:0x2030000, size:0x800000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885241]     Mapped heap -- Offset:0x30000, size:0x2000000, reference count:12, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885246] mc_node :GART_CACHEABLE, total 3 zones
Apr 11 18:40:38 simon-305U1A kernel: [  416.885250]     MC start:0x10400000, Physical:0x0, size:0x2b290000.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885255]     Mapped heap -- Offset:0x4800000, size:0x500000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885261]     Mapped heap -- Offset:0x4300000, size:0x500000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885267]     Mapped heap -- Offset:0x3e00000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885272]     Mapped heap -- Offset:0x2f00000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885278]     Mapped heap -- Offset:0x3400000, size:0x500000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885283]     Mapped heap -- Offset:0x3900000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885289]     Mapped heap -- Offset:0x2a00000, size:0x500000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885294]     Mapped heap -- Offset:0x2500000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885300]     Mapped heap -- Offset:0x1100000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885306]     Mapped heap -- Offset:0xc00000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885311]     Mapped heap -- Offset:0x2000000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885317]     Mapped heap -- Offset:0x1b00000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885322]     Mapped heap -- Offset:0x1600000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885328]     Mapped heap -- Offset:0x700000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885334]     Mapped heap -- Offset:0x200000, size:0x500000, reference count:2, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885339]     Mapped heap -- Offset:0x0, size:0x200000, reference count:8, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885345]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Apr 11 18:40:38 simon-305U1A kernel: [  416.885352] GRBM : 0xa0003828, SRBM : 0x20008040 .
Apr 11 18:40:38 simon-305U1A kernel: [  416.885357] CP_RB_BASE : 0x3b6c00, CP_RB_RPTR : 0x10 , CP_RB_WPTR :0x10.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885363] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x0.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885367] last submit IB buffer -- MC :0x0. Can't found mapped physical page for this MC .
Apr 11 18:40:38 simon-305U1A kernel: [  416.885373] Dump the trace queue.
Apr 11 18:40:38 simon-305U1A kernel: [  416.885377] End of dump
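
To check whether another machine hit the same condition, the marker line at the top of the dump above can be searched for. This is only a sketch: the function name `count_asic_hangs` is hypothetical, and the default path assumes Ubuntu's usual /var/log/kern.log (the comment itself only says "kern.log").

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): count how many times
# the fglrx hang marker appears in a kernel log file.
count_asic_hangs() {
    logfile="${1:-/var/log/kern.log}"
    # -F: treat the pattern as a fixed string, so the square brackets
    #     are not read as a regex character class.
    # -c: print the number of matching lines instead of the lines.
    # Note: grep exits with status 1 when the count is 0.
    grep -F -c '[fglrx] ASIC hang happened' "$logfile"
}

# Example: count_asic_hangs /var/log/kern.log
```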

Bryce Harrington (bryce) wrote on 2012-04-17:

#5

Good find, although the stack traces don't 100% match up. Also sometimes with freeze bugs, crash dumps can be misleading. Still, worth mentioning on the other bug report (which I've done.)

More importantly, the trace proves this is a gpu lockup, not an application issue as originally suspected.

@alberto, please forward to ATI.

Changed in fglrx-installer (Ubuntu):
assignee:	nobody → Alberto Milone (albertomilone)
importance:	Undecided → High
status:	Incomplete → Confirmed

Simon Strandman (nejsimon) wrote on 2012-06-25:

#6

This might have been fixed in fglrx 12.6 beta! My laptop hasn't crashed yet after a few sleep-resume cycles.

Simon Strandman (nejsimon) wrote on 2012-08-19:

#7

I can confirm that the problem is fixed in fglrx 12.6 or later. My computer hasn't crashed even once on suspend/resume since updating. However, support for my hardware was briefly dropped in 12.6, so 12.8 is the first version that's stable without the annoying "Unsupported hardware" watermark.

It would be nice if this version could be added to 12.10 before the feature freeze!

penalvch (penalvch) wrote on 2012-10-19:

#8

Simon Strandman, thank you for reporting this and helping make Ubuntu better. Could you please execute the following via a terminal:
apport-collect -p linux 968265

As well, using the fglrx-installer provided by Ubuntu, could you please provide the information following https://wiki.ubuntu.com/DebuggingKernelSuspend ?

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: high-spu
tags:	added: high-cpu removed: high-spu

Simon Strandman (nejsimon) wrote on 2012-11-05:

#9

@Christopher

Hello. This problem is fixed in fglrx 12.8 and later so I don't think there is a need to debug it now. However, it would be great if a newer version of fglrx could be backported to 12.04.

I'm using Ubuntu 12.10 now btw so I don't have the problem any more.

Simon

penalvch (penalvch) wrote on 2012-11-05:

#10

Simon Strandman, thank you for your comments. Regarding them:
>"I'm using Ubuntu 12.10 now btw so I don't have the problem any more."

In 12.10, are you using the version of fglrx-installer that comes in the Ubuntu repositories or 12.8 downloaded from amd.com?

description:	updated

Simon Strandman (nejsimon) wrote on 2012-11-06:

#11

I'm using the one from the repos (9.000-0ubuntu3) now. I also tried 12.11 beta from amd.com and it works too! I don't think 12.8 works on quantal due to the newer xserver but when I used precise I got 12.8 from amd.com.

penalvch (penalvch) wrote on 2012-11-06:

#12

Simon Strandman, thank you for providing the requested information. Since you noted fglrx-installer from the Ubuntu repositories works for you in Quantal, did you need a backport of the fix to a release prior to Quantal, or may we close this as Status Invalid?

Simon Strandman (nejsimon) wrote on 2012-11-06:

#13

For me it's fine if this bug is closed since my issue is solved. But I guess others might have the same issue? There is a very similar bug (#881526) btw and they might also be helped by a fglrx backport. But feel free to close this one!

penalvch (penalvch) wrote on 2012-11-06:

#14

Simon Strandman, this bug report is being closed due to your last comment https://bugs.launchpad.net/ubuntu/+source/fglrx-installer/+bug/968265/comments/13 regarding this being fixed for you in Quantal. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time to report this bug and helping to make Ubuntu better. Please submit any future bugs you may find.

no longer affects:	linux (Ubuntu)
Changed in fglrx-installer (Ubuntu):
assignee:	Alberto Milone (albertomilone) → nobody
status:	Confirmed → Invalid

Resuming from sleep leaves Xorg using 100% CPU and unable to turn on the screen