Comment 9 for bug 1409393

JVD (jason-vas-dias) wrote :

It just happened again, this time with the latest 3.13.0-46-lowlatency kernel and 1:7.7+1ubuntu8.1.
The messages are slightly different:

<quote>
<code>
[78988.537888] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
[78988.537900] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000007e3579e last fence id 0x0000000007e3579d on ring 0)
[78988.551392] [drm] Disabling audio 0 support
[78988.558663] radeon 0000:01:00.0: Saved 6100 dwords of commands on ring 0.
[78988.558689] radeon 0000:01:00.0: GPU softreset: 0x00000009
[78988.558692] radeon 0000:01:00.0: GRBM_STATUS = 0xA2733828
[78988.558697] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x1C000007
[78988.558703] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[78988.558706] radeon 0000:01:00.0: SRBM_STATUS = 0x20000AC0
[78988.558708] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[78988.558711] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[78988.558714] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010800
[78988.558717] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00028006
[78988.558723] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647
[78988.558726] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[78988.572609] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[78988.572663] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[78988.573833] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[78988.573835] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007
[78988.573837] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[78988.573839] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
[78988.573841] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[78988.573842] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[78988.573844] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[78988.573846] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
[78988.573848] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
[78988.573850] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[78988.573860] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[78988.593310] [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
[78988.593389] radeon 0000:01:00.0: WB enabled
[78988.593390] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880401f83c00
[78988.593391] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff880401f83c0c
[78988.594158] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90005d1c418
[78988.610532] [drm] ring test on 0 succeeded in 1 usecs
[78988.610537] [drm] ring test on 3 succeeded in 2 usecs
[78988.808404] [drm] ring test on 5 succeeded in 1 usecs
[78988.808408] [drm] UVD initialized successfully.
[78988.808409] [drm] Enabling audio 0 support
[78988.808506] HDMI ATI/AMD: no speaker allocation for ELD
[78988.819165] [drm] ib test on ring 0 succeeded in 0 usecs
[78988.819188] [drm] ib test on ring 3 succeeded in 0 usecs
[78988.980723] [drm:uvd_v1_0_ib_test] *ERROR* radeon: failed to get create msg (-22).
[78988.980787] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-22).
[78989.109091] HDMI ATI/AMD: no speaker allocation for ELD
[78989.410240] HDMI ATI/AMD: no speaker allocation for ELD
</code>
</quote>
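To see at a glance what the soft reset changed, the two register dumps in the log above can be compared mechanically. The following is only a rough sketch: the function name `register_changes` and the regular expression are illustrative assumptions, and the sample lines are copied from the log above rather than read from a live `dmesg`.

```python
import re

# Matches register lines such as
#   "[78988.558692] radeon 0000:01:00.0: GRBM_STATUS = 0xA2733828"
# and the form without spaces, e.g. "GRBM_SOFT_RESET=0x00007F6B".
# Lines without "NAME = 0x..." (fence driver, ring tests, etc.) are ignored.
REG_LINE = re.compile(r"radeon \S+: +(\w+) *= *0x([0-9A-Fa-f]+)\s*$")


def register_changes(dmesg_text):
    """Return {register: (first_value, last_value)} for registers that changed."""
    seen = {}
    for line in dmesg_text.splitlines():
        match = REG_LINE.search(line)
        if match is None:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        seen.setdefault(name, []).append(value)
    # Keep only registers reported more than once with differing values.
    return {
        name: (values[0], values[-1])
        for name, values in seen.items()
        if len(values) > 1 and values[0] != values[-1]
    }


# Four lines taken from the log above (before and after the reset).
SAMPLE = """\
[78988.558692] radeon 0000:01:00.0: GRBM_STATUS = 0xA2733828
[78988.558708] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[78988.573833] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[78988.573841] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
"""

if __name__ == "__main__":
    for name, (before, after) in register_changes(SAMPLE).items():
        print(f"{name}: 0x{before:08X} -> 0x{after:08X}")
```

Feeding it the full log (for example the saved output of `dmesg`) instead of `SAMPLE` would list every status register whose value differs between the first and the last dump.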
Once again, the Xorg server is frozen displaying the last frame and
cannot be killed. One must SSH in to the machine from the network,
run 'poweroff', and then press the power-on button to regain control
of the terminal; the VT-switch keystrokes (ALT+F[1-8]) do not work.
The gdb stack trace did not show as much info this time:
$ gdb -s /usr/lib/debug/usr/bin/Xorg -s /usr/lib/debug/.build-id/1a/a297280642f27cefdb283458289819239ae8b3.debug -s /usr/lib/debug/.build-id/4c/54eae2ae24e9a90fb22bdc4dcd9e07ee6a802c.debug /usr/bin/X -p 1810
GNU gdb (GDB) 7.8.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/X...(no debugging symbols found)...done.
Attaching to program: /usr/bin/X, process 1810
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007f68487b5337 in ?? ()
(gdb) t a a bt

Thread 1 (process 1810):
#0 0x00007f68487b5337 in ?? ()
#1 0x00007f6849a9c3a4 in ?? ()
#2 0x00007f684bd8e230 in ?? ()
#3 0x00007f684fa64d10 in ?? ()
#4 0x00007fff04864140 in ?? ()
#5 0x00007f684bd824f0 in ?? ()
#6 0x00007f684bd81e80 in ?? ()
#7 0x00007f6849a9e68e in ?? ()
#8 0x00007f684bd83b30 in ?? ()
#9 0x00007f6844600bb9 in ?? ()
#10 0x0000000000000632 in ?? ()
#11 0x00007f684bd824f0 in ?? ()
#12 0x00007f684bd81e80 in ?? ()
#13 0x00007f684fa64d10 in ?? ()
#14 0x0000000000000000 in ?? ()
(gdb) info reg
rax 0xfffffffffffffe00 -512
rbx 0xffffffff 4294967295
rcx 0xffffffffffffffff -1
rdx 0x7fff04864140 140733269295424
rsi 0x40086464 1074291812
rdi 0xc 12
rbp 0x7fff04864140 0x7fff04864140
rsp 0x7fff048640f8 0x7fff048640f8
r8 0xc 12
r9 0x237903000 9522130944
r10 0x1 1
r11 0x246 582
r12 0x40086464 1074291812
r13 0xc 12
r14 0x20 32
r15 0x7f68507a9ef0 140086003539696
rip 0x7f68487b5337 0x7f68487b5337
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
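
Even without symbols, the register dump hints at what the process is blocked in. On x86_64, rax = -512 is the kernel-internal ERESTARTSYS value typically seen when a blocked system call is interrupted by a debugger attaching, and rdi/rsi would be the first two arguments of that call. Assuming the call is ioctl(), rdi = 0xc would be file descriptor 12 and rsi = 0x40086464 the request number, which can be split into its fields using the generic Linux ioctl encoding. This is a sketch under those assumptions; `decode_ioctl` is an illustrative helper, not part of any library.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its generic fields.

    Layout (asm-generic/ioctl.h, as used on x86):
    bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
    """
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "dir": directions[(request >> 30) & 0x3],
        "size": (request >> 16) & 0x3FFF,
        "type": chr((request >> 8) & 0xFF),
        "nr": request & 0xFF,
    }


# Driver-specific DRM ioctl numbers start at this offset (DRM_COMMAND_BASE in drm.h).
DRM_COMMAND_BASE = 0x40

if __name__ == "__main__":
    fields = decode_ioctl(0x40086464)  # value of rsi and r12 in the dump above
    print(fields)
    print("driver command:", hex(fields["nr"] - DRM_COMMAND_BASE))
```

The request decodes to a write-direction ioctl of type 'd' (the DRM ioctl type) with an 8-byte argument and driver command number 0x24. That number appears to correspond to DRM_RADEON_GEM_WAIT_IDLE in radeon_drm.h, which would mean the X server is waiting in the kernel for a buffer to become idle; this should be checked against the headers of the running kernel.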