Panda: possible recursive locking detected

Bug #1031336 reported by Botao
This bug affects 1 person
Affects                  Status         Importance   Assigned to   Milestone
Linaro Ubuntu            Fix Released   Low          Unassigned
linaro-landing-team-ti   New            Undecided    Unassigned

Bug Description

For Panda 4460 board, observed on hardware pack:

http://snapshots.linaro.org/precise/hwpacks/lt-panda-x11-base/201/hwpack_linaro-lt-panda-x11-base_20120725-201_armhf_supported.tar.gz

with ubuntu rootfs image:

http://snapshots.linaro.org/precise/images/ubuntu-desktop/313/linaro-precise-ubuntu-desktop-20120725-313.tar.gz

While running, the system hangs and crashes very frequently, especially in connection with Bluetooth, so it is not stable enough for testing or regular use. See the attachment for part of the serial console output.

##########################################################################################
On the Panda 4430 board, this "DEADLOCK" error and "omap_hwmod: aess: _wait_target_disable failed" appear on the following images:

http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/242/lt-panda-x11-base-precise_ubuntu-desktop_20120812-242.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/256/lt-panda-x11-base-precise_ubuntu-desktop_20120819-256.img.gz
http://releases.linaro.org/12.08/ubuntu/leb-panda/lt-panda-x11-base_20120826-270-ubuntu-desktop.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/279/lt-panda-x11-base-precise_ubuntu-desktop_20120830-279.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/299/lt-panda-x11-base-precise_ubuntu-desktop_20120909-299.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/312/lt-panda-x11-base-precise_ubuntu-desktop_20120916-312.img.gz
https://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/329/lt-panda-x11-base-precise_ubuntu-desktop_20120924-329.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/361/lt-panda-x11-base-precise_ubuntu-desktop_20121007-361.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/375/lt-panda-x11-base-precise_ubuntu-desktop_20121014-375.img.gz
https://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/390/lt-panda-x11-base-precise_ubuntu-desktop_20121021-390.img.gz
https://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/419/lt-panda-x11-base-precise_ubuntu-desktop_20121105-419.img.gz
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/440/lt-panda-x11-base-precise_ubuntu-desktop_20121116-440.img.gz

Botao (botao-sun) wrote :
Botao (botao-sun) wrote :

After unplugging and replugging the power cable several times to reboot the board, it cannot even boot to the UI, and the serial console hangs as well. See the attachment for the latest console log output.

Paul Larson (pwlars)
summary: - ubuntu image on Panda board is not stable enough.
+ Panda frequent hangs and crashes: possible recursive locking detected
Botao (botao-sun) wrote : Re: Panda frequent hangs and crashes: possible recursive locking detected

For Panda 4460 board, observed on Linaro ubuntu pre-built image:

http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/219/lt-panda-x11-base-precise_ubuntu-desktop_20120803-219.img.gz

Please refer to the attachment for the serial boot log.

Botao (botao-sun) wrote :

The deadlock error in the serial boot log is also observed on the Panda 4430 board, with the same pre-built image as on the Panda 4460.

Amit Khare (amit-khare) wrote :

Observed on the Panda Ubuntu build:
http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/242/lt-panda-x11-base-precise_ubuntu-desktop_20120812-242.img.gz

While booting, I see this log on minicom:

root@linaro-ubuntu-desktop:~# [ 40.347076] eth0: no IPv6 routers present
[ 42.315216] omap_hwmod: aess: _wait_target_disable failed
[ 42.324707] omap_hwmod: aess: _wait_target_disable failed
[ 42.334045] omap_hwmod: aess: _wait_target_disable failed
[ 42.344604] omap_hwmod: aess: _wait_target_disable failed
[ 42.353759] omap_hwmod: aess: _wait_target_disable failed
[ 42.362304] omap_hwmod: aess: _wait_target_disable failed
[ 42.370788] omap_hwmod: aess: _wait_target_disable failed
[ 42.379272] omap_hwmod: aess: _wait_target_disable failed
[ 42.387847] omap_hwmod: aess: _wait_target_disable failed
[ 42.396392] omap_hwmod: aess: _wait_target_disable failed
[ 42.406219] omap_hwmod: aess: _wait_target_disable faile
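When comparing boot logs across images, it can help to count how often this message repeats instead of pasting it in full. The following is a minimal Python sketch, assuming the serial console output has been saved as text; the names `HWMOD_RE` and `count_hwmod_failures` and the shortened `SAMPLE` string are hypothetical and only for illustration.

```python
import re
from collections import Counter

# Shortened sample in the same format as the minicom output above.
SAMPLE = """\
[ 40.347076] eth0: no IPv6 routers present
[ 42.315216] omap_hwmod: aess: _wait_target_disable failed
[ 42.324707] omap_hwmod: aess: _wait_target_disable failed
[ 42.334045] omap_hwmod: aess: _wait_target_disable failed
"""

# Matches "[ <timestamp>] omap_hwmod: <module>: <operation> failed".
HWMOD_RE = re.compile(r"\[\s*\d+\.\d+\]\s+omap_hwmod: (\w+): (\w+) failed")


def count_hwmod_failures(log_text):
    """Count omap_hwmod failure lines, keyed by (module, operation)."""
    return Counter(
        (m.group(1), m.group(2)) for m in HWMOD_RE.finditer(log_text)
    )


counts = count_hwmod_failures(SAMPLE)
for (module, operation), n in counts.items():
    print(f"{module}: {operation} failed x{n}")
```

A truncated final line, like the last line of the log above, does not match the pattern and is therefore not counted.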

Botao (botao-sun) wrote :

@Amit Khare, could you add the full serial console boot log as an attachment? That would be more helpful to everyone else. Thank you.

Amit Khare (amit-khare) wrote :
Botao (botao-sun) wrote :
Ricardo Salveti (rsalveti) wrote :

With http://snapshots.linaro.org/precise/images/ubuntu-desktop/376/linaro-precise-ubuntu-desktop-20120826-376.tar.gz + http://snapshots.linaro.org/precise/hwpacks/lt-panda-x11-base/270/hwpack_linaro-lt-panda-x11-base_20120826-270_armhf_supported.tar.gz the error described in comment #5 is gone, but the recursive locking warning still occurs.

Here the recursive locking warning is not necessarily a problem. It comes from the SGX kernel module and only appears when the kernel is built with CONFIG_DEBUG_SPINLOCK=y.
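As a user-space analogy for what this kind of warning is about (a Python sketch, not the SGX driver code, and not how the kernel checks it): a non-reentrant lock cannot be taken a second time by the thread that already holds it, while a reentrant lock can. The helper name `try_nested_acquire` is hypothetical, and the second acquire is non-blocking so the demo never hangs.

```python
import threading


def try_nested_acquire(lock):
    """Hold `lock`, then try to take it again from the same thread.

    Returns True if the nested acquire succeeds. The nested attempt is
    non-blocking, so a non-reentrant lock reports failure instead of
    deadlocking the demo.
    """
    with lock:
        nested = lock.acquire(blocking=False)
        if nested:
            lock.release()  # undo the nested acquire
        return nested


plain_result = try_nested_acquire(threading.Lock())
reentrant_result = try_nested_acquire(threading.RLock())

print("plain lock, nested acquire succeeded:", plain_result)
print("reentrant lock, nested acquire succeeded:", reentrant_result)
```

With a blocking acquire on the plain lock, the thread would wait on itself forever, which is the scenario such a warning points at. Whether it can really happen in the driver depends on the actual code path.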

Changed in linaro-ubuntu:
status: New → Confirmed
importance: Undecided → High
Botao (botao-sun) wrote :

Confirmed. For the Panda 4460, observed the deadlock warning on pre-built image:

http://releases.linaro.org/12.08/ubuntu/leb-panda/lt-panda-x11-base_20120826-270-ubuntu-desktop.img.gz

However, this does not hang the board.

[ 21.546112] =============================================
[ 21.551818] [ INFO: possible recursive locking detected ]
[ 21.557525] 3.4.0-2-linaro-lt-omap #2~ci+120825182553-Ubuntu Tainted: G C O
[ 21.565612] ---------------------------------------------
[ 21.571258] Xorg/2058 is trying to acquire lock:
[ 21.576110] (psPVRSRVMutex){+.+.+.}, at: [<bf9ea3ff>] LinuxLockMutex+0xe/0x10 [omapdrm_pvr]
[ 21.585021]
[ 21.585021] but task is already holding lock:
[ 21.591156] (psPVRSRVMutex){+.+.+.}, at: [<bf9ea3ff>] LinuxLockMutex+0xe/0x10 [omapdrm_pvr]
[ 21.600036]
[ 21.600036] other info that might help us debug this:
[ 21.606903] Possible unsafe locking scenario:
[ 21.606903]
[ 21.613128] CPU0
[ 21.615692] ----
[ 21.618225] lock(psPVRSRVMutex);
[ 21.621826] lock(psPVRSRVMutex);
[ 21.625396]
[ 21.625396] *** DEADLOCK ***
[ 21.625396]
[ 21.631622] May be due to missing lock nesting notation
[ 21.631622]
[ 21.638732] 2 locks held by Xorg/2058:
[ 21.642669] #0: (drm_global_mutex){+.+.+.}, at: [<c02a9e21>] drm_release+0x21/0x288
[ 21.650909] #1: (psPVRSRVMutex){+.+.+.}, at: [<bf9ea3ff>] LinuxLockMutex+0xe/0x10 [omapdrm_pvr]
[ 21.660278]
[ 21.660278] stack backtrace:
[ 21.664886] [<c0011011>] (unwind_backtrace+0x1/0x90) from [<c005b3bd>] (print_deadlock_bug+0x81/0xac)
[ 21.674560] [<c005b3bd>] (print_deadlock_bug+0x81/0xac) from [<c005c409>] (validate_chain.isra.26+0x341/0x360)
[ 21.685058] [<c005c409>] (validate_chain.isra.26+0x341/0x360) from [<c005cd2f>] (__lock_acquire+0x4d5/0x54e)
[ 21.695373] [<c005cd2f>] (__lock_acquire+0x4d5/0x54e) from [<c005d19f>] (lock_acquire+0xbb/0xd8)
[ 21.704620] [<c005d19f>] (lock_acquire+0xbb/0xd8) from [<c046804f>] (mutex_lock_nested+0x37/0x250)
[ 21.714050] [<c046804f>] (mutex_lock_nested+0x37/0x250) from [<bf9ea3ff>] (LinuxLockMutex+0xe/0x10 [omapdrm_pvr])
[ 21.724853] [<bf9ea3ff>] (LinuxLockMutex+0xe/0x10 [omapdrm_pvr]) from [<bf9e8ec9>] (LinuxMMapPerProcessDisconnect+0x16/0x4e [omapdrm_pvr])
[ 21.737915] [<bf9e8ec9>] (LinuxMMapPerProcessDisconnect+0x16/0x4e [omapdrm_pvr]) from [<bf9ea6f3>] (OSPerProcessPrivateDataDeInit+0x10/0x26 [omapdrm_pvr])
[ 21.752441] [<bf9ea6f3>] (OSPerProcessPrivateDataDeInit+0x10/0x26 [omapdrm_pvr]) from [<bf9eeff3>] (PVRSRVDissociateMemFromResmanKM+0x58/0x76 [omapdrm_pvr])
[ 21.767181] [<bf9eeff3>] (PVRSRVDissociateMemFromResmanKM+0x58/0x76 [omapdrm_pvr]) from [<bf9ef113>] (PVRSRVPerProcessDataDisconnect+0x2a/0x3c [omapdrm_pvr])
[ 21.781982] [<bf9ef113>] (PVRSRVPerProcessDataDisconnect+0x2a/0x3c [omapdrm_pvr]) from [<bf9f021f>] (PVRSRVProcessDisconnect+0xc/0xe [omapdrm_pvr])
[ 21.795867] [<bf9f021f>] (PVRSRVProcessDisconnect+0xc/0xe [omapdrm_pvr]) from [<bf9e9195>] (PVRSRVRelease+0x5c/0x8c [omapdrm_pvr])
[ 21.808227] [<bf9e9195>] (PVRSRVRelease+0x5c/0x8c [omapdrm_pvr]) from [<bf9f901f>] (PVRSRVDrmRelease+0x1a/0x2c [omapdrm_pvr])
[ 21.820098] [<bf9f901f>] (PVRSRV...
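To pull the key facts out of a report like the one above (which task, which lock, and where it was taken), a small parser can be used when triaging many boot logs. This is a minimal Python sketch that assumes the log format shown in this comment; `TASK_RE`, `LOCK_RE`, and `summarize_lockdep` are hypothetical names, and `SAMPLE` is a shortened copy of the lines above.

```python
import re

# Shortened copy of the report above.
SAMPLE = """\
[ 21.551818] [ INFO: possible recursive locking detected ]
[ 21.571258] Xorg/2058 is trying to acquire lock:
[ 21.576110] (psPVRSRVMutex){+.+.+.}, at: [<bf9ea3ff>] LinuxLockMutex+0xe/0x10 [omapdrm_pvr]
[ 21.585021] but task is already holding lock:
[ 21.591156] (psPVRSRVMutex){+.+.+.}, at: [<bf9ea3ff>] LinuxLockMutex+0xe/0x10 [omapdrm_pvr]
"""

# "<task>/<pid> is trying to acquire lock:"
TASK_RE = re.compile(r"\]\s+(\S+)/(\d+) is trying to acquire lock:")
# "(<lock name>){<state>}, at: [<address>] <symbol+offset>"
LOCK_RE = re.compile(r"\]\s+\((\w+)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] (\S+)")


def summarize_lockdep(text):
    """Extract task, pid, lock names and acquisition sites from a report."""
    task = TASK_RE.search(text)
    locks = LOCK_RE.findall(text)
    names = [name for name, _ in locks]
    return {
        "task": task.group(1) if task else None,
        "pid": int(task.group(2)) if task else None,
        "locks": names,
        "sites": [site for _, site in locks],
        # True when every lock line names the same lock.
        "same_lock": len(names) > 1 and len(set(names)) == 1,
    }


summary = summarize_lockdep(SAMPLE)
print(summary)
```

In this sample, the requested lock and the held lock carry the same name and the same acquisition site, which matches the "possible recursive locking" heading of the report.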


Botao (botao-sun) wrote :

For Panda 4430 board with Linaro ubuntu pre-built image:

http://releases.linaro.org/12.08/ubuntu/leb-panda/lt-panda-x11-base_20120826-270-ubuntu-desktop.img.gz

Besides the deadlock error, the issue mentioned in comment #5 is also observed:

omap_hwmod: aess: _wait_target_disable failed

However, this error does not occur on the Panda 4460 board.

Paul Larson (pwlars) wrote :

Confirmed with Botao that he no longer sees hangs related to this, but the error still occurs. Revised title and severity.

summary: - Panda frequent hangs and crashes: possible recursive locking detected
+ Panda: possible recursive locking detected
Changed in linaro-ubuntu:
importance: High → Low
Amit Khare (amit-khare) wrote :
Botao (botao-sun) wrote :

For Panda 4460 board, observed on Linaro ubuntu pre-built image:

http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/279/lt-panda-x11-base-precise_ubuntu-desktop_20120830-279.img.gz

Please refer to the attachment for the full boot log.

Botao (botao-sun) wrote :

For the Panda 4430 board, the "DEADLOCK" error no longer appears. Observed on Linaro pre-built image:

http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/299/lt-panda-x11-base-precise_ubuntu-desktop_20120909-299.img.gz

However, there are 2 major errors:

1. kernel NULL pointer dereference;

2. omap_hwmod: aess: _wait_target_disable failed.

Botao (botao-sun) wrote :

Here is the detailed log output for comment #15.

Botao (botao-sun) wrote :

The "DEADLOCK" error shows up again, along with another error:

omap_hwmod: aess: _wait_target_disable failed

This issue is observed on Linaro ubuntu pre-built image:

http://snapshots.linaro.org/precise/pre-built/lt-panda-x11-base/312/lt-panda-x11-base-precise_ubuntu-desktop_20120916-312.img.gz

Please refer to the attachment for the boot log.

Botao (botao-sun)
description: updated
Soumya Basak (soumya-basak) wrote :
Botao (botao-sun) wrote :

This issue is no longer observed on the latest Ubuntu raring image. The status of this bug will be changed to "Fix Released".

Changed in linaro-ubuntu:
status: Confirmed → Fix Released