evdev mutex inconsistent lock state

Bug #1037565 reported by Tixy (Jon Medhurst)
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Linaro Android
Fix Released
High
Tixy (Jon Medhurst)

Bug Description

I've been fairly regularly seeing "inconsistent lock state" backtraces when running Android on vexpress TC2. See attached log.

This looks like a problem with how the evdev driver uses a mutex.

Revision history for this message
Tixy (Jon Medhurst) (tixy) wrote :
Changed in linaro-landing-team-arm:
assignee: nobody → Tixy (Jon Medhurst) (tixy)
importance: Undecided → High
status: New → Triaged
Changed in linaro-landing-team-arm:
status: Triaged → In Progress
Zach Pfeffer (pfefferz)
Changed in linaro-android:
milestone: none → 12.09
Revision history for this message
Tixy (Jon Medhurst) (tixy) wrote :

This bug is caused by the Android patch called " Input: evdev - Add ioctl to block suspend while event queue is not empty."

This is broken because evdev_enable_suspend_block() has

 spin_lock_irq(&client->buffer_lock);
 wake_lock_init(&client->wake_lock, WAKE_LOCK_SUSPEND, client->name);
 [...]
 spin_unlock_irq(&client->buffer_lock);

which means that wake_lock_init() is called with interrupts disabled. However wake_lock_init() calls wakeup_source_add() which itself uses spinlocks like...

 spin_lock_irq(&events_lock);
 [...]
 spin_unlock_irq(&events_lock);

which assumes that it is called with interrupts enabled. As a consequence, the spin_unlock_irq() above unconditionally enables irqs which means the evdev code which called it is now holding the client->buffer_lock with interrupts enabled, opening up the opportunity for deadlock to occur if an interrupt happens for an event which then calls evdev_pass_event() and then waits on the same spinlock.

Zach Pfeffer (pfefferz)
Changed in linaro-android:
importance: Undecided → High
assignee: nobody → Tixy (Jon Medhurst) (tixy)
Revision history for this message
Botao (botao-sun) wrote :

For vexpress A9 board, observed on Linaro Jelly Bean image:

https://android-build.linaro.org/builds/~linaro-android/vexpress-jb-gcc47-armlt-tracking-open/#build=30

Please refer to attachment to get full boot log.

Revision history for this message
John Stultz (jstultz) wrote :

Per Tixy's debugging, I've generated a patch to use irqsave/restore when accessing the events_lock.
This seems to me to be the right fix, but I've sent it to the PM maintainer for comment before wider submission.

If this is right, we will need to make sure this gets backported to AOSPs common.git/android-3.4 and submitted to Gerrit.

Anmar Oueja (anmar)
Changed in linaro-landing-team-arm:
milestone: none → 2012.09
Revision history for this message
John Stultz (jstultz) wrote :

Patch sent to lkml and pushed out via linaro-android-3.6-jstultz-rebase

Once the patch is merged upstream, I'll send it on to AOSP.

Revision history for this message
Tixy (Jon Medhurst) (tixy) wrote :
Changed in linaro-landing-team-arm:
status: In Progress → Fix Committed
Revision history for this message
Amit Khare (amit-khare) wrote :

Observed with android vexpress JB build
https://android-build.linaro.org/builds/~linaro-android/vexpress-jb-gcc47-armlt-tracking-open/#build=37

LOGS:

[ 263.224205] CPU0
[ 263.231518] ----
[ 263.238827] lock(&(&client->buffer_lock)->rlock);
[ 263.253461] <Interrupt>
[ 263.261294] lock(&(&client->buffer_lock)->rlock);
[ 263.276449]
[ 263.276449] *** DEADLOCK ***
[ 263.276449]
[ 263.294192] 2 locks held by InputReader/2014:
[ 263.307237] #0: (&evdev->mutex){+.+...}, at: [<c03aadc4>] evdev_ioctl+0x2c/0x91c
[ 263.330005] #1: (&(&client->buffer_lock)->rlock){?.....}, at: [<c03aafe4>] evdev_ioctl+0x24c/0x91c
[ 263.357450]
[ 263.357450] stack backtrace:
[ 263.370533] [<c0015058>] (unwind_backtrace+0x0/0xf8) from [<c04fce78>] (print_usage_bug+0x248/0x2b0)
[ 263.397929] [<c04fce78>] (print_usage_bug+0x248/0x2b0) from [<c0071c70>] (mark_lock+0x2a0/0x67c)
[ 263.424276] [<c0071c70>] (mark_lock+0x2a0/0x67c) from [<c007468c>] (mark_held_locks+0x5c/0x124)
[ 263.450361] [<c007468c>] (mark_held_locks+0x5c/0x124) from [<c00747fc>] (trace_hardirqs_on_caller+0xa8/0x190)
[ 263.480096] [<c00747fc>] (trace_hardirqs_on_caller+0xa8/0x190) from [<c05052e4>] (_raw_spin_unlock_irq+0x24/0x2c)
[ 263.510875] [<c05052e4>] (_raw_spin_unlock_irq+0x24/0x2c) from [<c03aaff8>] (evdev_ioctl+0x260/0x91c)
[ 263.538529] [<c03aaff8>] (evdev_ioctl+0x260/0x91c) from [<c00f4678>] (do_vfs_ioctl+0x80/0x538)
[ 263.564352] [<c00f4678>] (do_vfs_ioctl+0x80/0x538) from [<c00f4b64>] (sys_ioctl+0x34/0x60)
[ 263.589133] [<c00f4b64>] (sys_ioctl+0x34/0x60) from [<c000e000>] (ret_fast_syscall+0x0/0x3c)
[ 362.131513] init: sys_prop: permission denied uid:1003 name:service.bootanim.exit
[ 476.986679] mmcblk0: error -5 transferring data, sector 617064, nr 24, cmd response 0x900, card status 0xb00
[ 477.016346] mmcblk0: retrying using single block transfer
[ 484.922577] mmcblk0: error -5 transferring data, sector 1979938, nr 30, cmd response 0x900, card status 0xc00
[ 484.952583] mmcblk0: retrying using single block transfer
[ 486.646280] mmcblk0: error -5 transferring data, sector 624360, nr 8, cmd response 0x900, card status 0xb00
[ 486.675675] mmcblk0: retrying using single block transfer

Revision history for this message
John Stultz (jstultz) wrote :

Amit: Has build37 included the fix above?

Revision history for this message
Amit Khare (amit-khare) wrote :

John: Looking at the logs it seems the fix is not included with this build. I see similar out put on build 37 as in comment #3 by Botao

Revision history for this message
John Stultz (jstultz) wrote :

Colin Cross just pulled this fix into AOSP for Android's offical 3.4 tree.

Revision history for this message
Amit Khare (amit-khare) wrote :

Observed with android vexpress JB build
https://android-build.linaro.org/builds/~linaro-android/vexpress-jb-gcc47-armlt-tracking-open/#build=42

I dont see the DEADLOCK issue now but lot of sdcard related issues are there.

LOGS:

[ 21.851704] mmcblk0: retrying using single block transfer
[ 22.574259] init: cannot find '/system/bin/rild', disabling 'ril-daemon'
[ 22.686519] usb 1-1.2: new full-speed USB device number 47 using isp1760
[ 22.706855] mmcblk0: error -5 transferring data, sector 782368, nr 24, cmd response 0x900, card status 0xb00
[ 22.736382] mmcblk0: retrying using single block transfer
[ 22.786593] usb 1-1.2: device descriptor read/64, error -32
[ 22.986556] usb 1-1.2: device descriptor read/64, error -32
[ 23.003532] mmcblk0: error -5 transferring data, sector 754920, nr 24, cmd response 0x900, card status 0xb00
[ 23.033079] mmcblk0: retrying using single block transfer
[ 23.186641] usb 1-1.2: new full-speed USB device number 48 using isp1760
[ 23.207079] mmcblk0: error -5 transferring data, sector 751520, nr 24, cmd response 0x900, card status 0xb00
[ 23.236649] mmcblk0: retrying using single block transfer
[ 23.326558] usb 1-1.2: device descriptor read/64, error -32

Changed in linaro-android:
status: New → Fix Committed
Revision history for this message
Botao (botao-sun) wrote :

@Amit, those SD card errors are expected. This bug is used for tracking that "DEADLOCK" issue and now it has been solved.

Changed in linaro-android:
status: Fix Committed → Fix Released
Changed in linaro-landing-team-arm:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.