[ARM] pthread_cond_wait hang

Bug #884676 reported by Dr. David Alan Gilbert
This bug affects 1 person
Affects                 Status  Importance  Assigned to  Milestone
linux-linaro (Ubuntu)   New     Undecided   Unassigned

Bug Description

The attached test program hangs on ARM, but works fine on x86.
It's based on the thread_init code in memcached that exhibits the same symptoms.
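
For readers without the attachment, the handshake being exercised can be sketched roughly as follows. This is not the attached test program; it is a minimal reconstruction of the pattern in memcached's thread_init (workers bump a counter and signal, the starting thread waits on the condition variable until all have reported in). The names init_lock, init_cond, init_count and worker are illustrative assumptions.

```c
/* Sketch of a thread_init-style startup handshake.
   NOT the attached test program; all names are illustrative. */
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  init_cond = PTHREAD_COND_INITIALIZER;
static int init_count = 0;

/* Each worker reports that it has started, then exits. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&init_lock);
    init_count++;
    pthread_cond_signal(&init_cond);
    pthread_mutex_unlock(&init_lock);
    return NULL;
}

/* Start nthreads workers and block until all started workers have
   reported in. Returns 0 on success, -1 on allocation or
   thread-creation failure. */
int thread_init(int nthreads)
{
    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    int i, started = 0, rc = 0;

    if (tids == NULL)
        return -1;

    pthread_mutex_lock(&init_lock);
    init_count = 0;
    pthread_mutex_unlock(&init_lock);

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* The reported hang is inside a pthread_cond_wait() call like this one. */
    pthread_mutex_lock(&init_lock);
    while (init_count < started)
        pthread_cond_wait(&init_cond, &init_lock);
    pthread_mutex_unlock(&init_lock);

    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return rc;
}
```

Calling thread_init(5) from main would match the five worker threads visible in the gdb session; on an unaffected kernel the call should always return.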

Tested on a Panda-A1 (dual-core Cortex-A9) in an Oneiric chroot:
ii libc6 2.13-20ubuntu5 Embedded GNU C Library: Shared libraries

The main install is Linaro 11.09 (Natty-based filesystem); the kernel is:

Linux panda-01 3.0.0-1404-linaro-lt-omap #8~ppa~natty-Ubuntu SMP PREEMPT Wed Sep 28 17:16:15 UTC 2011 armv7l armv7l armv7l GNU/Linux

It fails roughly once in every 100 runs, so I run it in a loop:
for SEQ in `seq 1 200`; do echo $SEQ; ./pthreadtest; done

gdb shows it hanging inside pthread_cond_wait, trying to take the internal lock of the condition variable (__lll_lock_wait on 0x11088, the same address as cond):
gdb
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>.
(gdb) attach 2511
Attaching to process 2511
Reading symbols from /home/dg/pthreadtest...(no debugging symbols found)...done.
Reading symbols from /lib/arm-linux-gnueabi/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/arm-linux-gnueabi/libpthread-2.13.so...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x42c1c470 (LWP 2516)]
[New Thread 0x423bf470 (LWP 2515)]
[New Thread 0x41b3a470 (LWP 2514)]
[New Thread 0x412aa470 (LWP 2513)]
[New Thread 0x409fe470 (LWP 2512)]
done.
Loaded symbols for /lib/arm-linux-gnueabi/libpthread.so.0
Reading symbols from /lib/arm-linux-gnueabi/libc.so.6...Reading symbols from /usr/lib/debug/lib/arm-linux-gnueabi/libc-2.13.so...done.
done.
Loaded symbols for /lib/arm-linux-gnueabi/libc.so.6
Reading symbols from /lib/ld-linux.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.3
__libc_do_syscall ()
    at ../ports/sysdeps/unix/sysv/linux/arm/eabi/libc-do-syscall.S:46
46 ../ports/sysdeps/unix/sysv/linux/arm/eabi/libc-do-syscall.S: No such file or directory.
 in ../ports/sysdeps/unix/sysv/linux/arm/eabi/libc-do-syscall.S
(gdb) bt full
#0 __libc_do_syscall ()
    at ../ports/sysdeps/unix/sysv/linux/arm/eabi/libc-do-syscall.S:46
No locals.
#1 0x40081608 in __lll_lock_wait (futex=0x11088, private=0)
    at ../ports/sysdeps/unix/sysv/linux/arm/nptl/lowlevellock.c:47
        _a2tmp = 128
        _a2 = <optimized out>
        _nametmp = 240
        _a3tmp = 2
        _a3 = <optimized out>
        _a1 = <optimized out>
        _a4tmp = 0
        _a1tmp = 69768
        _a4 = <optimized out>
        _name = <optimized out>
        oldval = <optimized out>
#2 0x4007f58e in __pthread_cond_wait (cond=0x11088, mutex=0x11068)
    at pthread_cond_wait.c:159
        __futex = 0x11088
        futex_val = 1
        buffer = {__routine = 0x4007f261 <__condvar_cleanup>,
          __arg = 0xbe800bd8, __canceltype = 1119993776, __prev = 0x0}
        cbuffer = {oldtype = 0, cond = 0x11088, mutex = 0x11068, bc_seq = 0}
        err = <optimized out>
        pshared = 0
        val = <optimized out>
        seq = 0
#3 0x000087e4 in thread_init ()
No symbol table info available.
#4 0x00008694 in main ()
No symbol table info available.
(gdb) q
A debugging session is active.

 Inferior 1 [process 2511] will be detached.

Dr. David Alan Gilbert (davidgil-uk) wrote :
Changed in eglibc (Ubuntu):
assignee: nobody → Dr. David Alan Gilbert (davidgil-uk)
Dr. David Alan Gilbert (davidgil-uk) wrote :

Thanks to dmart for pointing to https://lkml.org/lkml/2011/9/28/534 as the likely culprit;
the test seems to work on 3.1.0-1401-linaro-lt-omap.

affects: eglibc (Ubuntu) → linux-linaro (Ubuntu)
Dr. David Alan Gilbert (davidgil-uk) wrote :

Given that this is a kernel issue, all this bug needs is for the test to be added somewhere.

Run it 5000 times in a loop; if every run completes, it's OK.

Dave
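
Because a failing run hangs rather than exits, an automated version of the loop needs a per-run timeout. The following is one possible harness, sketched under the assumption that the test binary is ./pthreadtest and exits with status 0 on success; run_once, run_repeated and the timeout value are hypothetical names and choices, not part of the original report.

```c
/* Sketch of a repeat-with-timeout harness; names are illustrative. */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` once. Returns 0 if it exits with status 0 within `secs`
   seconds, -1 otherwise (timeout, crash, non-zero exit, fork failure). */
int run_once(const char *path, unsigned secs)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* A pending alarm survives exec, so SIGALRM kills a hung child
           (assuming the child leaves SIGALRM at its default action). */
        alarm(secs);
        execlp(path, path, (char *)NULL);
        _exit(127); /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Run `path` up to `runs` times, stopping at the first failure.
   Returns the number of runs that completed; a result equal to
   `runs` means every run completed. */
int run_repeated(const char *path, int runs, unsigned secs)
{
    int i;

    for (i = 0; i < runs; i++)
        if (run_once(path, secs) != 0)
            break;
    return i;
}
```

With this sketch, the check described above would be run_repeated("./pthreadtest", 5000, 10) == 5000, where the 10-second limit is an arbitrary choice for a test that normally finishes almost immediately.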

Changed in linux-linaro (Ubuntu):
assignee: Dr. David Alan Gilbert (davidgil-uk) → nobody