Comment 42 for bug 1863162

Xujing99 (xujing99) wrote:

(In reply to Szabolcs Nagy from comment #38)
> (In reply to xujing from comment #35)
> > (In reply to <email address hidden> from comment #31)
> > > commit 1387ad6225c2222f027790e3f460e31aa5dd2c54
> > > Author: Szabolcs Nagy <email address hidden>
> > > Date: Wed Dec 30 19:19:37 2020 +0000
> > >
> > > elf: Fix data races in pthread_create and TLS access [BZ #19329]
> > >
> > This patch takes dl_load_lock in _dl_allocate_tls_init. Is there a problem
> > when dlopen loads a dynamic library that calls pthread_create? I think it
> > will deadlock on dl_load_lock.
>
> the real bug is that ctors are run with the dlopen lock held.
> that can cause deadlocks anyway (a ctor can create threads
> and that thread can call dlopen). this is bug 15686 which is not
> easy to fix, but that's the right solution. (in general, running
> user callbacks while libc internal locks are held is wrong.)
>
> that bug is now more exposed because the lock is also taken
> at _dl_allocate_tls_init during thread creation. however i
> expect that to be called in the parent thread only, so there
> should be no deadlock when ctor calls pthread_create, only
> when the child thread calls it again (which i considered rare).
>
> if you have example code that you think should work but now
> deadlocks, then please report it.

I'm sorry, I misled you. I think there is an ABBA deadlock issue in some scenarios.

Suppose I have a C++ dynamic library (libA.so) that contains a global object. When dlopen loads libA.so, the global object calls its post-constructor during initialization, and that code acquires the object's own lock (A_lock). Assume two threads execute the following sequences:
    Thread1: dlopen(libA.so) => holds dl_load_lock => loads libA.so => initializes
             the global object from libA.so => waits for A_lock
    Thread2: my own code holds A_lock => pthread_create => _dl_allocate_tls_init
             => waits for dl_load_lock
In this case an ABBA deadlock occurs: each thread holds the lock the other one needs. Is this a bug?
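
The lock ordering above can be sketched as a small simulation. This is only an illustration under stated assumptions: fake_dl_load_lock and fake_a_lock are ordinary std::timed_mutex objects standing in for glibc's dl_load_lock and my A_lock, simulate_abba is a hypothetical helper, and no real dlopen or pthread_create call is made. A timed lock attempt is used in place of a blocking one so that the program reports the stall instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Outcome of the simulation: did either thread manage to take its second lock?
struct AbbaResult {
    bool thread1_got_a_lock;     // "dlopen" path acquired the A_lock stand-in
    bool thread2_got_load_lock;  // "pthread_create" path acquired the dl_load_lock stand-in
};

// Hypothetical helper, not part of glibc. Both mutexes are plain stand-ins.
inline AbbaResult simulate_abba(std::chrono::milliseconds timeout =
                                    std::chrono::milliseconds(100)) {
    std::timed_mutex fake_dl_load_lock;  // stand-in for dl_load_lock
    std::timed_mutex fake_a_lock;        // stand-in for A_lock
    std::atomic<int> holding_first{0};   // threads that hold their first lock
    std::atomic<int> tried_second{0};    // threads that finished their second attempt
    AbbaResult result{false, false};

    auto run = [&](std::timed_mutex& first, std::timed_mutex& second,
                   bool& got_second) {
        first.lock();
        holding_first.fetch_add(1);
        // Wait until both threads hold their first lock (the ABBA precondition).
        while (holding_first.load() < 2) std::this_thread::yield();

        // A blocking lock() here would never return; the timeout makes the
        // stall observable.
        got_second = second.try_lock_for(timeout);
        if (got_second) second.unlock();

        // Keep the first lock until the other thread has also given up, so
        // neither attempt can succeed through a late release.
        tried_second.fetch_add(1);
        while (tried_second.load() < 2) std::this_thread::yield();
        first.unlock();
    };

    // Thread1: holds the load lock (dlopen), then wants A_lock (global object init).
    std::thread t1([&] {
        run(fake_dl_load_lock, fake_a_lock, result.thread1_got_a_lock);
    });
    // Thread2: holds A_lock (my code), then wants the load lock (_dl_allocate_tls_init).
    std::thread t2([&] {
        run(fake_a_lock, fake_dl_load_lock, result.thread2_got_load_lock);
    });
    t1.join();
    t2.join();

    assert(tried_second.load() == 2);  // both threads completed their attempt
    return result;
}
```

Calling simulate_abba() from a program built with thread support (for example -pthread with GCC) should return with both fields false, which corresponds to the two blocked pthread_mutex_lock frames in the backtraces. With blocking locks in place of try_lock_for, neither thread would make progress.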

My stack looks like this:
Thread 1 (LWP 136013):
#0 0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0
#1 0x00007f57a107e4d1 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0
#2 (frame waiting to acquire A_lock)
...
#6 0x00007f5781c1bb8b in LogProcess::Init (strProcName=..., nProcHandle=nProcHandle@entry=0) at ./service/biz_frame/code/server/src/logging/logprocess.cpp:107
...
#20 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6
#21 0x00007f57a786442b in ?? () from /lib64/ld-linux-x86-64.so.2
#22 0x00007f57a3de2296 in ?? () from /usr/lib64/libdl.so.2
#23 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6
#24 0x00007f57a0fef2af in _dl_catch_error () from /usr/lib64/libc.so.6
#25 0x00007f57a3de2985 in ?? () from /usr/lib64/libdl.so.2
#26 0x00007f57a3de2351 in dlopen () from /usr/lib64/libdl.so.2
...
...
#38 0x00007f57a0fb3520 in clone () from /usr/lib64/libc.so.6

Thread 2 (LWP 134627):
#0 0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0
#1 0x00007f57a107e580 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0
#2 0x00007f57a7863835 in _dl_allocate_tls_init () from /lib64/ld-linux-x86-64.so.2
#3 0x00007f57a107cb7c in pthread_create () from /usr/lib64/libpthread.so.0
...
#10 Stack holding A_lock
...
#14 0x0000561689e0d579 in main ()