Comment 11 for bug 1561621

Goffredo Baroncelli (kreijack) wrote:

I encountered this bug because mosh stopped working after Debian updated libc to 2.22 [1][2]. After a few tests I discovered that the problem is caused by a particular combination of linker switches and libraries (see below).

The minimal test case to reproduce the problem is the following:

$ cat boom.c
extern void dofork();

int main() {
 dofork();
}

$ cat dofork.c
#include <unistd.h>

void dofork() {
 fork();
}

$ gcc -fPIC -c dofork.c
$ gcc -shared -Wl,-z,now -o libdofork.so dofork.o
$ gcc -o boom boom.c -lpthread -L$(pwd) -ldofork
$ LD_LIBRARY_PATH=$(pwd) ./boom
Segmentation fault

Expected result: the program should not crash
Actual result: the program crashes :-)

The fatal combination seems to be "-lpthread", "-Wl,-z,now", a call to fork(), and glibc 2.22. The crash happens at the call to fork().
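To tell a crash at the fork() call apart from a failure elsewhere, dofork.c can be swapped for a variant that checks the result. This is only a minimal sketch: the name dofork_checked is hypothetical and not part of the original test case, and boom.c would need to call it instead of dofork().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variant of dofork(): returns 0 if fork() succeeded and the
   child exited cleanly, -1 otherwise. */
int dofork_checked(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork() itself reported an error */
    if (pid == 0)
        _exit(0);                  /* child: leave at once, skip atexit handlers */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;                 /* could not collect the child */
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Built with the same switches as above, an affected system would still be expected to segfault at the fork() call before this function returns; on an unaffected system it returns 0.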

The bug showed up in mosh because:
- mosh is linked against libprotobuf and libutempter
- mosh is linked with the "-Wl,-z,now" switch
- libprotobuf, via pkg-config, adds the -lpthread switch
- libutempter calls fork().
Together, these create the conditions for the bug.

Looking at the commits between 2.21 and 2.22 that touch nptl/pt-fork.c, I found the following one:

   commit beff1d132c16aedd87a3f1bc7b572c8e69819015
   Author: Roland McGrath <email address hidden>
   Date: Fri Feb 6 10:53:07 2015 -0800

        Clean up NPTL fork to be compat-only

With this commit reverted, the problem seems to disappear.

Florian Weimer investigated further:

(gdb) break dofork
Breakpoint 1 at 0x4005b0
(gdb) r
Starting program: /home/fweimer/boom
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, 0x00007ffff79bd6d4 in dofork () from /home/fweimer/libdofork.so
(gdb) disassemble
Dump of assembler code for function dofork:
   0x00007ffff79bd6d0 <+0>: push %rbp
   0x00007ffff79bd6d1 <+1>: mov %rsp,%rbp
=> 0x00007ffff79bd6d4 <+4>: callq 0x7ffff79bd5c0 <fork@plt>
   0x00007ffff79bd6d9 <+9>: nop
   0x00007ffff79bd6da <+10>: pop %rbp
   0x00007ffff79bd6db <+11>: retq
End of assembler dump.
(gdb) si
0x00007ffff79bd5c0 in fork@plt () from /home/fweimer/libdofork.so
(gdb) disassemble
Dump of assembler code for function fork@plt:
=> 0x00007ffff79bd5c0 <+0>: jmpq *0x200a0a(%rip) # 0x7ffff7bbdfd0 <email address hidden>
   0x00007ffff79bd5c6 <+6>: pushq $0x2
   0x00007ffff79bd5cb <+11>: jmpq 0x7ffff79bd590
End of assembler dump.
(gdb) print *(void **)0x7ffff7bbdfd0
$1 = (void *) 0x0
(gdb)

Commit beff1d132c16aedd87a3f1bc7b572c8e69819015
assumes that __libc_fork has been relocated before the IFUNC resolver
for the libpthread fork definition runs, which is not always true.

Florian
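
The ordering problem described above can be illustrated with a plain-C simulation. This is only a sketch: it does not use real IFUNCs or glibc internals, and every name in it (libc_fork_slot, resolve_fork, and so on) is hypothetical. It assumes the resolver simply hands back whatever is currently stored in the slot for the libc implementation, which stays NULL until relocation has filled it in.

```c
#include <stddef.h>

typedef int (*fork_fn)(void);

/* Stands in for the not-yet-relocated reference to __libc_fork:
   NULL until the "dynamic linker" has processed it. */
static fork_fn libc_fork_slot;

/* Stands in for the real implementation inside libc. */
static int fake_libc_fork(void) { return 42; }

/* Simulated relocation step: fill in the slot. */
static void relocate_libc_fork(void) { libc_fork_slot = fake_libc_fork; }

/* Simulated resolver: returns whatever the slot holds right now. */
static fork_fn resolve_fork(void) { return libc_fork_slot; }

/* Resolver runs before relocation: the caller's entry ends up NULL. */
fork_fn bind_resolver_first(void) {
    libc_fork_slot = NULL;
    fork_fn got_entry = resolve_fork();
    relocate_libc_fork();
    return got_entry;
}

/* Relocation runs before the resolver: the caller's entry is usable. */
fork_fn bind_relocation_first(void) {
    libc_fork_slot = NULL;
    relocate_libc_fork();
    return resolve_fork();
}
```

In this model bind_resolver_first() yields NULL, which corresponds to the 0x0 found in the slot that fork@plt jumps through in the gdb session, while bind_relocation_first() yields a callable pointer.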

----------------------------------
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=817929
[2] https://github.com/mobile-shell/mosh/issues/727