Bug #1266492: "ld:i386 crashes with -static -fPIE -pie" (evolution-data-server package, Ubuntu Trusty 14.04)

Revision history for this message

In Sourceware.org Bugzilla #16159, Darryl L. Miles (darryl-miles) wrote on 2013-11-13:

#5

malloc_printerr() on error detection "free(): invalid next size (fast)" ends up calling into:

backtrace.c:init()
dl-libc.c:do_dlopen()
malloc.c:calloc()
malloc.c:malloc_printerr()

The malloc error reporting should report only the first error, not attempt to recursively report every error (we already knew the heap was corrupted at the outermost point, so any further work inside malloc is also likely to find corruption).

Full stack trace to follow.

The main problem is that the process does not abort() and die; it hangs in:

pthread_once.S:pthread_once()
backtrace.c:__backtrace()

I think this is due to a recursive lock attempt: the lock should be taken with trylock() the second time around, and the process should abort() immediately if that fails. As it stands, it appears to deadlock itself.
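
The "report only the first error" idea can be sketched in C as a re-entrancy guard. This is an illustration only, not glibc code: `begin_fatal_report` and `report_heap_error` are hypothetical names, and the sketch assumes a C11 compiler with `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Set by the first caller that starts reporting a fatal heap error. */
static atomic_flag report_in_progress = ATOMIC_FLAG_INIT;

/* Returns 1 for the first caller and 0 for every later (possibly nested)
   caller, without ever blocking. */
int begin_fatal_report(void)
{
    return !atomic_flag_test_and_set(&report_in_progress);
}

/* Hypothetical error path: report once, then abort. A nested error found
   while reporting skips the report and goes straight to abort(). */
void report_heap_error(const char *msg)
{
    assert(msg != NULL);
    if (begin_fatal_report())
        fprintf(stderr, "*** heap error: %s ***\n", msg);
    abort();
}
```

Because the guard never waits, a second error detected while the first is still being reported cannot deadlock; it can only abort.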

In Sourceware.org Bugzilla #16159, Darryl L. Miles (darryl-miles) wrote on 2013-11-13:

#6

(gdb) bt
#0 pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:95
#1 0x00007f8dfb540994 in __backtrace (array=<value optimized out>, size=64) at ../sysdeps/ia64/backtrace.c:85
#2 0x00007f8dfb4b280b in __libc_message (do_abort=2, fmt=0x7f8dfb599fc0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:178
#3 0x00007f8dfb4b8126 in malloc_printerr (action=3, str=0x7f8dfb5980f9 "malloc(): memory corruption", ptr=<value optimized out>) at malloc.c:6311
#4 0x00007f8dfb4bbba4 in _int_malloc (av=0x7f8dfb7d0e80, bytes=<value optimized out>) at malloc.c:4411
#5 0x00007f8dfb4bc5e6 in __libc_calloc (n=<value optimized out>, elem_size=<value optimized out>) at malloc.c:4075
#6 0x00007f8dfda14d1f in _dl_new_object (realname=0x247de20 "/lib64/libgcc_s.so.1", libname=0x7f8dfb596e3e "libgcc_s.so.1", type=2, loader=0x0, mode=-1879048191, nsid=0) at dl-object.c:77
#7 0x00007f8dfda111ae in _dl_map_object_from_fd (name=0x7f8dfb596e3e "libgcc_s.so.1", fd=6, fbp=0x7fffd2c1ace0, realname=0x247de20 "/lib64/libgcc_s.so.1", loader=0x0, l_type=2, mode=-1879048191, stack_endp=0x7fffd2c1b028, nsid=0)
at dl-load.c:975
#8 0x00007f8dfda1236a in _dl_map_object (loader=0x0, name=0x7f8dfb596e3e "libgcc_s.so.1", type=2, trace_mode=0, mode=<value optimized out>, nsid=<value optimized out>) at dl-load.c:2274
#9 0x00007f8dfda1ca34 in dl_open_worker (a=0x7fffd2c1b250) at dl-open.c:227
#10 0x00007f8dfda181a6 in _dl_catch_error (objname=0x7fffd2c1b2a0, errstring=0x7fffd2c1b298, mallocedp=0x7fffd2c1b2af, operate=0x7f8dfda1c910 <dl_open_worker>, args=0x7fffd2c1b250) at dl-error.c:178
#11 0x00007f8dfda1c4ea in _dl_open (file=0x7f8dfb596e3e "libgcc_s.so.1", mode=-2147483647, caller_dlopen=0x0, nsid=-2, argc=8, argv=<value optimized out>, env=0x7fffd2c30020) at dl-open.c:569
#12 0x00007f8dfb568340 in do_dlopen (ptr=<value optimized out>) at dl-libc.c:86
#13 0x00007f8dfda181a6 in _dl_catch_error (objname=0x7fffd2c1b460, errstring=0x7fffd2c1b458, mallocedp=0x7fffd2c1b46f, operate=0x7f8dfb568300 <do_dlopen>, args=0x7fffd2c1b440) at dl-error.c:178
#14 0x00007f8dfb568497 in dlerror_run (name=<value optimized out>, mode=<value optimized out>) at dl-libc.c:47
#15 __libc_dlopen_mode (name=<value optimized out>, mode=<value optimized out>) at dl-libc.c:160
#16 0x00007f8dfb540895 in init () at ../sysdeps/ia64/backtrace.c:41
#17 0x00007f8dfb7e1b23 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:104
#18 0x00007f8dfb540994 in __backtrace (array=<value optimized out>, size=64) at ../sysdeps/ia64/backtrace.c:85
#19 0x00007f8dfb4b280b in __libc_message (do_abort=2, fmt=0x7f8dfb599fc0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:178
#20 0x00007f8dfb4b8126 in malloc_printerr (action=3, str=0x7f8dfb59a2b8 "free(): invalid next size (fast)", ptr=<value optimized out>) at malloc.c:6311
#21 0x00007f8dfb4bac53 in _int_free (av=0x7f8dfb7d0e80, p=0x24d52c0, have_lock=0) at malloc.c:4811

In Sourceware.org Bugzilla #16159, Darryl L. Miles (darryl-miles) wrote on 2013-11-13:

#7

See also bug#956

In Sourceware.org Bugzilla #16159, Carlos-0 (carlos-0) wrote on 2013-11-13:

#8

*** Bug 956 has been marked as a duplicate of this bug. ***

In Sourceware.org Bugzilla #16159, Carlos-0 (carlos-0) wrote on 2013-11-13:

#9

This is going to be difficult to fix and invasive. I will do my best to explain why.

At the point of the failure we want to be able to print a backtrace. The only way to get a reliable backtrace is to use the unwinder provided by gcc via libgcc_s.so.1 (this may vary by machine). In order to get access to the unwinder we must dlopen that shared library. During the dlopen process we need to calloc enough structures to hook up the new shared library into the structures used by the dynamic linker.

One resolution to this problem is to ensure that malloc has a fall-back allocation scheme that is robust against failure and then during the malloc_printerr we flip an internal bit and switch to the temporary reserve allocations. We could also create a new internal API for using the temporary allocations and then dlopen could use that in the event that we are crashing and need to dlopen one last library (the unwinder on demand). That would prevent other threads from consuming the reserve allocations after malloc_printerr is entered by another thread.
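
A minimal sketch of the fall-back allocation idea, assuming a fixed static pool and a single flag that is flipped once a fatal error has been detected. All names here (`reserve_pool`, `enter_crash_mode`, `reserve_alloc`) and the pool size are hypothetical, and the sketch ignores the multi-thread concern raised above.

```c
#include <assert.h>
#include <stddef.h>

#define RESERVE_SIZE 4096   /* hypothetical size of the emergency pool */

static _Alignas(max_align_t) unsigned char reserve_pool[RESERVE_SIZE];
static size_t reserve_used;
static int crashing;        /* the "internal bit" flipped on a fatal error */

/* Flip the bit: from now on allocations may come from the reserve pool. */
void enter_crash_mode(void)
{
    crashing = 1;
}

/* Bump-allocate n bytes from the static pool. Never touches the normal
   (possibly corrupt) heap. Returns NULL before crash mode is entered or
   when the pool is exhausted. */
void *reserve_alloc(size_t n)
{
    const size_t align = _Alignof(max_align_t);
    assert(reserve_used <= RESERVE_SIZE);
    size_t remaining = RESERVE_SIZE - reserve_used;
    if (!crashing || n > remaining)
        return NULL;
    size_t rounded = (n + align - 1) / align * align;
    if (rounded > remaining)
        return NULL;
    void *p = reserve_pool + reserve_used;
    reserve_used += rounded;
    return p;
}
```

Memory handed out this way is never freed, which is acceptable only because the process is about to abort.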

This is a considerable amount of work and we aren't going to get to this issue until a core developer or someone with serious interest commits to fixing this. Therefore I'm moving this to SUSPENDED until we find the resources to fix the issue.

This issue should remain open, and new issues submitted about this bug should be marked as duplicates of this issue.

In Sourceware.org Bugzilla #16159, Neleai (neleai) wrote on 2013-11-13:

#10

On Wed, Nov 13, 2013 at 03:57:02AM +0000, carlos at redhat dot com wrote:
> One resolution to this problem is to ensure that malloc has a fall-back
> allocation scheme that is robust against failure and then during the
> malloc_printerr we flip an internal bit and switch to the temporary reserve
> allocations. We could also create a new internal API for using the temporary
> allocations and then dlopen could use that in the event that we are crashing
> and need to dlopen one last library (the unwinder on demand). That would
> prevent other threads from consuming the reserve allocations after
> malloc_printerr is entered by another thread.
>
> This is a considerable amount of work and we aren't going to get to this issue
> until a core developer or someone with serious interest commits to fixing this.
> Therefore I'm moving this to SUSPENDED until we find the resources to fix the
> issue.
>
Why not reuse a signal-safe malloc for dlopen?

In Sourceware.org Bugzilla #16159, Darryl L. Miles (darryl-miles) wrote on 2013-11-13:

#11

This fancy backtrace stuff is nice and all but... the process must die!

Can't pthread_once use a non-blocking lock?

Can the lock be a recursive type?

Can pthread_mutex_trylock() be used in this non-critical path? If the lock is already held, and if it is possible to check that it is held by our own thread ID, then we immediately abort the process (causing execution of the process to die, like it should). No backtrace is emitted, great!

How do I stop this fancy backtrace stuff from working ? I want to setup an environment variable to turn it off as a workaround ?

How do I make this fancy backtrace stuff work, by preloading the dlopen() stuff it might need, during initialization of malloc() ? I want to setup an environment variable for that too.

There is no need to actually fix the bug, you are over thinking the issue. But this fancy stuff needs to be turned off or preloaded, before the process gets into an undefined state (due to memory bug).

In Sourceware.org Bugzilla #16159, Darryl L. Miles (darryl-miles) wrote on 2013-11-13:

#12

Another idea: do not backtrace() every malloc() error, only the first one (the outermost one).

But right now the process deadlocks itself, on what looks to be a non-recursive mutex, trying to do a fancy backtrace on every malloc() problem found.

The process must die.

In Sourceware.org Bugzilla #16159, Neleai (neleai) wrote on 2013-11-13:

#13

On Wed, Nov 13, 2013 at 01:00:06PM +0000, darryl.miles at darrylmiles dot org wrote:
> How do I stop this fancy backtrace stuff from working ? I want to setup an
> environment variable to turn it off as a workaround ?
>
> How do I make this fancy backtrace stuff work, by preloading the dlopen() stuff
> it might need, during initialization of malloc() ? I want to setup an
> environment variable for that too.
>

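As a quick workaround you can add the following code to your application, or preload it:

#include <execinfo.h>

/* Runs before main(): calling backtrace() once makes glibc load
   libgcc_s.so.1 while the heap is still intact. */
static void __attribute__ ((constructor))
init_backtrace (void)
{
  void *bt[10];
  backtrace (bt, 10);
}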

In Sourceware.org Bugzilla #16159, Bugdal (bugdal) wrote on 2013-11-13:

#14

Carlos, this is yet another reason why dlopen'ing libgcc_s is simply the wrong thing to do, and libgcc_eh should be static-linked into libc. (The other big reason is the possibility of pthread_cancel aborting the program.) At one time in the distant past, it was necessary for there to only be one copy of this code (and its data) in the whole program; otherwise, exception propagation (or backtracing) across DSOs would not work reliably. But modern unwinding code uses dl_iterate_phdr and works fine even if multiple copies of the code are present in the program.

Fixing this error in the way I describe will greatly simplify glibc and improve its reliability.

In Sourceware.org Bugzilla #16159, Carlos-0 (carlos-0) wrote on 2013-11-13:

#15

(In reply to Rich Felker from comment #9)
> Carlos, this is yet another reason why dlopen'ing libgcc_s is simply the
> wrong thing to do, and libgcc_eh should be static-linked into libc. (The
> other big reason is the possibility of pthread_cancel aborting the program.)
> At one time in the distant past, it was necessary for there to only be one
> copy of this code (and its data) in the whole program; otherwise, exception
> propagation (or backtracing) across DSOs would not work reliably. But modern
> unwinding code uses dl_iterate_phdr and works fine even if multiple copies
> of the code are present in the program.
>
> Fixing this error in the way I describe will greatly simplify glibc and
> improve its reliability.

That sounds like a good idea to me; dlopening libgcc_s.so.1 has always seemed like a terrible idea to me as well. We just need the resources to do the rewrite and fix up the linking to use libgcc_eh. I will leave this SUSPENDED until we find someone to clean this up.

In Sourceware.org Bugzilla #16159, Joseph-codesourcery (joseph-codesourcery) wrote on 2013-11-13:

#16

On Wed, 13 Nov 2013, bugdal at aerifal dot cx wrote:

> Carlos, this is yet another reason why dlopen'ing libgcc_s is simply the wrong
> thing to do, and libgcc_eh should be static-linked into libc. (The other big

Static-linking libgcc_eh into any glibc library is a bad idea because it
complicates bootstrapping: it means glibc built with an initial bootstrap
compiler (which was built without glibc headers available, implying full
EH functionality is not present in libgcc) is not identical to glibc built
with a compiler built using full shared glibc and headers. (It's *also* a
bad idea because new compilers can start using new DWARF unwind opcodes
that an old copy of the unwind code won't understand, causing problems
using new programs with old glibc.)

The answer for libpthread is for it to dlopen libgcc_s when loaded rather
than at pthread_cancel time (or to be made to depend (DT_NEEDED) on
libgcc_s in a way that doesn't require libgcc_s to be available when
libpthread is built). The answer for other cases is to disable the
backtracing by default as discussed in bug 12189 (possibly with an
environment variable, not available in setuid programs, that can reenable
it - in which case glibc would dlopen libgcc_s at startup).
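
A rough sketch of such an opt-in check, assuming a made-up variable name passed in by the caller (glibc defines no such variable) and using a simple real-versus-effective ID comparison to stand in for glibc's own handling of setuid programs:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the environment variable env_name is set to "1" and the
   process is not running setuid/setgid; returns 0 otherwise.
   env_name is a hypothetical opt-in variable chosen by the caller. */
int backtrace_requested(const char *env_name)
{
    assert(env_name != NULL);
    if (getuid() != geteuid() || getgid() != getegid())
        return 0;   /* ignore the environment when privileged */
    const char *value = getenv(env_name);
    return value != NULL && strcmp(value, "1") == 0;
}
```

A startup hook could call this once and, only when it returns 1, load the unwinder before any heap corruption can have occurred.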

In Sourceware.org Bugzilla #16159, Neleai (neleai) wrote on 2013-11-13:

#17

On Wed, Nov 13, 2013 at 04:12:53PM +0000, joseph at codesourcery dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=16159
>
> --- Comment #11 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
> On Wed, 13 Nov 2013, bugdal at aerifal dot cx wrote:
>
> > Carlos, this is yet another reason why dlopen'ing libgcc_s is simply the wrong
> > thing to do, and libgcc_eh should be static-linked into libc. (The other big
>
> Static-linking libgcc_eh into any glibc library is a bad idea because it
> complicates bootstrapping: it means glibc built with an initial bootstrap
> compiler (which was built without glibc headers available, implying full
> EH functionality is not present in libgcc) is not identical to glibc built
> with a compiler built using full shared glibc and headers. (It's *also* a
> bad idea because new compilers can start using new DWARF unwind opcodes
> that an old copy of the unwind code won't understand, causing problems
> using new programs with old glibc.)
>
Why did you jump from dlopening to static linking? Dynamic linking would
work, and if there is concern that the user does not have the library, we
could provide a stub implementation and a function to test whether we are
dealing with the stub or the real one.

In Sourceware.org Bugzilla #16159, Bugdal (bugdal) wrote on 2013-11-13:

#18

Joseph, the bootstrapping issue can presumably be fixed (and bootstrapping made easier) simply by providing a way to install headers without building glibc. This may even allow you to shave one or more steps off of the full bootstrap process.

As for the issue of new DWARF opcodes, if they prevent older unwind code from being able to interpret the unwind information at all (rather than just failing to take advantage of the new features) that seems like a fundamental design bug elsewhere that should be reported. I'm not clear whether or not that's really the case.

With that said, I find your alternate fix proposal acceptable. For the libpthread issue, I believe the DT_NEEDED could be generated at build time using a fake libgcc_s.so.1 in the glibc source tree. As for disabling backtrace by default, that's perfectly acceptable. Alternatively, glibc could always attempt to load libgcc_s.so.1 at startup and disable backtrace if it's not found.

In Sourceware.org Bugzilla #16159, Bugdal (bugdal) wrote on 2013-11-13:

#19

Ondrej, is having glibc contain a DT_NEEDED entry for libgcc_s.so.1 really an option that's on the table? I think this would also run into the bootstrapping issues Joseph and others may be concerned about, as well as hurting load-time performance.

In Sourceware.org Bugzilla #16159, Joseph-codesourcery (joseph-codesourcery) wrote on 2013-11-13:

#20

On Wed, 13 Nov 2013, neleai at seznam dot cz wrote:

> Why did you jump from dlopening to static linking? Dynamic linking would
> work and if there is concern that user does not have one we could
> provide a stub implementation and function to test if we deal with stub
> or real one.

I don't think default dlopening libgcc_s from libc at startup is desirable
on performance grounds (most programs will never need it), whereas from
libpthread it's likely to be less significant.

In Sourceware.org Bugzilla #16159, Joseph-codesourcery (joseph-codesourcery) wrote on 2013-11-13:

#21

On Wed, 13 Nov 2013, bugdal at aerifal dot cx wrote:

> Joseph, the bootstrapping issue can presumably be fixed (and bootstrapping made
> easier) simply by providing a way to install headers without building glibc.

There already is. But to install the correct set of headers (some
generated at build time) you first need an appropriately configured
compiler to configure glibc. That's the old three-compiler bootstrap
process: first build a basic compiler, then install headers with it and
crt*.o and build a dummy libc.so, then build a second compiler with shared
libgcc, then build glibc, then build a third compiler. I changed things
in glibc and GCC so that a two-compiler process suffices: the initial
compiler built without headers can build glibc and the result is identical
to what you get if you repeatedly alternate GCC and glibc builds.
(Ideally you'd have a one-compiler process, where the second compiler
build only builds/rebuilds GCC's runtime libraries where they depend on
system headers or shared glibc, not GCC itself.)

In Sourceware.org Bugzilla #16159, Neleai (neleai) wrote on 2013-11-14:

#22

Joseph, do you have a benchmark to measure libgcc overhead?

I tried the following:

cat > x.c <<'EOF'
int main()
{
return 42;
}
EOF
gcc x.c -O3 -o nogcc
gcc x.c -O3 -lgcc -o withgcc
time for I in `seq 1 10000`; do ./nogcc; done
time for I in `seq 1 10000`; do ./withgcc; done

And I cannot distinguish these from noise. When I linked with -lpthread there was a noticeable slowdown.

In Sourceware.org Bugzilla #16159, Bugdal (bugdal) wrote on 2013-11-14:

#23

Ondrej, did you even check your results with readelf or ldd? -lgcc is a static library and is always linked, so of course it won't make any difference. You need to test with -lgcc_s (and double-check to make sure the dependency really got added).

BTW, I'm not sure how well your test will do measuring exec time versus other overhead. If you'd like, I have a test I can post that execs itself and measures the actual time from just before the execve syscall to the start of main.
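
The test mentioned here is not shown in this thread. As a rough idea of how such a measurement could look, here is a sketch under stated assumptions: it is Linux-specific (it re-executes itself through /proc/self/exe), it relies on CLOCK_MONOTONIC being comparable across the exec, and `now_ns` and `exec_latency_ns` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds. */
int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Call first thing in main(). Without a stamp argument, record the time and
   re-exec this binary with the stamp as argv[1]; with one, return the
   nanoseconds elapsed since the stamp. Returns -1 if execv() fails. */
int64_t exec_latency_ns(int argc, char **argv)
{
    assert(argc >= 1 && argv != NULL);
    if (argc < 2) {
        char stamp[32];
        snprintf(stamp, sizeof stamp, "%" PRId64, now_ns());
        char *args[] = { argv[0], stamp, NULL };
        execv("/proc/self/exe", args);   /* Linux-specific path to ourselves */
        return -1;
    }
    return now_ns() - (int64_t)strtoll(argv[1], NULL, 10);
}
```

A main() that prints the value returned by exec_latency_ns(argc, argv) could then be built once with and once without -lgcc_s and the two figures compared over many runs.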

In Sourceware.org Bugzilla #16159, Neleai (neleai) wrote on 2013-11-14:

#24

On Thu, Nov 14, 2013 at 03:54:30PM +0000, bugdal at aerifal dot cx wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=16159
>
> --- Comment #18 from Rich Felker <bugdal at aerifal dot cx> ---
> Ondrej, did you even check your results with readelf or ldd? -lgcc is a static
> library and is always linked, so of course it won't make any difference. You
> need to test with -lgcc_s (and double-check to make sure the dependency really
> got added).
>
I asked for a benchmark because of that; with -lgcc_s there is a difference.

plain

real 0m3.039s
user 0m0.195s
sys 0m3.049s

with lgcc_s

real 0m3.141s
user 0m0.169s
sys 0m3.179s

with lpthread

real 0m3.282s
user 0m0.182s
sys 0m3.308s

> BTW, I'm not sure how well your test will do measuring exec time versus other
> overhead. If you'd like, I have a test I can post that execs itself and
> measures the actual time from just before the execve syscall to the start of
> main.
>
These also count, as I wanted to show the relative performance impact. If
this is taken to the extreme, we could improve performance by statically linking -lm and -lpthread.

Or by using prelink.

In Sourceware.org Bugzilla #16159, Bugdal (bugdal) wrote on 2013-11-14:

#25

On Thu, Nov 14, 2013 at 04:47:48PM +0000, neleai at seznam dot cz wrote:
> These also count as I wanted to show a relative performance impact. If

I agree this approach makes sense, but the relative performance impact
could change when the program (possibly linked with libgcc_s) is
invoked via posix_spawn or vfork+exec from a high-load server versus
as part of an inefficient shell script where the shell may have a lot
of additional syscall overhead on each command (this might also vary
between shells; dash or busybox ash might perform very differently
from bash). So while we may not care about the most extreme impact, I
think it's important to consider how large the relative overhead is
when the invocation conditions are a low-overhead, real-world
scenario.

> this is taken into extreme we could improve performance by statically linking lm
> and lpthread

Yes, of course -- actually, I would recommend merging all of the glibc
.so's into libc.so, but I understand that the current situation with
symbol versions greatly complicates this, and that there might be
other issues. It would certainly improve load-time performance and
memory overhead for small programs, though. But I think this is
outside the scope of this bug report. The interest in looking at
performance here is asking whether a proposed change would make
performance noticeably worse (a regression), not how we can best
optimize startup performance.

In Sourceware.org Bugzilla #16159, Eric Blake (eblake) wrote on 2013-11-28:

#26

(In reply to Darryl Miles from comment #6)
> How do I stop this fancy backtrace stuff from working ? I want to setup an
> environment variable to turn it off as a workaround ?

According to:
https://lists.gnu.org/archive/html/bug-gnulib/2013-11/msg00103.html
setting MALLOC_CHECK_=2 in the environment is sufficient to prevent the error message attempts; but that sounds like something you set at program start rather than something we can do via setenv() at the time of reporting the first error (because setenv uses malloc).

> There is no need to actually fix the bug, you are over thinking the issue.

Yes, there IS a need to fix something. The link above points to a case of a user who is unhappy that their ./configure failed because the conftest program hung after tickling a malloc corruption bug in regex. Configure should never hang (thankfully, configure tests are one case where the MALLOC_CHECK_=2 trick may be sufficient: someone probing for known glibc bugs doesn't care about a backtrace, only about a successful exit status).
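
Since the variable has to be in place before the program starts, one way to apply the trick is from a small launcher that sets it in the child just before exec. The following is a sketch only; `run_with_malloc_check` is a hypothetical helper, and the return-value convention (128 plus the signal number) is an arbitrary choice made here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up in PATH) with MALLOC_CHECK_=2 in its environment.
   Returns the child's exit status, 128 + signal number if it was killed by
   a signal, or -1 if fork() or waitpid() failed. */
int run_with_malloc_check(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: set the variable before the target program starts. */
        setenv("MALLOC_CHECK_", "2", 1);
        execvp(argv[0], argv);
        _exit(127);   /* exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return WEXITSTATUS(status);
}
```

With this convention, a child that aborts on heap corruption shows up as 128 + SIGABRT rather than as a hang.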

Maarten Lankhorst (mlankhorst) on 2014-01-06

Changed in binutils (Ubuntu):
importance:	Undecided → High

Launchpad Janitor (janitor) wrote on 2014-01-06:

#1

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in binutils (Ubuntu):
status:	New → Confirmed

Matthias Klose (doko) wrote on 2014-01-10:

#2

While ld shouldn't stall, xorg-server shouldn't call configure with these flags.

The reason for this is some hardening bits in debian/rules without a build-dependency on hardening-wrapper.

Changed in xorg-server (Ubuntu):
importance:	Undecided → Critical
milestone:	none → ubuntu-14.01
status:	New → Triaged

Matthias Klose (doko) wrote on 2014-01-10:

#3

same with evolution-data-server

Changed in evolution-data-server (Ubuntu Trusty):
importance:	Undecided → Critical
milestone:	none → ubuntu-14.01
status:	New → Triaged

Matthias Klose (doko) on 2014-01-14

tags:

added: ftbfs

Steve Beattie (sbeattie) wrote on 2014-01-28:

#4

So it turns out that ld hanging on the backtrace is actually glibc bug https://sourceware.org/bugzilla/show_bug.cgi?id=16159 getting tickled. Setting the MALLOC_CHECK_ environment variable to 2 keeps it from hanging, because glibc then no longer tries to emit the backtrace and so never deadlocks reacquiring the malloc lock:

  $ MALLOC_CHECK_=2 gcc -o /tmp/conftest -fPIE -pie -static conftest.c
  collect2: error: ld terminated with signal 6 [Aborted], core dumped
  /usr/bin/ld: BFD (GNU Binutils for Ubuntu) 2.24 assertion fail ../../bfd/elflink.c:13053
  /usr/bin/ld: BFD (GNU Binutils for Ubuntu) 2.24 assertion fail ../../bfd/elflink.c:13053
  /usr/bin/ld: BFD (GNU Binutils for Ubuntu) 2.24 assertion fail ../../bfd/elflink.c:13053
  /usr/bin/ld: BFD (GNU Binutils for Ubuntu) 2.24 assertion fail ../../bfd/elflink.c:13053
  /usr/bin/ld: BFD (GNU Binutils for Ubuntu) 2.24 assertion fail ../../bfd/elflink.c:13053

Bug Watch Updater (bug-watch-updater) on 2014-01-28

Changed in eglibc:
importance:	Unknown → Medium
status:	Unknown → Incomplete

Steve Beattie (sbeattie) wrote on 2014-01-29:

#27

ureadahead_0.100.0-17.debdiff (2.2 KiB, text/plain)

Unfortunately, the prescribed workaround, adding hardening-wrapper as a build dependency, doesn't always work, and it's not clear why it does work occasionally. First, in order for hardened-cc to do anything at all, DEB_BUILD_HARDENING needs to be set. Second, if it detects '-static' or other arguments incompatible with position-independent executables, it only refrains from adding -pie itself; it does not filter -pie out of the command line if it's already there. In these cases, -pie is already present, having been added via DEB_BUILD_MAINT_OPTIONS or some other way in the debian/rules file.

The most proper way that I can see to address this would be to rely on the default dpkg-buildflags to get the basic level of protections. Then to get all the protections, build depend on hardening-wrapper and export DEB_BUILD_HARDENING=1 in debian/rules. I've attached a debdiff that I've verified builds on all available architectures for ureadahead, since that package is also hitting this issue.

The least invasive workaround would be to export MALLOC_CHECK_=2 at build time (i.e. in debian/rules), as this causes glibc to abort without attempting to produce a backtrace when it detects internal malloc corruption. This unfortunately still leaves configure believing that 'gcc -static' doesn't work, but it at least causes builds not to hang.

Steve Beattie (sbeattie) wrote on 2014-01-29:

#28

evolution-data-server_3.10.3-0ubuntu2.debdiff (2.1 KiB, text/plain)

Here's a similar debdiff for e-d-s, confirmed to build on i386.

Ubuntu Foundations Team Bug Bot (crichton) wrote on 2014-01-29:

#29

The attachment "ureadahead_0.100.0-17.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags:

added: patch

Matthias Klose (doko) wrote on 2014-01-29: Re: [Bug 1266492] Re: ld:i386 crashes with -static -fPIE -pie

#30

Am 29.01.2014 09:39, schrieb Steve Beattie:
> The most proper way that I can see to address this would be to rely on
> the default dpkg-buildflags to get the basic level of protections. Then
> to get all the protections, build depend on hardening-wrapper and export
> DEB_BUILD_HARDENING=1 in debian/rules. I've attached a debdiff that I've
> verified builds on all available architectures for ureadahead, since
> that package is also hitting this issue.

Yes, that was the fix I had in mind.

> The least invasive workaround would be to export MALLOC_CHECK_=2 at build
> time (i.e. in debian/rules), as this causes glibc to abort without
> attempting to produce a backtrace when it detects internal malloc
> corruption. This unfortunately still leaves configure believing that
> 'gcc -static' doesn't work, but it at least causes builds not to hang.

sure, that would mimic the behaviour we did see before the glibc update, but
papering over the original issue. In practice this shouldn't be an issue
because we only build static binaries in very few cases.

Steve Beattie (sbeattie) wrote on 2014-01-30:

#31

local-no-malloc-backtrace.diff (1.8 KiB, text/plain)

Here's a patch to glibc to set the default value of MALLOC_CHECK_ to 1 (from 3). By doing so, the malloc specific error passed to malloc_printerr() will still be displayed by default, but libc will not attempt to generate a backtrace, which is what is causing the deadlock to occur. Even if the deadlock weren't a problem, it's also valuable from a security perspective, as attempting to malloc() from the same pool that libc has already detected an attacker has corrupted is likely unsafe, and may grant an attacker a chance to regain control. This is also the reason for adding the MALLOC_CHECK_ variable to the list of environment variables for filtering when setuid/setgid programs are invoked.

People wishing to see the backtrace for debugging purposes can get the old default behavior back by setting MALLOC_CHECK_=3 in their environment.
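
For reference, the values discussed here can be read as a bit mask. The sketch below decodes them the way the mallopt(3) manual page describes M_CHECK_ACTION for upstream glibc of this era; it is not taken from the attached patch, whose behavior may differ, and `decode_check_action` is a hypothetical helper.

```c
/* What a given MALLOC_CHECK_ / M_CHECK_ACTION value asks for. */
struct check_action {
    int prints_message;  /* bit 0: print an error message */
    int aborts;          /* bit 1: call abort() afterwards */
    int simple_message;  /* bit 2: shorter message, only with bit 0 */
    int backtraces;      /* bits 0 and 1 together: stack trace as well */
};

/* Decode the three least significant bits of value. */
struct check_action decode_check_action(int value)
{
    struct check_action a;
    a.prints_message = (value & 1) != 0;
    a.aborts         = (value & 2) != 0;
    a.simple_message = a.prints_message && (value & 4) != 0;
    a.backtraces     = a.prints_message && a.aborts;
    return a;
}
```

Read this way, 3 (the old default) prints the message and the backtrace before aborting, 2 aborts silently, and 1 prints the message without a backtrace.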

I've verified that eglibc builds fine with this change, and that xorg-server 2:1.14.5-1ubuntu2 (not containing the workaround that Martin added in 2:1.14.5-1ubuntu3, thus would normally trigger the ld/glibc hang on i386) also builds fine when built against eglibc with this patch on all arches.

Fixing this of course doesn't address the binutils bug where ld is corrupting malloc space, or the dpkg-buildflags hardening flaw around -static and -pie (doko, is there a bug already for that?), but it will stop builds from hanging.

Note that I don't have upload privileges, so all my patches will need to be sponsored.

Launchpad Janitor (janitor) wrote on 2014-02-03:

#32

This bug was fixed in the package xorg-server - 2:1.14.5-1ubuntu4

---------------
xorg-server (2:1.14.5-1ubuntu4) trusty; urgency=medium

  * Build xserver-xorg-core-udeb on arm64 and ppc64el.
-- Colin Watson <email address hidden> Mon, 03 Feb 2014 15:44:50 +0000

Changed in xorg-server (Ubuntu Trusty):
status:	Triaged → Fix Released

Iain Lane (laney) wrote on 2014-02-11:

#33

Thanks, I committed the e-d-s fix to bzr so it should be in the next upload. Unsubscribing sponsors since there's nothing left here to sponsor.

Changed in evolution-data-server (Ubuntu Trusty):
status:	Triaged → Fix Committed

Adam Conrad (adconrad) on 2014-02-23

affects:

eglibc → glibc

Launchpad Janitor (janitor) wrote on 2014-02-24:

#34

This bug was fixed in the package eglibc - 2.19-0ubuntu2

---------------
eglibc (2.19-0ubuntu2) trusty; urgency=medium

  * Merge with unreleased 2.19 from Debian experimental, fixing some bugs:
    - debian/patches/any/local-no-malloc-backtrace.diff: Lower the default
      for MALLOC_CHECK_ to 1, and add it to the list of insecure variables
      that can't be set for suid binaries. This allows us to not backtrace
      malloc failures by default (Closes: #739913, LP: #1266492) and skips
      backtrace for suid binaries where an attacker calling into a corrupt
      malloc internal data structure with malloc could lead to Bad Things.
    - Make ldconfig stop operating on the linker entirely, so our packaged
      symlinks take precedence and hack the postinst to skip ldconfig when
      we detect a broken setup that the old ldconfig mangles (LP: #915995)
-- Adam Conrad <email address hidden> Sun, 23 Feb 2014 22:39:18 -0700

Changed in eglibc (Ubuntu Trusty):
status:	New → Fix Released

Brian Murray (brian-murray) wrote on 2014-04-15:

#35

There never was another upload of evolution-data-server so that task remains unfixed.

Changed in evolution-data-server (Ubuntu Trusty):
milestone:	ubuntu-14.01 → ubuntu-14.04.1

Launchpad Janitor (janitor) wrote on 2014-05-22:

#36

This bug was fixed in the package evolution-data-server - 3.10.4-0ubuntu2

---------------
evolution-data-server (3.10.4-0ubuntu2) utopic; urgency=low

  * debian/patches/git_ews_locking.patch: backport a fix for a bug that
    sometimes makes the client freeze with ews calendars (lp: #1311213)

  [ Steve Beattie ]
  * debian/control: build depend on hardening-wrapper
  * debian/rules: reenable hardening via hardening-wrapper and
    DEB_BUILD_HARDENING as a workaround for configure hanging when
    checking gcc's -static option. (LP: #1266492)
-- Sebastien Bacher <email address hidden> Mon, 12 May 2014 18:12:17 +0200

Changed in evolution-data-server (Ubuntu):
status:	Fix Committed → Fix Released

CSRedRat (csredrat) wrote on 2014-06-23:

#37

When will this be fixed in 14.04 Trusty Tahr? In time for 14.04.1 (24 July)?

Many "critical" bugs on the Trusty Tahr ReleaseNotes page are still not fixed: https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes#Known_issues
Installation bugs too:
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1066480
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1172572
https://bugs.launchpad.net/ubuntu/+source/graphite2/+bug/1303516
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1297851
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1066342
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1172161
https://bugs.launchpad.net/ubuntu/+source/console-setup/+bug/1297234

Upgrade bugs:
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1308530
https://bugs.launchpad.net/ubuntu/+source/tex-common/+bug/1304972
https://bugs.launchpad.net/ubuntu/+source/flightgear/+bug/1308338

And others:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1054732
https://bugs.launchpad.net/unity/+bug/1305586
https://bugs.launchpad.net/ubuntu/+source/gnome-keyring/+bug/1271591
https://bugs.launchpad.net/ubuntu/+source/unity/+bug/1308037
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1305522
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1308761
https://bugs.launchpad.net/ubuntu/+source/unity-greeter/+bug/1292467

There are too many errors in Trusty on http://errors.ubuntu.com
Please fix the most annoying bugs. They interfere with work and everyday use, and make it hard to recommend the distribution to friends.

Mathew Hodson (mhodson) wrote on 2014-07-08:

#38

There never was another upload of evolution-data-server committed for trusty.

Changed in evolution-data-server (Ubuntu Trusty):
status:	Fix Committed → Confirmed

Sebastien Bacher (seb128) wrote on 2014-09-25:

#39

seems like there is nothing there to sponsor, unsubscribing the sponsors

Mathew Hodson (mhodson) on 2015-09-17

Changed in evolution-data-server (Ubuntu Trusty):
milestone:	ubuntu-14.04.1 → trusty-updates
Changed in xorg-server (Ubuntu):
milestone:	ubuntu-14.01 → none
Changed in evolution-data-server (Ubuntu):
milestone:	ubuntu-14.04.1 → none
Changed in eglibc (Ubuntu):
importance:	Undecided → Medium
Changed in eglibc (Ubuntu Trusty):
importance:	Undecided → Medium

Bug Watch Updater (bug-watch-updater) on 2015-09-17

Changed in eglibc (Debian):
status:	Unknown → Fix Released

Mathew Hodson (mhodson) on 2015-10-28

Changed in evolution-data-server (Ubuntu Trusty):
status:	Confirmed → Triaged

Diego (dmggears3) on 2016-11-17

Changed in binutils (Ubuntu Trusty):
assignee:	nobody → Diego (dmggears3)

Rolf Leggewie (r0lf) wrote on 2020-02-12:

#40

closing task for trusty

Changed in evolution-data-server (Ubuntu Trusty):
status:	Triaged → Invalid
Changed in binutils (Ubuntu Trusty):
status:	Confirmed → Invalid

Affects                                Status        Importance  Assigned to             Milestone
GLibC                                  Incomplete    Medium      sourceware-bugs #16159
binutils (Ubuntu)                      Confirmed     High        Unassigned
binutils (Ubuntu Trusty)               Invalid       High        Diego
eglibc (Debian)                        Fix Released  Unknown     debbugs #739913
eglibc (Ubuntu)                        Fix Released  Medium      Unassigned
eglibc (Ubuntu Trusty)                 Fix Released  Medium      Unassigned
evolution-data-server (Ubuntu)         Fix Released  Critical    Unassigned
evolution-data-server (Ubuntu Trusty)  Invalid       Critical    Unassigned              Ubuntu trusty-updates
xorg-server (Ubuntu)                   Fix Released  Critical    Unassigned
xorg-server (Ubuntu Trusty)            Fix Released  Critical    Unassigned              Ubuntu ubuntu-14.01