Firefox built with gcc-4.5 is a non-starter on i386 with -pie

Bug #663294 reported by Chris Coulson on 2010-10-19
32
This bug affects 3 people
Affects Status Importance Assigned to Milestone
binutils
Fix Released
Medium
binutils (Ubuntu)
Medium
Matthias Klose
Natty
Medium
Matthias Klose
firefox (Ubuntu)
Medium
Chris Coulson
Natty
Medium
Chris Coulson

Bug Description

NOTE: See comment 17 for a test case, and following comments for analysis.

Binary package hint: gcc-4.5

I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.

It is hanging here:

#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2

...which is happening even before entering main()

If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:

$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
  __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}

...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():

static inline void *
imalloc(size_t size)
{

 assert(size != 0);

 if (size <= arena_maxclass)
  return (arena_malloc(choose_arena(), size, false));
 else
  return (huge_malloc(size, false));

Note that in this case, size < arena_maxclass.

choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:

(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
  __align = 0}

(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac

The mutex looks correctly initialized here, and is unlocked.

If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:

(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0

...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc

Also:

(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
  __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}

So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;

(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99

Here is the interesting part of malloc disassembled:

   0xf7ff66a0 <+0>: push %ebp
   0xf7ff66a1 <+1>: mov %esp,%ebp
   0xf7ff66a3 <+3>: sub $0x58,%esp
   0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
   0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
   0xf7ff66ae <+14>: add $0x880e,%ebx
   0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
   0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
   0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
   0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
   0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
   0xf7ff66c6 <+38>: test %esi,%esi
   0xf7ff66c8 <+40>: mov $0x1,%eax
   0xf7ff66cd <+45>: cmove %eax,%esi
   0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
   0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
   0xf7ff66d8 <+56>: mov %esi,%eax
   0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
   0xf7ff66df <+63>: mov %eax,%esi
   0xf7ff66e1 <+65>: test %esi,%esi
   0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
   0xf7ff66e5 <+69>: mov %esi,%eax
   0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
   0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
   0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
   0xf7ff66f0 <+80>: mov %ebp,%esp
   0xf7ff66f2 <+82>: pop %ebp
   0xf7ff66f3 <+83>: ret
   0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
   0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
   0xf7ff66fd <+93>: test %al,%al
   0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
   0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
   0xf7ff6706 <+102>: xor %esi,%esi
   0xf7ff6708 <+104>: movl $0xc,(%eax)
   0xf7ff670e <+110>: mov %esi,%eax
   0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
   0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
   0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
   0xf7ff6719 <+121>: mov %ebp,%esp
   0xf7ff671b <+123>: pop %ebp
   0xf7ff671c <+124>: ret
   0xf7ff671d <+125>: lea 0x0(%esi),%esi
   0xf7ff6720 <+128>: mov %gs:0x0,%eax
   0xf7ff6726 <+134>: nop
   0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
   0xf7ff672b <+139>: lea 0x0,%edx
   0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
   0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
   0xf7ff6737 <+151>: test %edi,%edi
   0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
   0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
   0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
   0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
   0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
   0xf7ff6757 <+183>: sub $0x1,%esi
   0xf7ff675a <+186>: mov %esi,%edx
   0xf7ff675c <+188>: shr %edx
   0xf7ff675e <+190>: or %esi,%edx
   0xf7ff6760 <+192>: mov %edx,%eax
   0xf7ff6762 <+194>: shr $0x2,%eax
   0xf7ff6765 <+197>: or %edx,%eax
   0xf7ff6767 <+199>: mov %eax,%edx
   0xf7ff6769 <+201>: shr $0x4,%edx
   0xf7ff676c <+204>: or %eax,%edx
   0xf7ff676e <+206>: mov %edx,%eax
   0xf7ff6770 <+208>: shr $0x8,%eax
   0xf7ff6773 <+211>: or %edx,%eax
   0xf7ff6775 <+213>: mov $0xffffffff,%edx
   0xf7ff677a <+218>: mov %eax,%esi
   0xf7ff677c <+220>: shr $0x10,%esi
   0xf7ff677f <+223>: or %eax,%esi
   0xf7ff6781 <+225>: add $0x1,%esi
   0xf7ff6784 <+228>: mov %esi,%eax
   0xf7ff6786 <+230>: shr $0x2,%eax
   0xf7ff6789 <+233>: bsf %eax,%eax
   0xf7ff678c <+236>: cmove %edx,%eax
   0xf7ff678f <+239>: cmp $0x1,%esi
   0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
   0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
   0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
   0xf7ff67a0 <+256>: mov $0x2,%eax
   0xf7ff67a5 <+261>: cmova %esi,%eax
   0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
   0xf7ff67ab <+267>: mov %edi,(%esp)
   0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>

So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).

From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:

   0xf7ff6720 <+128>: mov %gs:0x0,%eax
   0xf7ff672b <+139>: lea 0x0,%edx
   0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi

I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:

(gdb) print *0xf3a4faac
$10 = −208347070

...which is 0xf394e040

I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4

John Vivirito (gnomefreak) wrote :

This is confirmed as this also happens on xulrunner not configuring during updating packages.

Changed in gcc-4.5 (Ubuntu):
status: New → Confirmed
description: updated
Chris Coulson (chrisccoulson) wrote :

Note that I tried Debian's build of GCC, with the same issue. But, if I build the source straight from mozilla-central, then I don't have any issues (with either build of GCC).

Building our package without DEB_BUILD_HARDENING makes the problem go away. From looking at the hardening wrapper script, this narrows the problem down to -fstack-protector or -D_FORTIFY_SOURCE=2

Chris Coulson (chrisccoulson) wrote :
Download full text (4.9 KiB)

Ok, with a good build, I see the following:

(gdb) print arenas[0]
$1 = (arena_t *) 0xf58d2040
(gdb) print &arenas[0]
$2 = (arena_t **) 0xf58d2000

Inside arena_malloc:

(gdb) print arena
$3 = (arena_t *) 0xf58d2040

Yay \o/

And:

(gdb) info registers
eax 0xf59d3870 -174245776
ecx 0x3 3
edx 0xfffffffc -4
ebx 0x80548e8 134564072
esp 0xffffd484 0xffffd484
ebp 0xffffd4dc 0xffffd4dc
esi 0x160 352
edi 0xf58d2040 -175300544
eip 0x804f0af 0x804f0af <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99

And the interesting bit of malloc disassembled:

   0x0804f010 <+0>: push %ebp
   0x0804f011 <+1>: mov %esp,%ebp
   0x0804f013 <+3>: sub $0x58,%esp
   0x0804f016 <+6>: mov %ebx,-0xc(%ebp)
   0x0804f019 <+9>: call 0x8049853 <__i686.get_pc_thunk.bx>
   0x0804f01e <+14>: add $0x58ca,%ebx
   0x0804f024 <+20>: mov %esi,-0x8(%ebp)
   0x0804f027 <+23>: mov 0x8(%ebp),%esi
   0x0804f02a <+26>: mov %edi,-0x4(%ebp)
   0x0804f02d <+29>: cmpb $0x0,0x1230(%ebx)
   0x0804f034 <+36>: je 0x804f068 <malloc+88>
   0x0804f036 <+38>: test %esi,%esi
   0x0804f038 <+40>: jne 0x804f03e <malloc+46>
   0x0804f03a <+42>: mov $0x1,%si
   0x0804f03e <+46>: cmp 0x194(%ebx),%esi
   0x0804f044 <+52>: jbe 0x804f090 <malloc+128>
   0x0804f046 <+54>: mov %esi,%eax
   0x0804f048 <+56>: call 0x804e5f0 <huge_malloc>
   0x0804f04d <+61>: mov %eax,%esi
   0x0804f04f <+63>: test %esi,%esi
   0x0804f051 <+65>: je 0x804f071 <malloc+97>
   0x0804f053 <+67>: mov %esi,%eax
   0x0804f055 <+69>: mov -0xc(%ebp),%ebx
   0x0804f058 <+72>: mov -0x8(%ebp),%esi
   0x0804f05b <+75>: mov -0x4(%ebp),%edi
   0x0804f05e <+78>: mov %ebp,%esp
   0x0804f060 <+80>: pop %ebp
   0x0804f061 <+81>: ret
   0x0804f062 <+82>: lea 0x0(%esi),%esi
   0x0804f068 <+88>: call 0x804cee0 <malloc_init_hard>
   0x0804f06d <+93>: test %al,%al
   0x0804f06f <+95>: je 0x804f036 <malloc+38>
   0x0804f071 <+97>: call 0x804929c <__errno_location@plt>
   0x0804f076 <+102>: xor %esi,%esi
   0x0804f078 <+104>: movl $0xc,(%eax)
   0x0804f07e <+110>: mov %esi,%eax
   0x0804f080 <+112>: mov -0xc(%ebp),%ebx
   0x0804f083 <+115>: mov -0x8(%ebp),%esi
   0x0804f086 <+118>: mov -0x4(%ebp),%edi
   0x0804f089 <+121>: mov %ebp,%esp
   0x0804f08b <+123>: pop %ebp
   0x0804f08c <+124>: ret
   0x0804f08d <+125>: lea 0x0(%esi),%esi
   0x0804f090 <+128>: mov %gs:0x0,%eax
   0x0804f096 <+134>: nop
   0x0804f097 <+135>: lea 0x0(%esi,%eiz,1),%esi
   0x0804f09b <+139>: lea 0xfffffffc,%edx
   0x0804f0a1 <+145>: mov %edx,-0x1c(%ebp)
   0x0804f0a4 <+148>: mov (%edx,%eax,1),%edi
   0x0804f0a7 <+151>: test %edi,%edi
   0x0804f0a9 <+153>: je 0x804f2e0 <malloc+720>
   0x0804f0af <+159>: cmp 0x120c(%ebx),%esi
   0x0804f0b5 <+165>: ja 0x804f220 <malloc+528>
   0x0804f0bb <+171>: cmp 0x1204(%ebx),%esi
   0x0804f0c1 <+177>: jae 0x804f1d8 <malloc+456>
   0x0804f0c7 <+183>: sub $0x1,%esi
   ...

Read more...

Changed in gcc-4.5 (Ubuntu):
importance: Undecided → High
Chris Coulson (chrisccoulson) wrote :

Ok, it doesn't seem to be -fstack-protector

Chris Coulson (chrisccoulson) wrote :

Right, narrowed this down to PIE now. I did 2 parallel builds of mozilla-central, both with DEB_BUILD_HARDENING=1, and 1 of these with DEB_BUILD_HARDENING_PIE=0.

The build with DEB_BUILD_HARDENING_PIE=0 works fine, and the other build deadlocks at startup

Chris Coulson (chrisccoulson) wrote :

Just to close the loop entirely on this, as I showed the instructions used when reading the pointer stored in arenas_map, here are the instructions which write that value, which clearly show it writes to a different address to the one read from:

    5552: e8 39 ed ff ff call 4290 <arenas_extend>
    5557: 8b 83 c4 01 00 00 mov 0x1c4(%ebx),%eax
    555d: 8b 30 mov (%eax),%esi
    555f: 85 f6 test %esi,%esi
    5561: 0f 84 03 05 00 00 je 5a6a <.L488+0x14a>
    5567: 65 a1 00 00 00 00 mov %gs:0x0,%eax
    556d: 81 e8 04 00 00 00 sub $0x4,%eax
    5573: 89 30 mov %esi,(%eax)

description: updated
description: updated
Chris Coulson (chrisccoulson) wrote :

I've recreated the same issue with gcc-snapshot 20101012-0ubuntu1 as well now

Chris Coulson (chrisccoulson) wrote :

Here is the output of objdump -d firefox-bin with gcc-snapshot and -pie

Chris Coulson (chrisccoulson) wrote :

The disassembly shows the same behaviour with gcc-snapshot. In malloc_init_hard, arenas_map is initialized with arenas[0] here:

    5626: e8 85 eb ff ff call 41b0 <arenas_extend>
    562b: 8b 83 60 12 00 00 mov 0x1260(%ebx),%eax
    5631: 8b 30 mov (%eax),%esi
    5633: 85 f6 test %esi,%esi
    5635: 0f 84 7b 04 00 00 je 5ab6 <.L517+0x12e>
    563b: 65 a1 00 00 00 00 mov %gs:0x0,%eax
    5641: 81 e8 04 00 00 00 sub $0x4,%eax
    5647: bd 20 00 00 00 mov $0x20,%ebp
    564c: bf ab aa aa aa mov $0xaaaaaaab,%edi
    5651: 89 30 mov %esi,(%eax)

And arenas[0] is initialized here to the return value of base_alloc in arenas_extend:

    4509: 8b 93 60 12 00 00 mov 0x1260(%ebx),%edx
    450f: 8b 44 24 28 mov 0x28(%esp),%eax
    4513: 8d 04 82 lea (%edx,%eax,4),%eax
    4516: 8b 54 24 18 mov 0x18(%esp),%edx
    451a: 89 10 mov %edx,(%eax)

...with base_alloc being called here:

    41e2: e8 59 fe ff ff call 4040 <base_alloc>
    41e7: 85 c0 test %eax,%eax
    41e9: 89 44 24 18 mov %eax,0x18(%esp)

Again, in malloc, this is where pthread_mutex_lock is called (where it locks up);

    741f: 89 3c 24 mov %edi,(%esp)
    7422: e8 2d a0 ff ff call 1454 <pthread_mutex_lock@plt>

...with edi being initialized here:

    7398: 65 a1 00 00 00 00 mov %gs:0x0,%eax
    739e: 90 nop
    739f: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
    73a3: 8d 2d 00 00 00 00 lea 0x0,%ebp
    73a9: 8b 7c 05 00 mov 0x0(%ebp,%eax,1),%edi
    73ad: 85 ff test %edi,%edi

....which should be the same value as arenas_map, but is being initialized with a value from a memory location which is 4 bytes out

Chris Coulson (chrisccoulson) wrote :

Output of objdump -d firefox-bin with gcc-4.5 and no -pie

Chris Coulson (chrisccoulson) wrote :

As mentioned earlier, the major difference without -pie is that the value of edi is initialized correctly before it's contents are put on the stack before calling pthread_mutex_lock:

 804f250: 65 a1 00 00 00 00 mov %gs:0x0,%eax
 804f256: 90 nop
 804f257: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
 804f25b: 8d 15 fc ff ff ff lea 0xfffffffc,%edx
 804f261: 89 55 e4 mov %edx,-0x1c(%ebp)
 804f264: 8b 3c 02 mov (%edx,%eax,1),%edi
 804f267: 85 ff test %edi,%edi

Chris Coulson (chrisccoulson) wrote :

Here is the output of objdump -d firefox-4.0-bin with gcc4.4 and -pie

The major difference is with how pthread_mutex_lock is called in malloc:

    937d: 8b 4d e4 mov -0x1c(%ebp),%ecx
    9380: 89 0c 24 mov %ecx,(%esp)
    9383: e8 28 7f ff ff call 12b0 <pthread_mutex_lock@plt>

....the value loaded on to the stack comes from earlier, here:

    92f8: 65 a1 00 00 00 00 mov %gs:0x0,%eax
    92fe: 81 e8 04 00 00 00 sub $0x4,%eax
    9304: 8b 00 mov (%eax),%eax
    9306: 85 c0 test %eax,%eax
    9308: 89 45 e4 mov %eax,-0x1c(%ebp)

...note that the lea instruction has been replaced with a sub, and now loads the contents of the correct address in to eax

summary: - Firefox built with gcc-4.5 is a non-starter on i386
+ Firefox built with gcc-4.5 is a non-starter on i386 with -pie
Changed in gcc-4.5 (Ubuntu):
status: Confirmed → Fix Committed
Changed in gcc-4.5 (Ubuntu):
status: Fix Committed → Triaged
Kees Cook (kees) on 2011-02-15
Changed in gcc-4.5 (Ubuntu Natty):
milestone: none → natty-alpha-3
assignee: nobody → Canonical Desktop Team (canonical-desktop-team)
assignee: Canonical Desktop Team (canonical-desktop-team) → Canonical Foundations Team (canonical-foundations)
Kees Cook (kees) on 2011-02-15
Changed in firefox (Ubuntu Natty):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Desktop Team (canonical-desktop-team)
milestone: none → natty-alpha-3
Martin Pitt (pitti) on 2011-02-15
Changed in firefox (Ubuntu Natty):
assignee: Canonical Desktop Team (canonical-desktop-team) → Chris Coulson (chrisccoulson)
Martin Pitt (pitti) wrote :

This isn't actually a release blocker, as the current non-PIE package works. As this is a natty task, downgrading priority.

Changed in gcc-4.5 (Ubuntu Natty):
milestone: natty-alpha-3 → ubuntu-11.04-beta-1
Changed in firefox (Ubuntu Natty):
milestone: natty-alpha-3 → ubuntu-11.04-beta-1
importance: High → Medium
Changed in gcc-4.5 (Ubuntu Natty):
importance: High → Medium
Martin Pitt (pitti) wrote :

To unblock this, would it be appropriate to build firefox with gcc-4.4 on at least i386 for natty?

On 22.03.2011 18:19, Martin Pitt wrote:
> To unblock this, would it be appropriate to build firefox with gcc-4.4
> on at least i386 for natty?

we won't demote gcc-4.4 in natty, so this should be possible.

Micah Gersten (micahg) on 2011-03-22
tags: added: regression-release
tags: added: natty
Micah Gersten (micahg) wrote :

Thanks doko and pitti
I will do a test rebuild with gcc-4.4 to see if this will solve the issue for natty without introducing other issues and then we can retarget this for O.

Changed in firefox (Ubuntu Natty):
milestone: ubuntu-11.04-beta-1 → ubuntu-11.04-beta-2
Changed in gcc-4.5 (Ubuntu Natty):
milestone: ubuntu-11.04-beta-1 → ubuntu-11.04-beta-2
Chris Coulson (chrisccoulson) wrote :
Download full text (5.3 KiB)

Ok, this looks like a linker bug.

The information that is wrong in the final executable isn't coming from the compiler, but is added by the linker (although the relocations in the .rel.text section are coming from the compiler). So, I did two builds of Firefox again - 1 with gcc-4.5 and 1 with gcc-4.4 to compare the jemalloc.o objects. There is one obvious difference - gcc-4.4 outputs an object using global-dynamic TLS (I see lots of R_386_TLS_GD relocations .rel.text where arenas_map is accessed), and gcc-4.5 seems to be outputting an object with local-dynamic TLS access model (I see lots of R_386_TLS_LDM and R_386_TLS_LDO_32 relocations).

So, I've managed to write a small reproducer now:

#include <stdlib.h>
#include <stdio.h>

static __thread int a;
static int *c;

int main(int argc, char *argv[])
{
 a = 2;
 c = &a;
 printf("c=%d\n", *c);
}

You can save this as test.c and compile in various ways:

- "gcc -o test -fPIC test.c" - This generates an intermediate object with global-dynamic TLS (you can verify this if you split the compile/link steps and inspect the resulting object with readelf) which is passed to the linker, and works properly

- "gcc -o test -fPIC -pie test.c" - This generates a pie which works correctly

- "gcc -o test -ftls-model=local-dynamic -fPIC test.c" - This generates an intermediate object with local-dynamic TLS (again, verified by splitting the compile/link steps and inspecting the object with readelf), and the linker generates a working binary

- "gcc -o test -ftls-model=local-dynamic -fPIC -pie test.c" - The resulting binary crashes, and inspecting the resulting binary shows the same issue we have with Firefox (it's accessing the wrong location where it should be accessing the TLS variable)

Disassembling main from the broken binary shows:

000005ac <main>:
 5ac: 55 push %ebp
 5ad: 89 e5 mov %esp,%ebp
 5af: 83 e4 f0 and $0xfffffff0,%esp
 5b2: 56 push %esi
 5b3: 53 push %ebx
 5b4: 83 ec 18 sub $0x18,%esp
 5b7: e8 eb ff ff ff call 5a7 <__i686.get_pc_thunk.bx>
 5bc: 81 c3 38 1a 00 00 add $0x1a38,%ebx
 5c2: 65 a1 00 00 00 00 mov %gs:0x0,%eax
 5c8: 90 nop
 5c9: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
 5cd: 89 c6 mov %eax,%esi
 5cf: 8d 05 00 00 00 00 lea 0x0,%eax
 5d5: 8d 04 06 lea (%esi,%eax,1),%eax
 5d8: c7 00 02 00 00 00 movl $0x2,(%eax)
 5de: 65 a1 00 00 00 00 mov %gs:0x0,%eax
 5e4: 90 nop
 5e5: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
 5e9: 89 c6 mov %eax,%esi
 5eb: 8d 05 00 00 00 00 lea 0x0,%eax
 5f1: 8d 04 06 lea (%esi,%eax,1),%eax
 5f4: 89 83 30 00 00 00 mov %eax,0x30(%ebx)
 5fa: 8b 83 30 00 00 00 mov 0x30(%ebx),%eax
 600: 8b 10 mov (%eax),%edx
 602: 8d 83 f8 e6 ff ff lea -0x1908(%ebx),%eax
 608: 89 54 24 04 mov %edx,0x4(%esp)
 60c: 89 04 24 mov %eax,(%esp)
 60f: e8 70 fe ff ff call 484 ...

Read more...

affects: gcc-4.5 (Ubuntu Natty) → binutils (Ubuntu Natty)
Chris Coulson (chrisccoulson) wrote :

Note, the broken (pie) binary shows:

 5eb: 8d 05 00 00 00 00 lea 0x0,%eax

...where, the working (non pie) binary shows:

 8048493: 8d 05 fc ff ff ff lea 0xfffffffc,%eax

Chris Coulson (chrisccoulson) wrote :

Output of readelf -a for test.o built with gcc -c -o test.o -ftls-model=local-dynamic -fPIC test.c

Chris Coulson (chrisccoulson) wrote :

Output of readelf -a for test.o built with gcc -c -o test.o -fPIC test.c

Chris Coulson (chrisccoulson) wrote :

For those who are curious, here is the output of "readelf -a jemalloc.o" and "objdump -d jemalloc.o" for both gcc-4.4 and gcc-4.5

Chris Coulson (chrisccoulson) wrote :

Note, I just hacked the intermediate test.o object and inserted 4 bytes of garbage from offset 0x25 from the start of the .text section (which is where the linker applies a R_386_TLS_LDO_32 relocation, and seems to be the one which isn't applied correctly). Then I ran the linker on it, and confirmed that the relocation is actually applied incorrectly (rather than not being applied at all), as the garbage i inserted in to the intermediate object was overwritten.

So, it just seems to be that R_386_TLS_LDO_32 relocations aren't handled properly with -pie

Martin Pitt (pitti) wrote :

I ran Chris' test cases in sid/experimental:

gcc 4.5.2-8 (sid) and binutils 2.21.0.20110327-2 (sid): segfaults
gcc 4.6.0-2 (sid) and binutils 2.21.0.20110327-2 (sid): segfaults

gcc 4.5.2-8 (sid) and binutils 2.21.51.20110403-1 (experimental): segfaults
gcc 4.6.0-2 (sid) and binutils 2.21.51.20110403-1 (experimental): segfaults

There is no newer 4.5/4.6 gcc in experimental to try.

Martin Pitt (pitti) on 2011-04-07
description: updated
Download full text (5.0 KiB)

[ forwarded from https://launchpad.net/bugs/663294 ]

seen with the 2.21 branch and trunk, not seen when using gold (2.21 branch) as the linker. The original issue was seen with a firefox build. Report from Chris Coulson:

The information that is wrong in the final executable isn't coming from the compiler, but is added by the linker (although the relocations in the .rel.text section are coming from the compiler). So, I did two builds of Firefox again - 1 with gcc-4.5 and 1 with gcc-4.4 to compare the jemalloc.o objects. There is one obvious difference - gcc-4.4 outputs an object using global-dynamic TLS (I see lots of R_386_TLS_GD relocations .rel.text where arenas_map is accessed), and gcc-4.5 seems to be outputting an object with local-dynamic TLS access model (I see lots of R_386_TLS_LDM and R_386_TLS_LDO_32 relocations).

So, I've managed to write a small reproducer now:

#include <stdlib.h>
#include <stdio.h>

static __thread int a;
static int *c;

int main(int argc, char *argv[])
{
 a = 2;
 c = &a;
 printf("c=%d\n", *c);
}

You can save this as test.c and compile in various ways:

- "gcc -o test -fPIC test.c" - This generates an intermediate object with global-dynamic TLS (you can verify this if you split the compile/link steps and inspect the resulting object with readelf) which is passed to the linker, and works properly

- "gcc -o test -fPIC -pie test.c" - This generates a pie which works correctly

- "gcc -o test -ftls-model=local-dynamic -fPIC test.c" - This generates an intermediate object with local-dynamic TLS (again, verified by splitting the compile/link steps and inspecting the object with readelf), and the linker generates a working binary

- "gcc -o test -ftls-model=local-dynamic -fPIC -pie test.c" - The resulting binary crashes, and inspecting the resulting binary shows the same issue we have with Firefox (it's accessing the wrong location where it should be accessing the TLS variable)

Disassembling main from the broken binary shows:

000005ac <main>:
 5ac: 55 push %ebp
 5ad: 89 e5 mov %esp,%ebp
 5af: 83 e4 f0 and $0xfffffff0,%esp
 5b2: 56 push %esi
 5b3: 53 push %ebx
 5b4: 83 ec 18 sub $0x18,%esp
 5b7: e8 eb ff ff ff call 5a7 <__i686.get_pc_thunk.bx>
 5bc: 81 c3 38 1a 00 00 add $0x1a38,%ebx
 5c2: 65 a1 00 00 00 00 mov %gs:0x0,%eax
 5c8: 90 nop
 5c9: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
 5cd: 89 c6 mov %eax,%esi
 5cf: 8d 05 00 00 00 00 lea 0x0,%eax
 5d5: 8d 04 06 lea (%esi,%eax,1),%eax
 5d8: c7 00 02 00 00 00 movl $0x2,(%eax)
 5de: 65 a1 00 00 00 00 mov %gs:0x0,%eax
 5e4: 90 nop
 5e5: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
 5e9: 89 c6 mov %eax,%esi
 5eb: 8d 05 00 00 00 00 lea 0x0,%eax
 5f1: 8d 04 06 lea (%esi,%eax,1),%eax
 5f4: 89 83 30 00 00 00 mov %eax,0x30(%ebx)
 5fa: 8b 83 30 00 00 00 mov 0x30(%ebx),%eax
 600: 8b 10 mov (%eax),%edx
 602: 8d 83 f8 e6 ff ff lea -0x1908(%ebx),%eax
 608: 89 54 24 04 mov %edx,0x4(%esp)
 60c: 89 04 24 mov %eax,(%esp)
 60f: e8 70 fe ff ff call 484 <printf@plt>
 614: 83 c4 18 add $0x18,%esp
 617: 5b pop %ebx
 618: 5e pop %esi
 619: 89 ec mov %ebp,%esp
 61b: 5d pop %ebp
 61c: c3 ret
 61d: 90 nop
 61e: 90 nop
 61f: 90 nop

And from the working binary:

08048454 <main>:
 8048454: 55 push %ebp
...

Read more...

The code in elf_i386_relocate_section for R_386_TLS_LDO_32 checks info->shared where it should probably check !info->executable.

(In reply to comment #1)
> The code in elf_i386_relocate_section for R_386_TLS_LDO_32 checks info->shared
> where it should probably check !info->executable.

Thanks, Ian. I will check in a fix soon.

Colin Watson (cjwatson) on 2011-04-08
Changed in binutils (Ubuntu Natty):
assignee: Canonical Foundations Team (canonical-foundations) → Matthias Klose (doko)

CVSROOT: /cvs/src
Module name: src
Changes by: <email address hidden> 2011-04-08 16:14:49

Modified files:
 bfd : ChangeLog elf32-i386.c
 ld/testsuite : ChangeLog
 ld/testsuite/ld-i386: i386.exp
Added files:
 ld/testsuite/ld-i386: tlspie2.d tlspie2.s

Log message:
 Properly handle R_386_TLS_LDO_32 for PIE.

 bfd/

 2011-04-08 H.J. Lu <email address hidden>

 PR ld/12654
 * elf32-i386.c (elf_i386_relocate_section): Check !executable
 instead of shared for R_386_TLS_LDO_32.

 ld/testsuite/

 2011-04-08 H.J. Lu <email address hidden>

 PR ld/12654
 * ld-i386/i386.exp: Run tlspie2.

 * ld-i386/tlspie2.d: New.
 * ld-i386/tlspie2.s: Likewise.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.5293&r2=1.5294
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/elf32-i386.c.diff?cvsroot=src&r1=1.243&r2=1.244
http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/testsuite/ChangeLog.diff?cvsroot=src&r1=1.1382&r2=1.1383
http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/testsuite/ld-i386/tlspie2.d.diff?cvsroot=src&r1=NONE&r2=1.1
http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/testsuite/ld-i386/tlspie2.s.diff?cvsroot=src&r1=NONE&r2=1.1
http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/testsuite/ld-i386/i386.exp.diff?cvsroot=src&r1=1.36&r2=1.37

Thanks! That does the trick, and I have a working -pie build of firefox now too

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package binutils - 2.21.0.20110327-2ubuntu2

---------------
binutils (2.21.0.20110327-2ubuntu2) natty; urgency=low

  * Fix architecture field for binutils-gold (powerpcspe).
  * Add support for arm-*-gnueabihf targets. Closes: #621029.
  * Fix PR ld/12654: Pproperly handle R_386_TLS_LDO_32 for PIE. LP: #663294.
 -- Matthias Klose <email address hidden> Sat, 09 Apr 2011 13:35:11 +0200

Changed in binutils (Ubuntu Natty):
status: Triaged → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package firefox - 4.0+nobinonly-0ubuntu1

---------------
firefox (4.0+nobinonly-0ubuntu1) natty; urgency=low

  * Bump the version number to 4.0 final. This is just so we stop confusing
    people with rc2 in the version number, but the tarball is identical

  * Update globalmenu-extension code to 1.0
    - Drop the hacks we had to workaround the lack of menu closed signals from
      Unity. We no longer synthesize our own menu closed events, but use the
      proper event from dbusmenu.
    - In addition to this, we split the work done during menu opening in to
      2 phases - the first phase triggered off "about-to-show" and the second
      phase triggered off "opened". In the future, we should be able to delay
      the menu opening with the about-to-show handler, which is where we
      do all the work to update the menu
    - Fix LP: #755701 - When iterating over each label, check if the current
      character equals the access key (to set a flag indicating we've already
      seen it) *before* we mangle the label, rather than after it.
      This fixes an issue where we never set the flag, and subsequently see
      the mnemonics repeated multiple times in labels
    - Also remove the hack we had to add placeholder entries to empty menus
      so that we could get an about-to-show signal (now that LP: #619811
      is fixed)

  * Fix typo in German translation of static quicklist item (LP: #750220)
    - update debian/firefox-*.desktop.in
  * Re-enable -pie (LP: #663294)
   - update debian/rules
 -- Chris Coulson <email address hidden> Mon, 11 Apr 2011 01:07:23 +0100

Changed in firefox (Ubuntu Natty):
status: Triaged → Fix Released
Changed in binutils:
importance: Unknown → Medium
status: Unknown → Fix Released
bharankumar (bharan-428) wrote :

yes ,i want change my bug

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.