Activity log for bug #663294

Date Who What changed Old value New value Message
2010-10-19 14:03:09 Chris Coulson bug added bug
2010-10-19 14:12:36 Micah Gersten bug added subscriber Micah Gersten
2010-10-19 15:07:19 John Vivirito gcc-4.5 (Ubuntu): status New Confirmed
2010-10-19 18:15:11 Chris Coulson description Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) { assert(size != 0); if (size <= arena_maxclass) return (arena_malloc(choose_arena(), size, false)); else return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000", __align = 0} The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled: 0xf7ff66a0 <+0>: push %ebp 0xf7ff66a1 <+1>: mov %esp,%ebp 0xf7ff66a3 <+3>: sub $0x58,%esp 0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp) 0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx> 0xf7ff66ae <+14>: add $0x880e,%ebx 0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp) 0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi 0xf7ff66ba <+26>: mov %edi,-0x4(%ebp) 0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx) 0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88> 0xf7ff66c6 <+38>: test %esi,%esi 0xf7ff66c8 <+40>: mov $0x1,%eax 0xf7ff66cd <+45>: cmove %eax,%esi 0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi 0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128> 0xf7ff66d8 <+56>: mov %esi,%eax 0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc> 0xf7ff66df <+63>: mov %eax,%esi 0xf7ff66e1 <+65>: test %esi,%esi 0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97> 0xf7ff66e5 <+69>: mov %esi,%eax 0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx 0xf7ff66ea <+74>: mov -0x8(%ebp),%esi 0xf7ff66ed <+77>: mov -0x4(%ebp),%edi 0xf7ff66f0 <+80>: mov %ebp,%esp 0xf7ff66f2 <+82>: pop %ebp 0xf7ff66f3 <+83>: ret 0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard> 0xf7ff66fd <+93>: test %al,%al 0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38> 0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt> 0xf7ff6706 <+102>: xor %esi,%esi 0xf7ff6708 <+104>: movl $0xc,(%eax) 0xf7ff670e <+110>: mov %esi,%eax 0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx 0xf7ff6713 <+115>: mov -0x8(%ebp),%esi 0xf7ff6716 <+118>: mov -0x4(%ebp),%edi 0xf7ff6719 <+121>: mov %ebp,%esp 0xf7ff671b <+123>: pop %ebp 0xf7ff671c <+124>: ret 0xf7ff671d <+125>: lea 0x0(%esi),%esi 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff6726 <+134>: nop 0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp) 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi 0xf7ff6737 <+151>: test %edi,%edi 0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704> 0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi 0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512> 0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi 0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440> 0xf7ff6757 <+183>: sub $0x1,%esi 0xf7ff675a <+186>: mov %esi,%edx 0xf7ff675c <+188>: shr %edx 0xf7ff675e <+190>: or %esi,%edx 0xf7ff6760 <+192>: mov %edx,%eax 0xf7ff6762 <+194>: shr $0x2,%eax 0xf7ff6765 <+197>: or %edx,%eax 0xf7ff6767 <+199>: mov %eax,%edx 0xf7ff6769 <+201>: shr $0x4,%edx 0xf7ff676c <+204>: or %eax,%edx 0xf7ff676e <+206>: mov %edx,%eax 0xf7ff6770 <+208>: shr $0x8,%eax 0xf7ff6773 <+211>: or %edx,%eax 0xf7ff6775 <+213>: mov $0xffffffff,%edx 0xf7ff677a <+218>: mov %eax,%esi 0xf7ff677c <+220>: shr $0x10,%esi 0xf7ff677f <+223>: or %eax,%esi 0xf7ff6781 <+225>: add $0x1,%esi 0xf7ff6784 <+228>: mov %esi,%eax 0xf7ff6786 <+230>: shr $0x2,%eax 0xf7ff6789 <+233>: bsf %eax,%eax 0xf7ff678c <+236>: cmove %edx,%eax 0xf7ff678f <+239>: cmp $0x1,%esi 0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax 0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax 0xf7ff679d <+253>: mov %eax,-0x1c(%ebp) 0xf7ff67a0 <+256>: mov $0x2,%eax 0xf7ff67a5 <+261>: cmova %esi,%eax 0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp) 0xf7ff67ab <+267>: mov %edi,(%esp) 0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then passed to put on the stack and passed to pthread_mutex_lock: 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) {  assert(size != 0);  if (size <= arena_maxclass)   return (arena_malloc(choose_arena(), size, false));  else   return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",   __align = 0} The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled:    0xf7ff66a0 <+0>: push %ebp    0xf7ff66a1 <+1>: mov %esp,%ebp    0xf7ff66a3 <+3>: sub $0x58,%esp    0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)    0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>    0xf7ff66ae <+14>: add $0x880e,%ebx    0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)    0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi    0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)    0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)    0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>    0xf7ff66c6 <+38>: test %esi,%esi    0xf7ff66c8 <+40>: mov $0x1,%eax    0xf7ff66cd <+45>: cmove %eax,%esi    0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi    0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>    0xf7ff66d8 <+56>: mov %esi,%eax    0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>    0xf7ff66df <+63>: mov %eax,%esi    0xf7ff66e1 <+65>: test %esi,%esi    0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>    0xf7ff66e5 <+69>: mov %esi,%eax    0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx    0xf7ff66ea <+74>: mov -0x8(%ebp),%esi    0xf7ff66ed <+77>: mov -0x4(%ebp),%edi    0xf7ff66f0 <+80>: mov %ebp,%esp    0xf7ff66f2 <+82>: pop %ebp    0xf7ff66f3 <+83>: ret    0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>    0xf7ff66fd <+93>: test %al,%al    0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>    0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>    0xf7ff6706 <+102>: xor %esi,%esi    0xf7ff6708 <+104>: movl $0xc,(%eax)    0xf7ff670e <+110>: mov %esi,%eax    0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx    0xf7ff6713 <+115>: mov -0x8(%ebp),%esi    0xf7ff6716 <+118>: mov -0x4(%ebp),%edi    0xf7ff6719 <+121>: mov %ebp,%esp    0xf7ff671b <+123>: pop %ebp    0xf7ff671c <+124>: ret    0xf7ff671d <+125>: lea 0x0(%esi),%esi    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff6726 <+134>: nop    0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi    0xf7ff6737 <+151>: test %edi,%edi    0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>    0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi    0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>    0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi    0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>    0xf7ff6757 <+183>: sub $0x1,%esi    0xf7ff675a <+186>: mov %esi,%edx    0xf7ff675c <+188>: shr %edx    0xf7ff675e <+190>: or %esi,%edx    0xf7ff6760 <+192>: mov %edx,%eax    0xf7ff6762 <+194>: shr $0x2,%eax    0xf7ff6765 <+197>: or %edx,%eax    0xf7ff6767 <+199>: mov %eax,%edx    0xf7ff6769 <+201>: shr $0x4,%edx    0xf7ff676c <+204>: or %eax,%edx    0xf7ff676e <+206>: mov %edx,%eax    0xf7ff6770 <+208>: shr $0x8,%eax    0xf7ff6773 <+211>: or %edx,%eax    0xf7ff6775 <+213>: mov $0xffffffff,%edx    0xf7ff677a <+218>: mov %eax,%esi    0xf7ff677c <+220>: shr $0x10,%esi    0xf7ff677f <+223>: or %eax,%esi    0xf7ff6781 <+225>: add $0x1,%esi    0xf7ff6784 <+228>: mov %esi,%eax    0xf7ff6786 <+230>: shr $0x2,%eax    0xf7ff6789 <+233>: bsf %eax,%eax    0xf7ff678c <+236>: cmove %edx,%eax    0xf7ff678f <+239>: cmp $0x1,%esi    0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax    0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax    0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)    0xf7ff67a0 <+256>: mov $0x2,%eax    0xf7ff67a5 <+261>: cmova %esi,%eax    0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)    0xf7ff67ab <+267>: mov %edi,(%esp)    0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
2010-10-20 15:19:34 Chris Coulson bug added subscriber Kees Cook
2010-10-20 15:19:38 Chris Coulson gcc-4.5 (Ubuntu): importance Undecided High
2010-10-29 20:43:26 Reed Loden bug added subscriber Reed Loden
2010-10-31 21:29:03 Philip Muškovac bug added subscriber Philip Muškovac
2010-10-31 23:49:44 Fabien Tassin bug added subscriber Fabien Tassin
2010-11-02 22:09:48 Chris Coulson description Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) {  assert(size != 0);  if (size <= arena_maxclass)   return (arena_malloc(choose_arena(), size, false));  else   return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",   __align = 0} The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled:    0xf7ff66a0 <+0>: push %ebp    0xf7ff66a1 <+1>: mov %esp,%ebp    0xf7ff66a3 <+3>: sub $0x58,%esp    0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)    0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>    0xf7ff66ae <+14>: add $0x880e,%ebx    0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)    0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi    0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)    0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)    0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>    0xf7ff66c6 <+38>: test %esi,%esi    0xf7ff66c8 <+40>: mov $0x1,%eax    0xf7ff66cd <+45>: cmove %eax,%esi    0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi    0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>    0xf7ff66d8 <+56>: mov %esi,%eax    0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>    0xf7ff66df <+63>: mov %eax,%esi    0xf7ff66e1 <+65>: test %esi,%esi    0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>    0xf7ff66e5 <+69>: mov %esi,%eax    0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx    0xf7ff66ea <+74>: mov -0x8(%ebp),%esi    0xf7ff66ed <+77>: mov -0x4(%ebp),%edi    0xf7ff66f0 <+80>: mov %ebp,%esp    0xf7ff66f2 <+82>: pop %ebp    0xf7ff66f3 <+83>: ret    0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>    0xf7ff66fd <+93>: test %al,%al    0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>    0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>    0xf7ff6706 <+102>: xor %esi,%esi    0xf7ff6708 <+104>: movl $0xc,(%eax)    0xf7ff670e <+110>: mov %esi,%eax    0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx    0xf7ff6713 <+115>: mov -0x8(%ebp),%esi    0xf7ff6716 <+118>: mov -0x4(%ebp),%edi    0xf7ff6719 <+121>: mov %ebp,%esp    0xf7ff671b <+123>: pop %ebp    0xf7ff671c <+124>: ret    0xf7ff671d <+125>: lea 0x0(%esi),%esi    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff6726 <+134>: nop    0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi    0xf7ff6737 <+151>: test %edi,%edi    0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>    0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi    0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>    0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi    0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>    0xf7ff6757 <+183>: sub $0x1,%esi    0xf7ff675a <+186>: mov %esi,%edx    0xf7ff675c <+188>: shr %edx    0xf7ff675e <+190>: or %esi,%edx    0xf7ff6760 <+192>: mov %edx,%eax    0xf7ff6762 <+194>: shr $0x2,%eax    0xf7ff6765 <+197>: or %edx,%eax    0xf7ff6767 <+199>: mov %eax,%edx    0xf7ff6769 <+201>: shr $0x4,%edx    0xf7ff676c <+204>: or %eax,%edx    0xf7ff676e <+206>: mov %edx,%eax    0xf7ff6770 <+208>: shr $0x8,%eax    0xf7ff6773 <+211>: or %edx,%eax    0xf7ff6775 <+213>: mov $0xffffffff,%edx    0xf7ff677a <+218>: mov %eax,%esi    0xf7ff677c <+220>: shr $0x10,%esi    0xf7ff677f <+223>: or %eax,%esi    0xf7ff6781 <+225>: add $0x1,%esi    0xf7ff6784 <+228>: mov %esi,%eax    0xf7ff6786 <+230>: shr $0x2,%eax    0xf7ff6789 <+233>: bsf %eax,%eax    0xf7ff678c <+236>: cmove %edx,%eax    0xf7ff678f <+239>: cmp $0x1,%esi    0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax    0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax    0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)    0xf7ff67a0 <+256>: mov $0x2,%eax    0xf7ff67a5 <+261>: cmova %esi,%eax    0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)    0xf7ff67ab <+267>: mov %edi,(%esp)    0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) { assert(size != 0); if (size <= arena_maxclass) return (arena_malloc(choose_arena(), size, false)); else return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000", __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as the arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled: 0xf7ff66a0 <+0>: push %ebp 0xf7ff66a1 <+1>: mov %esp,%ebp 0xf7ff66a3 <+3>: sub $0x58,%esp 0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp) 0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx> 0xf7ff66ae <+14>: add $0x880e,%ebx 0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp) 0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi 0xf7ff66ba <+26>: mov %edi,-0x4(%ebp) 0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx) 0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88> 0xf7ff66c6 <+38>: test %esi,%esi 0xf7ff66c8 <+40>: mov $0x1,%eax 0xf7ff66cd <+45>: cmove %eax,%esi 0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi 0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128> 0xf7ff66d8 <+56>: mov %esi,%eax 0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc> 0xf7ff66df <+63>: mov %eax,%esi 0xf7ff66e1 <+65>: test %esi,%esi 0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97> 0xf7ff66e5 <+69>: mov %esi,%eax 0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx 0xf7ff66ea <+74>: mov -0x8(%ebp),%esi 0xf7ff66ed <+77>: mov -0x4(%ebp),%edi 0xf7ff66f0 <+80>: mov %ebp,%esp 0xf7ff66f2 <+82>: pop %ebp 0xf7ff66f3 <+83>: ret 0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard> 0xf7ff66fd <+93>: test %al,%al 0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38> 0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt> 0xf7ff6706 <+102>: xor %esi,%esi 0xf7ff6708 <+104>: movl $0xc,(%eax) 0xf7ff670e <+110>: mov %esi,%eax 0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx 0xf7ff6713 <+115>: mov -0x8(%ebp),%esi 0xf7ff6716 <+118>: mov -0x4(%ebp),%edi 0xf7ff6719 <+121>: mov %ebp,%esp 0xf7ff671b <+123>: pop %ebp 0xf7ff671c <+124>: ret 0xf7ff671d <+125>: lea 0x0(%esi),%esi 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff6726 <+134>: nop 0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp) 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi 0xf7ff6737 <+151>: test %edi,%edi 0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704> 0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi 0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512> 0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi 0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440> 0xf7ff6757 <+183>: sub $0x1,%esi 0xf7ff675a <+186>: mov %esi,%edx 0xf7ff675c <+188>: shr %edx 0xf7ff675e <+190>: or %esi,%edx 0xf7ff6760 <+192>: mov %edx,%eax 0xf7ff6762 <+194>: shr $0x2,%eax 0xf7ff6765 <+197>: or %edx,%eax 0xf7ff6767 <+199>: mov %eax,%edx 0xf7ff6769 <+201>: shr $0x4,%edx 0xf7ff676c <+204>: or %eax,%edx 0xf7ff676e <+206>: mov %edx,%eax 0xf7ff6770 <+208>: shr $0x8,%eax 0xf7ff6773 <+211>: or %edx,%eax 0xf7ff6775 <+213>: mov $0xffffffff,%edx 0xf7ff677a <+218>: mov %eax,%esi 0xf7ff677c <+220>: shr $0x10,%esi 0xf7ff677f <+223>: or %eax,%esi 0xf7ff6781 <+225>: add $0x1,%esi 0xf7ff6784 <+228>: mov %esi,%eax 0xf7ff6786 <+230>: shr $0x2,%eax 0xf7ff6789 <+233>: bsf %eax,%eax 0xf7ff678c <+236>: cmove %edx,%eax 0xf7ff678f <+239>: cmp $0x1,%esi 0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax 0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax 0xf7ff679d <+253>: mov %eax,-0x1c(%ebp) 0xf7ff67a0 <+256>: mov $0x2,%eax 0xf7ff67a5 <+261>: cmova %esi,%eax 0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp) 0xf7ff67ab <+267>: mov %edi,(%esp) 0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock: 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
2010-11-02 22:13:07 Chris Coulson description Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) { assert(size != 0); if (size <= arena_maxclass) return (arena_malloc(choose_arena(), size, false)); else return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000", __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as the arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}}, __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled: 0xf7ff66a0 <+0>: push %ebp 0xf7ff66a1 <+1>: mov %esp,%ebp 0xf7ff66a3 <+3>: sub $0x58,%esp 0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp) 0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx> 0xf7ff66ae <+14>: add $0x880e,%ebx 0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp) 0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi 0xf7ff66ba <+26>: mov %edi,-0x4(%ebp) 0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx) 0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88> 0xf7ff66c6 <+38>: test %esi,%esi 0xf7ff66c8 <+40>: mov $0x1,%eax 0xf7ff66cd <+45>: cmove %eax,%esi 0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi 0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128> 0xf7ff66d8 <+56>: mov %esi,%eax 0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc> 0xf7ff66df <+63>: mov %eax,%esi 0xf7ff66e1 <+65>: test %esi,%esi 0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97> 0xf7ff66e5 <+69>: mov %esi,%eax 0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx 0xf7ff66ea <+74>: mov -0x8(%ebp),%esi 0xf7ff66ed <+77>: mov -0x4(%ebp),%edi 0xf7ff66f0 <+80>: mov %ebp,%esp 0xf7ff66f2 <+82>: pop %ebp 0xf7ff66f3 <+83>: ret 0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard> 0xf7ff66fd <+93>: test %al,%al 0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38> 0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt> 0xf7ff6706 <+102>: xor %esi,%esi 0xf7ff6708 <+104>: movl $0xc,(%eax) 0xf7ff670e <+110>: mov %esi,%eax 0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx 0xf7ff6713 <+115>: mov -0x8(%ebp),%esi 0xf7ff6716 <+118>: mov -0x4(%ebp),%edi 0xf7ff6719 <+121>: mov %ebp,%esp 0xf7ff671b <+123>: pop %ebp 0xf7ff671c <+124>: ret 0xf7ff671d <+125>: lea 0x0(%esi),%esi 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff6726 <+134>: nop 0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp) 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi 0xf7ff6737 <+151>: test %edi,%edi 0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704> 0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi 0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512> 0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi 0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440> 0xf7ff6757 <+183>: sub $0x1,%esi 0xf7ff675a <+186>: mov %esi,%edx 0xf7ff675c <+188>: shr %edx 0xf7ff675e <+190>: or %esi,%edx 0xf7ff6760 <+192>: mov %edx,%eax 0xf7ff6762 <+194>: shr $0x2,%eax 0xf7ff6765 <+197>: or %edx,%eax 0xf7ff6767 <+199>: mov %eax,%edx 0xf7ff6769 <+201>: shr $0x4,%edx 0xf7ff676c <+204>: or %eax,%edx 0xf7ff676e <+206>: mov %edx,%eax 0xf7ff6770 <+208>: shr $0x8,%eax 0xf7ff6773 <+211>: or %edx,%eax 0xf7ff6775 <+213>: mov $0xffffffff,%edx 0xf7ff677a <+218>: mov %eax,%esi 0xf7ff677c <+220>: shr $0x10,%esi 0xf7ff677f <+223>: or %eax,%esi 0xf7ff6781 <+225>: add $0x1,%esi 0xf7ff6784 <+228>: mov %esi,%eax 0xf7ff6786 <+230>: shr $0x2,%eax 0xf7ff6789 <+233>: bsf %eax,%eax 0xf7ff678c <+236>: cmove %edx,%eax 0xf7ff678f <+239>: cmp $0x1,%esi 0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax 0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax 0xf7ff679d <+253>: mov %eax,-0x1c(%ebp) 0xf7ff67a0 <+256>: mov $0x2,%eax 0xf7ff67a5 <+261>: cmova %esi,%eax 0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp) 0xf7ff67ab <+267>: mov %edi,(%esp) 0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock: 0xf7ff6720 <+128>: mov %gs:0x0,%eax 0xf7ff672b <+139>: lea 0x0,%edx 0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) {  assert(size != 0);  if (size <= arena_maxclass)   return (arena_malloc(choose_arena(), size, false));  else   return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",   __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled:    0xf7ff66a0 <+0>: push %ebp    0xf7ff66a1 <+1>: mov %esp,%ebp    0xf7ff66a3 <+3>: sub $0x58,%esp    0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)    0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>    0xf7ff66ae <+14>: add $0x880e,%ebx    0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)    0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi    0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)    0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)    0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>    0xf7ff66c6 <+38>: test %esi,%esi    0xf7ff66c8 <+40>: mov $0x1,%eax    0xf7ff66cd <+45>: cmove %eax,%esi    0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi    0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>    0xf7ff66d8 <+56>: mov %esi,%eax    0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>    0xf7ff66df <+63>: mov %eax,%esi    0xf7ff66e1 <+65>: test %esi,%esi    0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>    0xf7ff66e5 <+69>: mov %esi,%eax    0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx    0xf7ff66ea <+74>: mov -0x8(%ebp),%esi    0xf7ff66ed <+77>: mov -0x4(%ebp),%edi    0xf7ff66f0 <+80>: mov %ebp,%esp    0xf7ff66f2 <+82>: pop %ebp    0xf7ff66f3 <+83>: ret    0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>    0xf7ff66fd <+93>: test %al,%al    0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>    0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>    0xf7ff6706 <+102>: xor %esi,%esi    0xf7ff6708 <+104>: movl $0xc,(%eax)    0xf7ff670e <+110>: mov %esi,%eax    0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx    0xf7ff6713 <+115>: mov -0x8(%ebp),%esi    0xf7ff6716 <+118>: mov -0x4(%ebp),%edi    0xf7ff6719 <+121>: mov %ebp,%esp    0xf7ff671b <+123>: pop %ebp    0xf7ff671c <+124>: ret    0xf7ff671d <+125>: lea 0x0(%esi),%esi    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff6726 <+134>: nop    0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi    0xf7ff6737 <+151>: test %edi,%edi    0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>    0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi    0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>    0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi    0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>    0xf7ff6757 <+183>: sub $0x1,%esi    0xf7ff675a <+186>: mov %esi,%edx    0xf7ff675c <+188>: shr %edx    0xf7ff675e <+190>: or %esi,%edx    0xf7ff6760 <+192>: mov %edx,%eax    0xf7ff6762 <+194>: shr $0x2,%eax    0xf7ff6765 <+197>: or %edx,%eax    0xf7ff6767 <+199>: mov %eax,%edx    0xf7ff6769 <+201>: shr $0x4,%edx    0xf7ff676c <+204>: or %eax,%edx    0xf7ff676e <+206>: mov %edx,%eax    0xf7ff6770 <+208>: shr $0x8,%eax    0xf7ff6773 <+211>: or %edx,%eax    0xf7ff6775 <+213>: mov $0xffffffff,%edx    0xf7ff677a <+218>: mov %eax,%esi    0xf7ff677c <+220>: shr $0x10,%esi    0xf7ff677f <+223>: or %eax,%esi    0xf7ff6781 <+225>: add $0x1,%esi    0xf7ff6784 <+228>: mov %esi,%eax    0xf7ff6786 <+230>: shr $0x2,%eax    0xf7ff6789 <+233>: bsf %eax,%eax    0xf7ff678c <+236>: cmove %edx,%eax    0xf7ff678f <+239>: cmp $0x1,%esi    0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax    0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax    0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)    0xf7ff67a0 <+256>: mov $0x2,%eax    0xf7ff67a5 <+261>: cmova %esi,%eax    0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)    0xf7ff67ab <+267>: mov %edi,(%esp)    0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
2010-11-02 22:19:56 Chris Coulson attachment added firefox-bin.gcc-snapshot https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721156/+files/firefox-bin.gcc-snapshot
2010-11-02 22:48:16 Chris Coulson attachment added firefox-bin.no-pie-4.5 https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721185/+files/firefox-bin.no-pie-4.5
2010-11-02 23:04:43 Chris Coulson attachment added firefox-4.0-bin-4.4 https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721188/+files/firefox-4.0-bin-4.4
2010-11-02 23:04:59 Chris Coulson bug added subscriber Matthias Klose
2010-11-02 23:05:10 Chris Coulson summary Firefox built with gcc-4.5 is a non-starter on i386 Firefox built with gcc-4.5 is a non-starter on i386 with -pie
2010-11-03 14:12:31 Launchpad Janitor branch linked lp:firefox
2010-11-04 15:34:32 Launchpad Janitor branch linked lp:~mozillateam/xulrunner/xulrunner-2.0.head
2010-11-05 11:21:53 Launchpad Janitor branch linked lp:firefox/3.6
2010-11-07 23:00:51 Launchpad Janitor branch linked lp:xulrunner/1.9.2
2010-11-16 00:14:28 Launchpad Janitor branch linked lp:firefox/4.0
2010-11-17 22:20:52 Launchpad Janitor branch linked lp:ubuntu/firefox
2010-11-19 19:36:51 Launchpad Janitor branch linked lp:ubuntu/xulrunner-2.0
2010-11-22 21:12:40 Launchpad Janitor branch linked lp:ubuntu/xulrunner-1.9.2
2011-01-26 03:29:59 jhonatas matos gcc-4.5 (Ubuntu): status Confirmed Fix Committed
2011-01-26 03:30:24 jhonatas matos branch unlinked lp:ubuntu/firefox
2011-01-26 10:08:28 Chris Coulson gcc-4.5 (Ubuntu): status Fix Committed Triaged
2011-02-15 00:25:38 Kees Cook nominated for series Ubuntu Natty
2011-02-15 00:25:38 Kees Cook bug task added gcc-4.5 (Ubuntu Natty)
2011-02-15 00:25:46 Kees Cook gcc-4.5 (Ubuntu Natty): milestone natty-alpha-3
2011-02-15 00:25:58 Kees Cook gcc-4.5 (Ubuntu Natty): assignee Canonical Desktop Team (canonical-desktop-team)
2011-02-15 00:27:17 Kees Cook gcc-4.5 (Ubuntu Natty): assignee Canonical Desktop Team (canonical-desktop-team) Canonical Foundations Team (canonical-foundations)
2011-02-15 00:30:38 Kees Cook bug task added firefox (Ubuntu)
2011-02-15 00:31:00 Kees Cook firefox (Ubuntu Natty): status New Triaged
2011-02-15 00:31:07 Kees Cook firefox (Ubuntu Natty): importance Undecided High
2011-02-15 00:31:18 Kees Cook firefox (Ubuntu Natty): assignee Canonical Desktop Team (canonical-desktop-team)
2011-02-15 00:31:21 Kees Cook firefox (Ubuntu Natty): milestone natty-alpha-3
2011-02-15 07:38:23 Martin Pitt firefox (Ubuntu Natty): assignee Canonical Desktop Team (canonical-desktop-team) Chris Coulson (chrisccoulson)
2011-02-25 13:23:49 Martin Pitt gcc-4.5 (Ubuntu Natty): milestone natty-alpha-3 ubuntu-11.04-beta-1
2011-02-25 13:23:55 Martin Pitt firefox (Ubuntu Natty): milestone natty-alpha-3 ubuntu-11.04-beta-1
2011-02-25 13:25:11 Martin Pitt firefox (Ubuntu Natty): importance High Medium
2011-02-25 13:25:51 Martin Pitt gcc-4.5 (Ubuntu Natty): importance High Medium
2011-02-25 15:43:23 Marc Deslauriers bug added subscriber Marc Deslauriers
2011-03-22 17:19:21 Martin Pitt bug added subscriber Martin Pitt
2011-03-22 18:36:40 Micah Gersten tags regression-release
2011-03-22 18:36:48 Micah Gersten tags regression-release natty regression-release
2011-04-04 16:55:12 Kate Stewart firefox (Ubuntu Natty): milestone ubuntu-11.04-beta-1 ubuntu-11.04-beta-2
2011-04-04 16:55:16 Kate Stewart gcc-4.5 (Ubuntu Natty): milestone ubuntu-11.04-beta-1 ubuntu-11.04-beta-2
2011-04-07 12:01:35 Chris Coulson affects gcc-4.5 (Ubuntu Natty) binutils (Ubuntu Natty)
2011-04-07 12:07:37 Chris Coulson attachment added Output of readelf -a for test.o with local-dynamic TLS https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993653/+files/readelf.local-dynamic
2011-04-07 12:08:49 Chris Coulson attachment added Output of readelf -a for test.o with global-dynamic TLS https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993654/+files/readelf.global-dynamic
2011-04-07 12:10:36 Chris Coulson attachment added jemalloc.o.info.tar.gz https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993667/+files/jemalloc.o.info.tar.gz
2011-04-07 13:33:31 Martin Pitt description Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) {  assert(size != 0);  if (size <= arena_maxclass)   return (arena_malloc(choose_arena(), size, false));  else   return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",   __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled:    0xf7ff66a0 <+0>: push %ebp    0xf7ff66a1 <+1>: mov %esp,%ebp    0xf7ff66a3 <+3>: sub $0x58,%esp    0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)    0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>    0xf7ff66ae <+14>: add $0x880e,%ebx    0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)    0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi    0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)    0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)    0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>    0xf7ff66c6 <+38>: test %esi,%esi    0xf7ff66c8 <+40>: mov $0x1,%eax    0xf7ff66cd <+45>: cmove %eax,%esi    0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi    0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>    0xf7ff66d8 <+56>: mov %esi,%eax    0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>    0xf7ff66df <+63>: mov %eax,%esi    0xf7ff66e1 <+65>: test %esi,%esi    0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>    0xf7ff66e5 <+69>: mov %esi,%eax    0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx    0xf7ff66ea <+74>: mov -0x8(%ebp),%esi    0xf7ff66ed <+77>: mov -0x4(%ebp),%edi    0xf7ff66f0 <+80>: mov %ebp,%esp    0xf7ff66f2 <+82>: pop %ebp    0xf7ff66f3 <+83>: ret    0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>    0xf7ff66fd <+93>: test %al,%al    0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>    0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>    0xf7ff6706 <+102>: xor %esi,%esi    0xf7ff6708 <+104>: movl $0xc,(%eax)    0xf7ff670e <+110>: mov %esi,%eax    0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx    0xf7ff6713 <+115>: mov -0x8(%ebp),%esi    0xf7ff6716 <+118>: mov -0x4(%ebp),%edi    0xf7ff6719 <+121>: mov %ebp,%esp    0xf7ff671b <+123>: pop %ebp    0xf7ff671c <+124>: ret    0xf7ff671d <+125>: lea 0x0(%esi),%esi    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff6726 <+134>: nop    0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi    0xf7ff6737 <+151>: test %edi,%edi    0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>    0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi    0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>    0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi    0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>    0xf7ff6757 <+183>: sub $0x1,%esi    0xf7ff675a <+186>: mov %esi,%edx    0xf7ff675c <+188>: shr %edx    0xf7ff675e <+190>: or %esi,%edx    0xf7ff6760 <+192>: mov %edx,%eax    0xf7ff6762 <+194>: shr $0x2,%eax    0xf7ff6765 <+197>: or %edx,%eax    0xf7ff6767 <+199>: mov %eax,%edx    0xf7ff6769 <+201>: shr $0x4,%edx    0xf7ff676c <+204>: or %eax,%edx    0xf7ff676e <+206>: mov %edx,%eax    0xf7ff6770 <+208>: shr $0x8,%eax    0xf7ff6773 <+211>: or %edx,%eax    0xf7ff6775 <+213>: mov $0xffffffff,%edx    0xf7ff677a <+218>: mov %eax,%esi    0xf7ff677c <+220>: shr $0x10,%esi    0xf7ff677f <+223>: or %eax,%esi    0xf7ff6781 <+225>: add $0x1,%esi    0xf7ff6784 <+228>: mov %esi,%eax    0xf7ff6786 <+230>: shr $0x2,%eax    0xf7ff6789 <+233>: bsf %eax,%eax    0xf7ff678c <+236>: cmove %edx,%eax    0xf7ff678f <+239>: cmp $0x1,%esi    0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax    0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax    0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)    0xf7ff67a0 <+256>: mov $0x2,%eax    0xf7ff67a5 <+261>: cmova %esi,%eax    0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)    0xf7ff67ab <+267>: mov %edi,(%esp)    0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 NOTE: See comment 17 for a test case, and following comments for analysis. Binary package hint: gcc-4.5 I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only. It is hanging here: #0 0xffffe405 in __kernel_vsyscall () #1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142 #2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0 #3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61 #4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417 #5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766 #6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845 #7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857 #8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873 #9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76 #10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39 #11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1 #12 0xf3b48f0d in ?? () from /lib/libselinux.so.1 #13 0xf3b37158 in _init () from /lib/libselinux.so.1 #14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70 #15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134 #16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2 ...which is happening even before entering main() If I break in pthread_mutex_lock, and look at the contents of the mutex, I see: $61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} ...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc(): static inline void * imalloc(size_t size) {  assert(size != 0);  if (size <= arena_maxclass)   return (arena_malloc(choose_arena(), size, false));  else   return (huge_malloc(size, false)); Note that in this case, size < arena_maxclass. choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see: (gdb) print arenas[0] $3 = (arena_t *) 0xf394e040 (gdb) print &arenas[0] $4 = (arena_t **) 0xf394e000 (gdb) print arenas[0]->lock $5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",   __align = 0} (gdb) print arenas_map $6 = (arena_t *) 0xf394e040 (gdb) print &arenas_map $7 = (arena_t **) 0xf3a4faac The mutex looks correctly initialized here, and is unlocked. If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely: (gdb) print arena $6 = (arena_t *) 0xf3a4fab0 ...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc Also: (gdb) print arena->lock $8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},   __size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728} So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer; (gdb) info registers eax 0xf3a4fab0 -207291728 ecx 0x3 3 edx 0x0 0 ebx 0xf7ffeebc -134222148 esp 0xffffd3d4 0xffffd3d4 ebp 0xffffd42c 0xffffd42c esi 0x160 352 edi 0xf3a4fab0 -207291728 eip 0xf7ff673f 0xf7ff673f <malloc+159> eflags 0x282 [ SF IF ] cs 0x23 35 ss 0x2b 43 ds 0x2b 43 es 0x2b 43 fs 0x0 0 gs 0x63 99 Here is the interesting part of malloc disassembled:    0xf7ff66a0 <+0>: push %ebp    0xf7ff66a1 <+1>: mov %esp,%ebp    0xf7ff66a3 <+3>: sub $0x58,%esp    0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)    0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>    0xf7ff66ae <+14>: add $0x880e,%ebx    0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)    0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi    0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)    0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)    0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>    0xf7ff66c6 <+38>: test %esi,%esi    0xf7ff66c8 <+40>: mov $0x1,%eax    0xf7ff66cd <+45>: cmove %eax,%esi    0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi    0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>    0xf7ff66d8 <+56>: mov %esi,%eax    0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>    0xf7ff66df <+63>: mov %eax,%esi    0xf7ff66e1 <+65>: test %esi,%esi    0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>    0xf7ff66e5 <+69>: mov %esi,%eax    0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx    0xf7ff66ea <+74>: mov -0x8(%ebp),%esi    0xf7ff66ed <+77>: mov -0x4(%ebp),%edi    0xf7ff66f0 <+80>: mov %ebp,%esp    0xf7ff66f2 <+82>: pop %ebp    0xf7ff66f3 <+83>: ret    0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>    0xf7ff66fd <+93>: test %al,%al    0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>    0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>    0xf7ff6706 <+102>: xor %esi,%esi    0xf7ff6708 <+104>: movl $0xc,(%eax)    0xf7ff670e <+110>: mov %esi,%eax    0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx    0xf7ff6713 <+115>: mov -0x8(%ebp),%esi    0xf7ff6716 <+118>: mov -0x4(%ebp),%edi    0xf7ff6719 <+121>: mov %ebp,%esp    0xf7ff671b <+123>: pop %ebp    0xf7ff671c <+124>: ret    0xf7ff671d <+125>: lea 0x0(%esi),%esi    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff6726 <+134>: nop    0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi    0xf7ff6737 <+151>: test %edi,%edi    0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>    0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi    0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>    0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi    0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>    0xf7ff6757 <+183>: sub $0x1,%esi    0xf7ff675a <+186>: mov %esi,%edx    0xf7ff675c <+188>: shr %edx    0xf7ff675e <+190>: or %esi,%edx    0xf7ff6760 <+192>: mov %edx,%eax    0xf7ff6762 <+194>: shr $0x2,%eax    0xf7ff6765 <+197>: or %edx,%eax    0xf7ff6767 <+199>: mov %eax,%edx    0xf7ff6769 <+201>: shr $0x4,%edx    0xf7ff676c <+204>: or %eax,%edx    0xf7ff676e <+206>: mov %edx,%eax    0xf7ff6770 <+208>: shr $0x8,%eax    0xf7ff6773 <+211>: or %edx,%eax    0xf7ff6775 <+213>: mov $0xffffffff,%edx    0xf7ff677a <+218>: mov %eax,%esi    0xf7ff677c <+220>: shr $0x10,%esi    0xf7ff677f <+223>: or %eax,%esi    0xf7ff6781 <+225>: add $0x1,%esi    0xf7ff6784 <+228>: mov %esi,%eax    0xf7ff6786 <+230>: shr $0x2,%eax    0xf7ff6789 <+233>: bsf %eax,%eax    0xf7ff678c <+236>: cmove %edx,%eax    0xf7ff678f <+239>: cmp $0x1,%esi    0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax    0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax    0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)    0xf7ff67a0 <+256>: mov $0x2,%eax    0xf7ff67a5 <+261>: cmova %esi,%eax    0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)    0xf7ff67ab <+267>: mov %edi,(%esp)    0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt> So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t). From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:    0xf7ff6720 <+128>: mov %gs:0x0,%eax    0xf7ff672b <+139>: lea 0x0,%edx    0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored: (gdb) print *0xf3a4faac $10 = −208347070 ...which is 0xf394e040 I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
2011-04-08 10:21:09 Matthias Klose bug watch added http://sourceware.org/bugzilla/show_bug.cgi?id=12654
2011-04-08 10:21:09 Matthias Klose bug task added binutils
2011-04-08 14:42:28 Colin Watson binutils (Ubuntu Natty): assignee Canonical Foundations Team (canonical-foundations) Matthias Klose (doko)
2011-04-09 11:50:12 Launchpad Janitor binutils (Ubuntu Natty): status Triaged Fix Released
2011-04-09 12:19:14 Launchpad Janitor branch linked lp:ubuntu/binutils
2011-04-11 00:25:11 Launchpad Janitor firefox (Ubuntu Natty): status Triaged Fix Released
2011-04-11 01:20:34 Launchpad Janitor branch linked lp:ubuntu/firefox
2011-04-11 03:31:14 Launchpad Janitor branch linked lp:debian/binutils
2011-05-25 19:24:00 Bug Watch Updater binutils: status Unknown Fix Released
2011-05-25 19:24:00 Bug Watch Updater binutils: importance Unknown Medium