2010-10-19 14:03:09 |
Chris Coulson |
bug |
|
|
added bug |
2010-10-19 14:12:36 |
Micah Gersten |
bug |
|
|
added subscriber Micah Gersten |
2010-10-19 15:07:19 |
John Vivirito |
gcc-4.5 (Ubuntu): status |
New |
Confirmed |
|
2010-10-19 18:15:11 |
Chris Coulson |
description |
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then passed to put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 |
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
|
2010-10-20 15:19:34 |
Chris Coulson |
bug |
|
|
added subscriber Kees Cook |
2010-10-20 15:19:38 |
Chris Coulson |
gcc-4.5 (Ubuntu): importance |
Undecided |
High |
|
2010-10-29 20:43:26 |
Reed Loden |
bug |
|
|
added subscriber Reed Loden |
2010-10-31 21:29:03 |
Philip Muškovac |
bug |
|
|
added subscriber Philip Muškovac |
2010-10-31 23:49:44 |
Fabien Tassin |
bug |
|
|
added subscriber Fabien Tassin |
2010-11-02 22:09:48 |
Chris Coulson |
description |
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas[0], which is a pointer to a struct arena_t (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas[0], which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf394e000 in to the edi register, which is where the valid pointer is currently stored
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as the arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:
(gdb) print *0xf3a4faac
$10 = −208347070
...which is 0xf394e040
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
|
2010-11-02 22:13:07 |
Chris Coulson |
description |
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as the arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:
(gdb) print *0xf3a4faac
$10 = −208347070
...which is 0xf394e040
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:
(gdb) print *0xf3a4faac
$10 = −208347070
...which is 0xf394e040
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
|
2010-11-02 22:19:56 |
Chris Coulson |
attachment added |
|
firefox-bin.gcc-snapshot https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721156/+files/firefox-bin.gcc-snapshot |
|
2010-11-02 22:48:16 |
Chris Coulson |
attachment added |
|
firefox-bin.no-pie-4.5 https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721185/+files/firefox-bin.no-pie-4.5 |
|
2010-11-02 23:04:43 |
Chris Coulson |
attachment added |
|
firefox-4.0-bin-4.4 https://bugs.launchpad.net/ubuntu/+source/gcc-4.5/+bug/663294/+attachment/1721188/+files/firefox-4.0-bin-4.4 |
|
2010-11-02 23:04:59 |
Chris Coulson |
bug |
|
|
added subscriber Matthias Klose |
2010-11-02 23:05:10 |
Chris Coulson |
summary |
Firefox built with gcc-4.5 is a non-starter on i386 |
Firefox built with gcc-4.5 is a non-starter on i386 with -pie |
|
2010-11-03 14:12:31 |
Launchpad Janitor |
branch linked |
|
lp:firefox |
|
2010-11-04 15:34:32 |
Launchpad Janitor |
branch linked |
|
lp:~mozillateam/xulrunner/xulrunner-2.0.head |
|
2010-11-05 11:21:53 |
Launchpad Janitor |
branch linked |
|
lp:firefox/3.6 |
|
2010-11-07 23:00:51 |
Launchpad Janitor |
branch linked |
|
lp:xulrunner/1.9.2 |
|
2010-11-16 00:14:28 |
Launchpad Janitor |
branch linked |
|
lp:firefox/4.0 |
|
2010-11-17 22:20:52 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/firefox |
|
2010-11-19 19:36:51 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/xulrunner-2.0 |
|
2010-11-22 21:12:40 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/xulrunner-1.9.2 |
|
2011-01-26 03:29:59 |
jhonatas matos |
gcc-4.5 (Ubuntu): status |
Confirmed |
Fix Committed |
|
2011-01-26 03:30:24 |
jhonatas matos |
branch unlinked |
lp:ubuntu/firefox |
|
|
2011-01-26 10:08:28 |
Chris Coulson |
gcc-4.5 (Ubuntu): status |
Fix Committed |
Triaged |
|
2011-02-15 00:25:38 |
Kees Cook |
nominated for series |
|
Ubuntu Natty |
|
2011-02-15 00:25:38 |
Kees Cook |
bug task added |
|
gcc-4.5 (Ubuntu Natty) |
|
2011-02-15 00:25:46 |
Kees Cook |
gcc-4.5 (Ubuntu Natty): milestone |
|
natty-alpha-3 |
|
2011-02-15 00:25:58 |
Kees Cook |
gcc-4.5 (Ubuntu Natty): assignee |
|
Canonical Desktop Team (canonical-desktop-team) |
|
2011-02-15 00:27:17 |
Kees Cook |
gcc-4.5 (Ubuntu Natty): assignee |
Canonical Desktop Team (canonical-desktop-team) |
Canonical Foundations Team (canonical-foundations) |
|
2011-02-15 00:30:38 |
Kees Cook |
bug task added |
|
firefox (Ubuntu) |
|
2011-02-15 00:31:00 |
Kees Cook |
firefox (Ubuntu Natty): status |
New |
Triaged |
|
2011-02-15 00:31:07 |
Kees Cook |
firefox (Ubuntu Natty): importance |
Undecided |
High |
|
2011-02-15 00:31:18 |
Kees Cook |
firefox (Ubuntu Natty): assignee |
|
Canonical Desktop Team (canonical-desktop-team) |
|
2011-02-15 00:31:21 |
Kees Cook |
firefox (Ubuntu Natty): milestone |
|
natty-alpha-3 |
|
2011-02-15 07:38:23 |
Martin Pitt |
firefox (Ubuntu Natty): assignee |
Canonical Desktop Team (canonical-desktop-team) |
Chris Coulson (chrisccoulson) |
|
2011-02-25 13:23:49 |
Martin Pitt |
gcc-4.5 (Ubuntu Natty): milestone |
natty-alpha-3 |
ubuntu-11.04-beta-1 |
|
2011-02-25 13:23:55 |
Martin Pitt |
firefox (Ubuntu Natty): milestone |
natty-alpha-3 |
ubuntu-11.04-beta-1 |
|
2011-02-25 13:25:11 |
Martin Pitt |
firefox (Ubuntu Natty): importance |
High |
Medium |
|
2011-02-25 13:25:51 |
Martin Pitt |
gcc-4.5 (Ubuntu Natty): importance |
High |
Medium |
|
2011-02-25 15:43:23 |
Marc Deslauriers |
bug |
|
|
added subscriber Marc Deslauriers |
2011-03-22 17:19:21 |
Martin Pitt |
bug |
|
|
added subscriber Martin Pitt |
2011-03-22 18:36:40 |
Micah Gersten |
tags |
|
regression-release |
|
2011-03-22 18:36:48 |
Micah Gersten |
tags |
regression-release |
natty regression-release |
|
2011-04-04 16:55:12 |
Kate Stewart |
firefox (Ubuntu Natty): milestone |
ubuntu-11.04-beta-1 |
ubuntu-11.04-beta-2 |
|
2011-04-04 16:55:16 |
Kate Stewart |
gcc-4.5 (Ubuntu Natty): milestone |
ubuntu-11.04-beta-1 |
ubuntu-11.04-beta-2 |
|
2011-04-07 12:01:35 |
Chris Coulson |
affects |
gcc-4.5 (Ubuntu Natty) |
binutils (Ubuntu Natty) |
|
2011-04-07 12:07:37 |
Chris Coulson |
attachment added |
|
Output of readelf -a for test.o with local-dynamic TLS https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993653/+files/readelf.local-dynamic |
|
2011-04-07 12:08:49 |
Chris Coulson |
attachment added |
|
Output of readelf -a for test.o with global-dynamic TLS https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993654/+files/readelf.global-dynamic |
|
2011-04-07 12:10:36 |
Chris Coulson |
attachment added |
|
jemalloc.o.info.tar.gz https://bugs.launchpad.net/ubuntu/natty/+source/binutils/+bug/663294/+attachment/1993667/+files/jemalloc.o.info.tar.gz |
|
2011-04-07 13:33:31 |
Martin Pitt |
description |
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:
(gdb) print *0xf3a4faac
$10 = −208347070
...which is 0xf394e040
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4
|
NOTE: See comment 17 for a test case, and following comments for analysis.
Binary package hint: gcc-4.5
I noticed this first last week when I tried to do a profile-guided build of Firefox with gcc-4.5 in maverick, and the build was hanging in my PPA. However, I've now recreated this locally on a normal build with gcc-4.5 (in maverick and natty) - on i386 only.
It is hanging here:
#0 0xffffe405 in __kernel_vsyscall ()
#1 0xf7fb1169 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0xf7fac5cb in _L_lock_748 () from /lib/libpthread.so.0
#3 0xf7fac3f1 in __pthread_mutex_lock (mutex=0xf3a4fab0) at pthread_mutex_lock.c:61
#4 0xf7ff67b3 in malloc_mutex_lock (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:1417
#5 arena_malloc_small (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3766
#6 arena_malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3845
#7 imalloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:3857
#8 malloc (size=<value optimized out>) at /home/chrisccoulson/src/firefox-4.0/firefox-4.0-4.0~b8~hg20101012r55334+nobinonly/build-tree/mozilla/memory/jemalloc/jemalloc.c:5873
#9 0xf40bf11f in __fopen_internal (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r", is32=0) at iofopen.c:76
#10 0xf40c17bc in _IO_fopen64 (filename=0xf3b49522 "/proc/filesystems", mode=0xf3b49417 "r") at iofopen64.c:39
#11 0xf3b3f3ca in ?? () from /lib/libselinux.so.1
#12 0xf3b48f0d in ?? () from /lib/libselinux.so.1
#13 0xf3b37158 in _init () from /lib/libselinux.so.1
#14 0xf7fdf70c in call_init (l=<value optimized out>, argc=<value optimized out>, argv=0xffffd5e4, env=0xffffd5ec) at dl-init.c:70
#15 0xf7fdf829 in _dl_init (main_map=0xf7fee918, argc=<value optimized out>, argv=<value optimized out>, env=0xffffd5ec) at dl-init.c:134
#16 0xf7fd188f in _dl_start_user () from /lib/ld-linux.so.2
...which is happening even before entering main()
If I break in pthread_mutex_lock, and look at the contents of the mutex, I see:
$61 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
...which is clearly full of garbage. Initially, I thought that perhaps the mutex wasn't being initialized - but it is. The interesting bit is happening here, in imalloc():
static inline void *
imalloc(size_t size)
{
assert(size != 0);
if (size <= arena_maxclass)
return (arena_malloc(choose_arena(), size, false));
else
return (huge_malloc(size, false));
Note that in this case, size < arena_maxclass.
choose_arena() is an inline that returns arenas_map, which is a pointer stored in TLS to a struct arena_t, and was previously initialized with the contents of arenas[0] (note that firefox is using only a single arena). If I break inside choose_arena(), and inspect some values, I see:
(gdb) print arenas[0]
$3 = (arena_t *) 0xf394e040
(gdb) print &arenas[0]
$4 = (arena_t **) 0xf394e000
(gdb) print arenas[0]->lock
$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 3, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 12 times>, "\003\000\000\000\000\000\000\000\000\000\000",
__align = 0}
(gdb) print arenas_map
$6 = (arena_t *) 0xf394e040
(gdb) print &arenas_map
$7 = (arena_t **) 0xf3a4faac
The mutex looks correctly initialized here, and is unlocked.
If I step through a few more instructions in to arena_malloc(), then the value of arena (which was returned by choose_arena), is something else entirely:
(gdb) print arena
$6 = (arena_t *) 0xf3a4fab0
...this should be the same as arenas_map, which is what the C code suggests was returned by choose_arena and passed to arena_malloc
Also:
(gdb) print arena->lock
$8 = {__data = {__lock = -207291728, __count = 4087676744, __owner = -207291728, __kind = 0, __nusers = 4294960128, {__spins = 36645888, __list = {__next = 0x22f2c00}}},
__size = "\260\372\244\363H\377\244\363\260\372\244\363\000\000\000\000\000\344\377\377\000,/\002", __align = -207291728}
So, what is happening here? If I look at the registers, I see that eax and edi both contain this bogus pointer;
(gdb) info registers
eax 0xf3a4fab0 -207291728
ecx 0x3 3
edx 0x0 0
ebx 0xf7ffeebc -134222148
esp 0xffffd3d4 0xffffd3d4
ebp 0xffffd42c 0xffffd42c
esi 0x160 352
edi 0xf3a4fab0 -207291728
eip 0xf7ff673f 0xf7ff673f <malloc+159>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
Here is the interesting part of malloc disassembled:
0xf7ff66a0 <+0>: push %ebp
0xf7ff66a1 <+1>: mov %esp,%ebp
0xf7ff66a3 <+3>: sub $0x58,%esp
0xf7ff66a6 <+6>: mov %ebx,-0xc(%ebp)
0xf7ff66a9 <+9>: call 0xf7ff0917 <__i686.get_pc_thunk.bx>
0xf7ff66ae <+14>: add $0x880e,%ebx
0xf7ff66b4 <+20>: mov %esi,-0x8(%ebp)
0xf7ff66b7 <+23>: mov 0x8(%ebp),%esi
0xf7ff66ba <+26>: mov %edi,-0x4(%ebp)
0xf7ff66bd <+29>: cmpb $0x0,0x151c(%ebx)
0xf7ff66c4 <+36>: je 0xf7ff66f8 <malloc+88>
0xf7ff66c6 <+38>: test %esi,%esi
0xf7ff66c8 <+40>: mov $0x1,%eax
0xf7ff66cd <+45>: cmove %eax,%esi
0xf7ff66d0 <+48>: cmp 0x480(%ebx),%esi
0xf7ff66d6 <+54>: jbe 0xf7ff6720 <malloc+128>
0xf7ff66d8 <+56>: mov %esi,%eax
0xf7ff66da <+58>: call 0xf7ff5980 <huge_malloc>
0xf7ff66df <+63>: mov %eax,%esi
0xf7ff66e1 <+65>: test %esi,%esi
0xf7ff66e3 <+67>: je 0xf7ff6701 <malloc+97>
0xf7ff66e5 <+69>: mov %esi,%eax
0xf7ff66e7 <+71>: mov -0xc(%ebp),%ebx
0xf7ff66ea <+74>: mov -0x8(%ebp),%esi
0xf7ff66ed <+77>: mov -0x4(%ebp),%edi
0xf7ff66f0 <+80>: mov %ebp,%esp
0xf7ff66f2 <+82>: pop %ebp
0xf7ff66f3 <+83>: ret
0xf7ff66f4 <+84>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff66f8 <+88>: call 0xf7ff45c0 <malloc_init_hard>
0xf7ff66fd <+93>: test %al,%al
0xf7ff66ff <+95>: je 0xf7ff66c6 <malloc+38>
0xf7ff6701 <+97>: call 0xf7ff0530 <__errno_location@plt>
0xf7ff6706 <+102>: xor %esi,%esi
0xf7ff6708 <+104>: movl $0xc,(%eax)
0xf7ff670e <+110>: mov %esi,%eax
0xf7ff6710 <+112>: mov -0xc(%ebp),%ebx
0xf7ff6713 <+115>: mov -0x8(%ebp),%esi
0xf7ff6716 <+118>: mov -0x4(%ebp),%edi
0xf7ff6719 <+121>: mov %ebp,%esp
0xf7ff671b <+123>: pop %ebp
0xf7ff671c <+124>: ret
0xf7ff671d <+125>: lea 0x0(%esi),%esi
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff6726 <+134>: nop
0xf7ff6727 <+135>: lea 0x0(%esi,%eiz,1),%esi
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6731 <+145>: mov %edx,-0x1c(%ebp)
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
0xf7ff6737 <+151>: test %edi,%edi
0xf7ff6739 <+153>: je 0xf7ff6960 <malloc+704>
0xf7ff673f <+159>: cmp 0x14f8(%ebx),%esi
0xf7ff6745 <+165>: ja 0xf7ff68a0 <malloc+512>
0xf7ff674b <+171>: cmp 0x14f0(%ebx),%esi
0xf7ff6751 <+177>: jae 0xf7ff6858 <malloc+440>
0xf7ff6757 <+183>: sub $0x1,%esi
0xf7ff675a <+186>: mov %esi,%edx
0xf7ff675c <+188>: shr %edx
0xf7ff675e <+190>: or %esi,%edx
0xf7ff6760 <+192>: mov %edx,%eax
0xf7ff6762 <+194>: shr $0x2,%eax
0xf7ff6765 <+197>: or %edx,%eax
0xf7ff6767 <+199>: mov %eax,%edx
0xf7ff6769 <+201>: shr $0x4,%edx
0xf7ff676c <+204>: or %eax,%edx
0xf7ff676e <+206>: mov %edx,%eax
0xf7ff6770 <+208>: shr $0x8,%eax
0xf7ff6773 <+211>: or %edx,%eax
0xf7ff6775 <+213>: mov $0xffffffff,%edx
0xf7ff677a <+218>: mov %eax,%esi
0xf7ff677c <+220>: shr $0x10,%esi
0xf7ff677f <+223>: or %eax,%esi
0xf7ff6781 <+225>: add $0x1,%esi
0xf7ff6784 <+228>: mov %esi,%eax
0xf7ff6786 <+230>: shr $0x2,%eax
0xf7ff6789 <+233>: bsf %eax,%eax
0xf7ff678c <+236>: cmove %edx,%eax
0xf7ff678f <+239>: cmp $0x1,%esi
0xf7ff6792 <+242>: lea 0x9(%eax,%eax,8),%eax
0xf7ff6796 <+246>: lea 0x94(%edi,%eax,8),%eax
0xf7ff679d <+253>: mov %eax,-0x1c(%ebp)
0xf7ff67a0 <+256>: mov $0x2,%eax
0xf7ff67a5 <+261>: cmova %esi,%eax
0xf7ff67a8 <+264>: mov %eax,-0x20(%ebp)
0xf7ff67ab <+267>: mov %edi,(%esp)
0xf7ff67ae <+270>: call 0xf7ff0620 <pthread_mutex_lock@plt>
So, the contents of edi (which seems to contain our pointer to the bogus arena_t) is what is passed to pthread_mutex_lock (which makes sense - the lock is the first entry in struct arena_t).
From 0xf7ff6720 <+128> is happening inside choose_arena(). I'm not sure what the following instruction sequence is doing, but it's loading the wrong value in to edi, which is then put on the stack and passed to pthread_mutex_lock:
0xf7ff6720 <+128>: mov %gs:0x0,%eax
0xf7ff672b <+139>: lea 0x0,%edx
0xf7ff6734 <+148>: mov (%edx,%eax,1),%edi
I would have thought that this sequence would be loading the contents of 0xf3a4faac in to the edi register, which is where the valid pointer is currently stored:
(gdb) print *0xf3a4faac
$10 = −208347070
...which is 0xf394e040
I'm not sure what the problem is, but I can't see anything obviously wrong with jemalloc, and this worked just fine in gcc-4.4 |
|
2011-04-08 10:21:09 |
Matthias Klose |
bug watch added |
|
http://sourceware.org/bugzilla/show_bug.cgi?id=12654 |
|
2011-04-08 10:21:09 |
Matthias Klose |
bug task added |
|
binutils |
|
2011-04-08 14:42:28 |
Colin Watson |
binutils (Ubuntu Natty): assignee |
Canonical Foundations Team (canonical-foundations) |
Matthias Klose (doko) |
|
2011-04-09 11:50:12 |
Launchpad Janitor |
binutils (Ubuntu Natty): status |
Triaged |
Fix Released |
|
2011-04-09 12:19:14 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/binutils |
|
2011-04-11 00:25:11 |
Launchpad Janitor |
firefox (Ubuntu Natty): status |
Triaged |
Fix Released |
|
2011-04-11 01:20:34 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/firefox |
|
2011-04-11 03:31:14 |
Launchpad Janitor |
branch linked |
|
lp:debian/binutils |
|
2011-05-25 19:24:00 |
Bug Watch Updater |
binutils: status |
Unknown |
Fix Released |
|
2011-05-25 19:24:00 |
Bug Watch Updater |
binutils: importance |
Unknown |
Medium |
|