segv when certain packages install using qemu-static-arm

Bug #816791 reported by Tom Gall
This bug affects 2 people
Affects: Linaro QEMU
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

When qemu is used as part of live-build-3 to cross-build the LEB image, a number of the mono packages (and a few others) cause qemu-static-arm to segv during the configuration step of the package, as it is being installed inside an ARM chroot.

Reproducing this for testing purposes is a tad difficult, so I'll likely need to walk you through it and supply you with the chroot (which is quite large).

Tom Gall (tom-gall) wrote :
summary: - segv when certain packages install via qemu-static-arm
+ segv when certain packages install using qemu-static-arm
Ricardo Salveti (rsalveti) wrote :

Do you have at least some package names so we can try to reproduce it with some easier tool, like rootstock?

As rootstock also uses qemu-static-arm, it should be quite easy to reproduce with it, by calling it like this:
rootstock --dist natty --fqdn panda-natty --login ubuntu --password ubuntu --serial ttyO2 --components "main universe multiverse" -s <package>

Ricardo Salveti (rsalveti) wrote :

Sorry, didn't see the log file, let me try with the mono packages.

Can you say which version of qemu-linaro you're using?

Ricardo Salveti (rsalveti) wrote :

Also got the same issue with rootstock: http://paste.ubuntu.com/652868/

Package: qemu-user-static 0.14.50-2011.06-0-0ubuntu1~ppa11.04.1
Rootstock command line: rootstock --dist natty --fqdn panda-natty --login ubuntu --password ubuntu --serial ttyO2 --components "main universe multiverse" -s banshee

Using natty as host.

Changed in qemu-linaro:
status: New → Confirmed
Peter Maydell (pmaydell) wrote :

> qemu: fatal: cp15 insn ee075fba

This is the (deprecated in ARMv7) ARMv6 DMB-via-cp15-op. Coincidentally I just posted a patch for this to qemu-devel last week:
http://patchwork.ozlabs.org/patch/106109/

So I'll put that into qemu-linaro for next month.

Incidentally, since it is deprecated, it's a minor bug in mono that it doesn't use the proper DMB instruction when it's compiled for and generating code for ARMv7.
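
For reference, the instruction word in the fatal message can be decoded by hand using the standard A32 MCR/MRC field layout. The sketch below does that. The struct and function names (cp_insn, decode_cp_insn, is_cp15_dmb) are made up for illustration and are not qemu code, and the unconditional (cond == 1111) MCR2/MRC2 encodings are deliberately rejected.

```c
#include <stdint.h>

/* Fields of an ARM (A32) MCR/MRC coprocessor register transfer. */
typedef struct {
    unsigned cond;    /* bits 31..28: condition code */
    unsigned is_mrc;  /* bit 20: 1 = MRC (read), 0 = MCR (write) */
    unsigned opc1;    /* bits 23..21 */
    unsigned crn;     /* bits 19..16 */
    unsigned rt;      /* bits 15..12: ARM core register */
    unsigned coproc;  /* bits 11..8: coprocessor number */
    unsigned opc2;    /* bits 7..5 */
    unsigned crm;     /* bits 3..0 */
} cp_insn;

/* Returns 1 and fills *out if insn is a conditional MCR/MRC, else 0. */
int decode_cp_insn(uint32_t insn, cp_insn *out)
{
    unsigned cond = (insn >> 28) & 0xf;

    if (cond == 0xf) return 0;                  /* MCR2/MRC2 space, ignored here */
    if (((insn >> 24) & 0xf) != 0xe) return 0;  /* not a coprocessor op */
    if (((insn >> 4) & 0x1) != 1) return 0;     /* bit 4 clear would be CDP */

    out->cond   = cond;
    out->is_mrc = (insn >> 20) & 0x1;
    out->opc1   = (insn >> 21) & 0x7;
    out->crn    = (insn >> 16) & 0xf;
    out->rt     = (insn >> 12) & 0xf;
    out->coproc = (insn >> 8) & 0xf;
    out->opc2   = (insn >> 5) & 0x7;
    out->crm    = insn & 0xf;
    return 1;
}

/* True for "mcr p15, 0, Rt, c7, c10, 5", the ARMv6 CP15 data memory barrier. */
int is_cp15_dmb(uint32_t insn)
{
    cp_insn f;

    if (!decode_cp_insn(insn, &f)) return 0;
    return !f.is_mrc && f.coproc == 15 && f.opc1 == 0 &&
           f.crn == 7 && f.crm == 10 && f.opc2 == 5;
}
```

Passing 0xee075fba gives an MCR to coprocessor 15 with CRn=c7, CRm=c10, opc2=5 and Rt=r5, i.e. "mcr p15, 0, r5, c7, c10, 5", which is the CP15 barrier operation described above.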

Changed in qemu-linaro:
milestone: none → 2011.08
importance: Undecided → High
Peter Maydell (pmaydell) wrote :

Ricardo: your rootstock command doesn't work for me (natty host): rootstock complains:
 rootstock: invalid option -- 's'

Ricardo Salveti (rsalveti) wrote :

Sorry, please try the following:
sudo rootstock --dist natty --fqdn panda-natty --login ubuntu --password ubuntu --serial ttyO2 --components "main universe multiverse" --seed banshee

The -s issue has just been fixed upstream.

Peter Maydell (pmaydell) wrote :

That patch is a variant on the upstream one which applies to qemu-linaro. It fixes the segfault due to the cp15 problems. However mono still segfaults in a different way further on:

===begin===
chroot tmpmount /usr/bin/qemu-arm-static /usr/bin/mono /usr/share/mono/MonoGetAssemblyName.exe /usr/lib/cli/gconf-sharp-2.0/gconf-sharp.dll -g

** (/usr/share/mono/MonoGetAssemblyName.exe:6868): WARNING **: Thread (nil) may have been prematurely finalized

Native stacktrace:

Debug info from gdb:

qemu: Unsupported syscall: 26
ptrace: Function not implemented.

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted
===endit===

This appears to be because the guest code is dereferencing a NULL pointer. (The bit about ptrace being unimplemented is because qemu doesn't implement the ptrace syscall, but that is just mono trying to be helpful and display a backtrace rather than the actual cause of the segfault.)

Peter Maydell (pmaydell) wrote :

Segfault backtrace collected by attaching a cross gdb to qemu's gdb stub:

#0 GC_install_header (h=<value optimized out>) at headers.c:213
#1 0x00120f6c in GC_get_first_part (h=0x40c59000, hhdr=<value optimized out>, bytes=4096, index=<value optimized out>)
    at allchblk.c:472
#2 0x001211d0 in GC_allochblk_nth (sz=24, kind=4, flags=0 '\000', n=4) at allchblk.c:736
#3 0x0012147c in GC_allochblk (sz=24, kind=4, flags=0) at allchblk.c:561
#4 0x00133518 in GC_new_hblk (sz=24, kind=4) at new_hblk.c:253
#5 0x0012396a in GC_allocobj (sz=24, kind=4) at alloc.c:1116
#6 0x00123b9e in GC_generic_malloc_inner (lb=88, k=4) at malloc.c:136
#7 0x0012430a in GC_gcj_malloc (lb=88, ptr_to_struct_containing_descr=0x20716c) at gcj_mlc.c:157
#8 0x000a110c in mono_object_allocate_spec (vtable=0x20716c) at object.c:3873
#9 mono_object_new_alloc_specific (vtable=0x20716c) at object.c:3950
#10 0x000a11b6 in mono_object_new_specific (vtable=0x20716c) at object.c:3939
#11 0x000a6936 in mono_runtime_init (domain=0x40c51e70, start_cb=0x1a2f9 <mono_thread_start_cb>,
    attach_cb=0x1a329 <mono_thread_attach_cb>) at appdomain.c:240
#12 0x0001e498 in mini_init (filename=<value optimized out>, runtime_version=<value optimized out>) at mini.c:5520
#13 0x0005a388 in mono_main (argc=3, argv=<value optimized out>) at driver.c:1635
#14 0x4099b622 in ?? ()

Peter Maydell (pmaydell) wrote :

Further investigation: this looks like qemu's usual poor handling of mmap() coupled with a libgc bug where it doesn't handle mmap failure well. Here's an excerpt from a qemu-strace:
7161 gettimeofday(1082130672,0,1956304,2246648,0,2246328) = 0
7161 brk(0x00265000) = 0x00265000
7161 mmap2(0x40c5d000,65536,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0xfffffff4
7161 mmap2(0x40c5d000,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x40c5d000
7161 mmap2(0x40c5e000,65536,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0xfffffff4
7161 mmap2(0x40c5e000,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x40c5e000
7161 mmap2(0x40c5f000,65536,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0xfffffff4
7161 mmap2(0x40c5f000,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0xfffffff4

...from just before we get the segfault.
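
For anyone reading the trace: the 0xfffffff4 results are raw 32-bit syscall return values, and on Linux a raw return in the range -4095..-1 is a negated errno. 0xfffffff4 is -12, and errno 12 on Linux is ENOMEM. A minimal sketch of that decoding, where raw_ret_to_errno is a hypothetical helper name and not something from qemu or libgc:

```c
#include <stdint.h>

/*
 * Interpret a raw 32-bit Linux syscall return value as printed in the trace.
 * Values 0xfffff001..0xffffffff (-4095..-1 as signed) encode -errno;
 * anything else is a successful result such as a mapped address.
 * Returns the errno number, or 0 if the call succeeded.
 */
int raw_ret_to_errno(uint32_t raw)
{
    if (raw >= 0xfffff001u) {
        /* Two's complement negation done in unsigned arithmetic. */
        return (int)(0xffffffffu - raw) + 1;
    }
    return 0;
}
```

So in the excerpt the 64K requests all fail with ENOMEM, the 4K fallbacks succeed twice, and the final 4K fallback fails as well, which is the point where the allocator is left with no memory to hand back.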

The libgc bug is in GC_install_header():

/* Install a header for block h. */
/* The header is uninitialized. */
/* Returns the header or 0 on failure. */
struct hblkhdr * GC_install_header(h)
register struct hblk * h;
{
    hdr * result;

    if (!get_index((word) h)) return(0);
    result = alloc_hdr();
    SET_HDR(h, result);
# ifdef USE_MUNMAP
        result -> hb_last_reclaimed = GC_gc_no;
# endif
    return(result);
}

Note that the comment claims we might return 0 on failure (i.e. because alloc_hdr() returned NULL), but in fact if USE_MUNMAP is defined we dereference result unconditionally (result -> hb_last_reclaimed) and so segfault if result is NULL.
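
One way to make the function honour its own comment would be to check the allocation before touching it. The following is only a standalone sketch of that idea, not the actual libgc change: hdr_sketch, sketch_alloc_hdr, sketch_install_header, sketch_alloc_should_fail and sketch_gc_no are all hypothetical stand-ins for the libgc internals so that the example compiles by itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for libgc's block header type. */
typedef struct hdr_sketch {
    unsigned long hb_last_reclaimed;
} hdr_sketch;

/* Knob to simulate allocation failure; not a libgc name. */
int sketch_alloc_should_fail = 0;

/* Stand-in for GC_gc_no (the current collection number). */
unsigned long sketch_gc_no = 7;

/* Stand-in for alloc_hdr(): returns NULL when no memory is available. */
hdr_sketch *sketch_alloc_hdr(void)
{
    if (sketch_alloc_should_fail) return NULL;
    return calloc(1, sizeof(hdr_sketch));
}

/*
 * Same shape as GC_install_header(), but result is checked before it is
 * stored or dereferenced, so failure is reported to the caller as NULL.
 * *slot stands in for the table entry that SET_HDR(h, result) would fill.
 */
hdr_sketch *sketch_install_header(hdr_sketch **slot)
{
    hdr_sketch *result = sketch_alloc_hdr();

    if (result == NULL) return NULL;   /* the check the original lacks */
    *slot = result;
    result->hb_last_reclaimed = sketch_gc_no;
    return result;
}
```

With the check in place the failure propagates to the caller as a NULL return, which is what the function's comment already promises; whether every caller in libgc then copes with that is a separate question.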

Peter Maydell (pmaydell) wrote :

Just to try to tidy up this bug a little:
 Bug 806783 is for qemu's poor mmap handling
 Bug 530000 is a different reason why mono under qemu might not work (libgc reading addresses from /proc)
 Bug 816945 I have just reported against mono for not handling mmap() failure nicely

...leaving this bug for the qemu cp15 barrier insn problem.

Peter Maydell (pmaydell)
Changed in qemu-linaro:
status: Confirmed → Fix Committed
Peter Maydell (pmaydell)
Changed in qemu-linaro:
status: Fix Committed → Fix Released
Michael Hudson-Doyle (mwhudson) wrote :

This is a bit from the peanut gallery, but you don't mean bug 806783 in your comment above ... (that's a bug in software-center (Ubuntu))

Peter Maydell (pmaydell) wrote :

Whoops, yes. Thanks for catching that now rather than in six months' time when I would have completely forgotten which bug I did mean :-)

 Bug 806873 is for qemu's poor mmap handling
