Unable to generate linaro images using qemu-linaro

Bug #947888 reported by Fathi Boudra
This bug affects 1 person
Affects       Status         Importance   Assigned to   Milestone
Linaro QEMU   Fix Released   Undecided    Unassigned

Bug Description

I'm using qemu-user-static with live-build to do cross image creation:
https://ci.linaro.org/jenkins/view/Ubuntu%20Build%20Service/

When using Debian qemu-user-static 1.0+dfsg-3, the images are created successfully.

When using Ubuntu qemu-user-static 1.0.50-2012.02-0ubuntu1, it fails:
[...]
I: Configuring initramfs-tools...
W: Failure while configuring base packages. This will be re-attempted up to five times.
I: Configuring libssl1.0.0...
*** %n in writable segment detected ***
W: Failure while configuring base packages. This will be re-attempted up to five times.
*** %n in writable segment detected ***
W: Failure while configuring base packages. This will be re-attempted up to five times.
*** %n in writable segment detected ***
W: Failure while configuring base packages. This will be re-attempted up to five times.
*** %n in writable segment detected ***
W: Failure while configuring base packages. This will be re-attempted up to five times.
I: Base system installed successfully.
[2012-03-06 12:59:52] lb_bootstrap_cache save
P: Saving bootstrap stage to cache...
P: Begin unmounting filesystems...
P: Saving caches...
[2012-03-06 12:59:53] lb_chroot
P: Setting up cleanup function
[2012-03-06 12:59:53] lb_chroot_cache restore
[2012-03-06 12:59:53] lb_chroot_devpts install
[2012-03-06 12:59:53] lb_testroot
P: Begin mounting /dev/pts...
[2012-03-06 12:59:53] lb_chroot_proc install
[2012-03-06 12:59:53] lb_testroot
P: Begin mounting /proc...
[2012-03-06 12:59:53] lb_chroot_selinuxfs install
[2012-03-06 12:59:54] lb_testroot
[2012-03-06 12:59:54] lb_chroot_sysfs install
[2012-03-06 12:59:54] lb_testroot
P: Begin mounting /sys...
[2012-03-06 12:59:54] lb_chroot_debianchroot install
P: Configuring file /etc/debian_chroot
[2012-03-06 12:59:54] lb_chroot_dpkg install
P: Configuring file /sbin/start-stop-daemon
[2012-03-06 12:59:54] lb_chroot_tmpfs install
[2012-03-06 12:59:54] lb_chroot_sysv-rc install
P: Configuring file /usr/sbin/policy-rc.d
[2012-03-06 12:59:54] lb_chroot_upstart install
P: Configuring file /sbin/initctl
[2012-03-06 12:59:55] lb_chroot_hosts install
P: Configuring file /etc/hosts
[2012-03-06 12:59:55] lb_chroot_resolv install
P: Configuring file /etc/resolv.conf
[2012-03-06 12:59:55] lb_chroot_hostname install
P: Configuring file /etc/hostname
P: Configuring file /bin/hostname
[2012-03-06 12:59:55] lb_chroot_apt install
P: Configuring file /etc/apt/apt.conf
[2012-03-06 12:59:55] lb_chroot_archives chroot install
P: Configuring file /etc/apt/sources.list
*** %n in writable segment detected ***
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)
P: Begin unmounting filesystems...
P: Saving caches...
Command exited with non-zero status 134

I don't have a reduced test case but the setup can be easily reproduced.
If you give me some instructions to get more information, I'll happily try.
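
The exit status 134 at the end of the log is consistent with the "uncaught target signal 6 (Aborted)" line, assuming the usual shell convention that a command killed by signal n is reported as 128 + n. A minimal sketch of that arithmetic (the helper name is hypothetical, and the signal number assumes Linux, where SIGABRT is 6):

```python
import signal


def signal_from_exit_status(status):
    """Return the signal number implied by a shell-style exit status, or None.

    Assumes the common convention that statuses above 128 encode
    128 + signal number; this is an illustration, not part of live-build.
    """
    if status > 128:
        return status - 128
    return None


sig = signal_from_exit_status(134)
# On Linux, SIGABRT is signal 6, which matches the qemu message in the log.
print(sig, sig == signal.SIGABRT)
```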

Steve Langasek (vorlon) wrote :

This abort comes from glibc. It's possible that the newer glibc found in Ubuntu precise introduces additional asserts that were not an issue previously. If you use one of the qemu-arm-static builds from the linaro-maintainers tools ppa build for an older release, is the problem reproducible?

Steve Langasek (vorlon) wrote :

Also, you may want to check what results you get with the non-static qemu-arm builds - both the Debian one, and the oneiric/ppa one. This can help us narrow down whether this is a difference in the precise glibc, or in the precise build flags.

Fathi Boudra (fboudra) wrote :

On 7 March 2012 06:06, Steve Langasek wrote:
> This abort comes from glibc.  It's possible that the newer glibc found
> in Ubuntu precise introduces additional asserts that were not an issue
> previously.  If you use one of the qemu-arm-static builds from the
> linaro-maintainers tools ppa build for an older release, is the problem
> reproducible?

Image creation tested: Nano (Precise/armhf)
Host: Ubuntu Precise (up-to-date, 64-bit)
OK = image created successfully
NOK = qemu: uncaught target signal 6 (Aborted) - core dumped

Results:
qemu-user-static_1.0.1+dfsg-1_amd64.deb -> OK
qemu-user-static_1.0.50-2012.02-0ubuntu1_amd64.deb -> NOK
qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa10.04.1_amd64.deb -> NOK
qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa10.10.1_amd64.deb -> NOK
qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa11.04.1_amd64.deb -> NOK
qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa11.10.1_amd64.deb -> NOK

Steve Langasek (vorlon) wrote :

On Wed, Mar 07, 2012 at 09:11:39AM -0000, Fathi Boudra wrote:
> Image creation tested: Nano (Precise/armhf)
> Host: Ubuntu Precise (up-to-date, 64-bit)
> OK = image created successfully
> NOK = qemu: uncaught target signal 6 (Aborted) - core dumped

> Results:
> qemu-user-static_1.0.1+dfsg-1_amd64.deb -> OK
> qemu-user-static_1.0.50-2012.02-0ubuntu1_amd64.deb -> NOK
> qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa10.04.1_amd64.deb -> NOK
> qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa10.10.1_amd64.deb -> NOK
> qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa11.04.1_amd64.deb -> NOK
> qemu-user-static_1.0.50-2012.02-0ubuntu1~ppa11.10.1_amd64.deb -> NOK

Ah, so it's pretty clearly something specific to the qemu-linaro (source or
packaging), and not to the precise build environment.

I guess this was working until recently, right? Do you know what the last
version of the qemu-linaro package was that worked correctly? For instance,
the last upload before 1.0.50-2012.02-0ubuntu1, 1.0.50-2012.01-0ubuntu4,
introduced a behavior change to fix bug #906922 that's known to have a risk
of regression. So I think the two most likely causes for this problem are
that patch, or something new in the 2012.02 upstream version.

Peter Maydell (pmaydell) wrote :

Can you provide a command line for reproducing this on a local box, please?

Fathi Boudra (fboudra) wrote :

On 9 March 2012 12:30, Peter Maydell wrote:
> Can you provide a command line for reproducing this on a local box,
> please?

$ git clone git://git.linaro.org/people/fboudra/ubuntu-build-service .
$ sudo apt-get install qemu-user-static debootstrap
$ sudo dpkg -i --force-all live-build_3.0~a45-1_all.deb
$ cd precise-armhf-nano
$ ./configure
$ make

Peter Maydell (pmaydell) wrote :

I'm considering applying http://patchwork.ozlabs.org/patch/144476/ which fixes an issue which is exposed by the fix added to the ubuntu packages for bug #906922, so it would be interesting to see if that fixes this problem too.

Peter Maydell (pmaydell) wrote :

OK, I've reproduced this, and patch 144476 doesn't fix it, so it's something else. I'll prod it a bit more closely.

Fathi Boudra (fboudra) wrote :

On 9 March 2012 13:08, Peter Maydell wrote:
> I'm considering applying http://patchwork.ozlabs.org/patch/144476/ which
> fixes an issue which is exposed by the fix added to the ubuntu packages
> for bug #906922, so it would be interesting to see if that fixes this
> problem too.

I've rebuilt qemu-user-static with the following changes:
* drop 0001_linux-user-reserve-4GB-of-vmem-for-32-on-64
* add linux-user-resolve-reserved_va-vma-downwards.patch

=> Nano image built successfully.

Peter Maydell (pmaydell) wrote :

If you drop the reserve-4GB patch then the case which the second patch is fixing will not be triggered, so there's no point in applying it. Useful to confirm that it is the reserve-4GB patch that is causing the regression, though.

Peter Maydell (pmaydell) wrote :

Interim report: we're failing the glibc format string check for this call to snprintf:

#3 0x0007c442 in snprintf (__fmt=0xa8da8 "%.*s/.#lk%p.%n%s.%d", __n=<optimized out>, __s=0xc0010 "")
    at /usr/include/arm-linux-gnueabihf/bits/stdio2.h:65
#4 create_dotlock (file_to_lock=0xb7314 "/etc/apt/trusted.gpg") at ../../util/dotlock.c:173
#5 0x00017096 in keyring_lock (hd=<optimized out>, yes=<optimized out>) at ../../g10/keyring.c:305
#6 0x00015eaa in lock_all (hd=0xbfbd0) at ../../g10/keydb.c:446
#7 0x000164bc in keydb_insert_keyblock (hd=0xbfbd0, kb=0xbdbd8) at ../../g10/keydb.c:576
#8 0x00046ec2 in import_one (keyblock=0xbdbd8, stats=0xb9568, fpr=0x0, fpr_len=0x0, options=8,
    from_sk=0, fname=<optimized out>) at ../../g10/import.c:843
#9 0x00047de6 in import (inp=0xb95b0, fname=0xf6fff813 "/tmp/export", stats=<optimized out>,
    fpr=0x0, fpr_len=0x0, options=8) at ../../g10/import.c:258
#10 0x000484e6 in import_keys_internal (inp=<optimized out>, fnames=0xf6fff650,
    nnames=<optimized out>, stats_handle=0x0, fpr=0x0, fpr_len=0x0, options=8)
    at ../../g10/import.c:196
#11 0x000485ca in import_keys (fnames=0xf6fff650, nnames=1, stats_handle=0x0,
    options=<optimized out>) at ../../g10/import.c:229
#12 0x0000fafc in main (argc=1, argv=0xf6fff650) at ../../g10/gpg.c:3648

The format string is at 0xa8da8, which is in the binary. On this run QEMU's guest_base is 0x7ffefdbfd000 so that's host address 0x7ffefdca5da8, which is in this map:
7ffefdc05000-7ffefdca7000 r-xp 00000000 fc:01 15603147 /home/petmay01/ubuntu-build-service/precise-armhf-nano/chroot/usr/bin/gpg

which is indeed non-writable. So the problem is that glibc is incorrectly thinking that it is a writable section of memory, presumably because we've confused it somehow.
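
The address arithmetic in this comment can be checked with a short sketch. The constants are copied from the backtrace and the map line above; the function names are hypothetical, and the translation assumes the simple "host = guest_base + guest address" model that the comment describes.

```python
GUEST_BASE = 0x7ffefdbfd000   # guest_base reported for this run
GUEST_FMT_ADDR = 0xa8da8      # __fmt pointer from frame #3 of the backtrace

# Host mapping quoted in the comment above.
HOST_MAP_LINE = (
    "7ffefdc05000-7ffefdca7000 r-xp 00000000 fc:01 15603147 "
    "/home/petmay01/ubuntu-build-service/precise-armhf-nano/chroot/usr/bin/gpg"
)


def guest_to_host(guest_addr, guest_base):
    """Translate a guest address to a host address (assumed linear offset)."""
    return guest_base + guest_addr


def parse_map_line(line):
    """Return (start, end, perms) from one /proc/<pid>/maps-style line."""
    fields = line.split()
    start_text, end_text = fields[0].split("-")
    return int(start_text, 16), int(end_text, 16), fields[1]


host_addr = guest_to_host(GUEST_FMT_ADDR, GUEST_BASE)
start, end, perms = parse_map_line(HOST_MAP_LINE)

# Print the host address, whether the mapping contains it, and whether
# the mapping is writable.
print(hex(host_addr), start <= host_addr < end, "w" in perms)
```

With these inputs the format string lands inside the gpg text mapping, and that mapping has no write permission, which matches the conclusion in the comment.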

Peter Maydell (pmaydell) wrote :

> we've confused it somehow

...it opens and parses /proc/self/maps the first time you use a %n in a format string. We used to not implement emulation of /proc/self/maps in QEMU, which caused libc to disable its "is this read-only?" check; that is one reason this used not to be a problem. We now do have an emulated /proc/self/maps, but it's presumably not entirely correct.

Peter Maydell (pmaydell) wrote :

It looks like glibc's map parsing:
 http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/sysdeps/unix/sysv/linux/readonly-area.c?view=annotate
insists that the whole of the area being checked is explicitly covered by an entry in /maps which marks it as read-only -- areas not covered by a map line at all are assumed to be writable.

QEMU's implementation (as it stands in qemu-linaro and master) has a single entry corresponding to the stack area. So we'll never pass this and %n is always broken.
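
The behaviour described in the last three comments can be approximated with a short sketch. This is not glibc's code: it is a simplified Python model of the check, the function name is hypothetical, and the map contents below (address ranges, inode fields) are made-up examples chosen to fit the guest addresses in the backtrace.

```python
FMT = "%.*s/.#lk%p.%n%s.%d"   # format string from the backtrace
FMT_ADDR = 0xa8da8            # guest address of the format string
FMT_SIZE = len(FMT) + 1       # string length plus the terminating NUL

# Hypothetical emulated maps with only a stack entry, as described above.
STACK_ONLY_MAPS = "f6800000-f7000000 rw-p 00000000 00:00 0 [stack]"

# Hypothetical maps that also list the binary's read-only text mapping.
FULL_MAPS = (
    "00008000-000aa000 r-xp 00000000 fc:01 15603147 /usr/bin/gpg\n"
    + STACK_ONLY_MAPS
)


def is_readonly_area(maps_text, ptr, size):
    """Model of the check: True only if [ptr, ptr+size) is fully covered
    by non-writable entries. maps_text=None models an unreadable maps
    file, in which case the check is skipped and the area is accepted."""
    if maps_text is None:
        return True
    end = ptr + size
    remaining = size
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        lo_text, hi_text = fields[0].split("-", 1)
        lo, hi = int(lo_text, 16), int(hi_text, 16)
        if lo < end and hi > ptr:
            # Any overlapping entry that is not "r-..." fails the check.
            if not fields[1].startswith("r-"):
                return False
            remaining -= min(hi, end) - max(lo, ptr)
            if remaining == 0:
                break
    # Bytes not covered by any entry are treated as writable.
    return remaining == 0


print(is_readonly_area(None, FMT_ADDR, FMT_SIZE))             # no maps file
print(is_readonly_area(STACK_ONLY_MAPS, FMT_ADDR, FMT_SIZE))  # stack entry only
print(is_readonly_area(FULL_MAPS, FMT_ADDR, FMT_SIZE))        # text mapping listed
```

In this model, a missing maps file and a complete maps file both let the %n through, while a maps file that lists only the stack rejects every format string stored in the binary, which is the failure reported in this bug.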

Peter Maydell (pmaydell) wrote :

Alex has some patches which improve this; they just haven't hit master yet. We should grab them.

Peter Maydell (pmaydell) wrote :

I've committed a set of patches from Alex which fix the mmap emulation and confirmed that the resulting qemu can successfully do a live-build of a nano image. These changes will be in 2012.03, due out later this week.

Changed in qemu-linaro:
milestone: none → 2012.03
status: New → Fix Committed

Peter Maydell (pmaydell)
Changed in qemu-linaro:
status: Fix Committed → Fix Released