sbrk() not working under qemu-user with a PIE-compiled binary?

Bug #1749393 reported by Raphaël Hertzog
This bug affects 18 people
Affects          Status         Importance   Assigned to          Milestone
QEMU             Fix Released   Undecided    Unassigned
qemu (Ubuntu)    Fix Released   Undecided    Unassigned
Focal            Fix Released   Medium       Christian Ehrhardt

Bug Description

[Impact]

 * The space currently reserved can be too small, and we can end up
   with no space at all for brk. This can happen with any binary, but is
   much more likely with the now-common PIE binaries.

 * Backport the upstream fix, which reserves a bit more space while loading
   and gives it back after the interpreter and stack are loaded.
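
The reserve-then-release idea can be sketched on the host side with plain mmap(). This is only an illustration under stated assumptions, not the QEMU patch itself: `reserve_region` and `reserve_load_release_demo` are hypothetical names, the 16 MiB size mirrors the figure mentioned in the patch discussion further down this report, and the placeholder is put wherever the kernel chooses rather than directly above a loaded image.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Illustrative reserve size (assumption, see the text above). */
#define BRK_RESERVE_SIZE ((size_t)16 * 1024 * 1024)

/* Claim `len` bytes of address space without committing any memory. */
static void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return 1 if [a, a+alen) and [b, b+blen) overlap, 0 otherwise. */
static int ranges_overlap(const void *a, size_t alen,
                          const void *b, size_t blen)
{
    uintptr_t as = (uintptr_t)a, bs = (uintptr_t)b;
    return as < bs + blen && bs < as + alen;
}

/*
 * Reserve a placeholder, map something else (a stand-in for the ELF
 * interpreter or the stack), then give the placeholder back.
 * Returns 0 on success, -1 on failure.
 */
int reserve_load_release_demo(void)
{
    size_t other_len = 64 * 1024;
    void *reserve = reserve_region(BRK_RESERVE_SIZE);
    if (reserve == NULL) {
        return -1;
    }

    void *other = mmap(NULL, other_len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (other == MAP_FAILED) {
        munmap(reserve, BRK_RESERVE_SIZE);
        return -1;
    }

    /* While the placeholder exists, no other mapping can land inside it. */
    assert(!ranges_overlap(reserve, BRK_RESERVE_SIZE, other, other_len));

    /* Releasing the placeholder leaves that range free for later growth. */
    int rc = munmap(reserve, BRK_RESERVE_SIZE);
    munmap(other, other_len);
    return rc;
}
```

The point of the placeholder is only to keep later mappings out of the range; once they are placed, the range is unmapped again so it costs nothing afterwards.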

[Test Plan]

 * On x86 run:
sudo apt install -y qemu-user-static docker.io
sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
...
Running hooks in /etc/ca-certificates/update.d...
done.
Errors were encountered while processing:
 libc-bin
E: Sub-process /usr/bin/dpkg returned an error code (1)

Second test from bug 1928075

$ sudo qemu-debootstrap --arch=arm64 bullseye bullseye-arm64 http://ftp.debian.org/debian

In the bad case this fails like:
W: Failure trying to run: /sbin/ldconfig
W: See //debootstrap/debootstrap.log for details

And in that log file you'll see the segfault
$ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

[Where problems could occur]

 * Regressions would be around use cases of linux-user, that is,
   emulation not of a whole system but of individual binaries.
   It is commonly used for cross-tests and cross-builds, so that is the
   space to watch for regressions.

[Other Info]

 * n/a

---

In Debian unstable, we recently switched bash to be a PIE-compiled binary (for hardening). Unfortunately this resulted in bash being broken when run under qemu-user (for all target architectures, host being amd64 for me).

$ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

bash has its own malloc implementation based on sbrk():
https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

When we disable this internal implementation and rely on glibc's malloc, then everything is fine. But it might be that glibc has a fallback when sbrk() is not working properly and it might hide the underlying problem in qemu-user.
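
To check for the failure without going through bash, a small probe can call sbrk() directly. The following is a minimal sketch, assuming a Linux guest with glibc; `probe_brk_headroom` is a hypothetical helper and not something taken from bash or QEMU.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Grow the program break in `step`-byte increments until sbrk() fails or
 * `limit` bytes have been obtained, then hand the memory back.
 * Returns the number of bytes by which the break could be extended.
 */
size_t probe_brk_headroom(size_t step, size_t limit)
{
    size_t grown = 0;

    assert(step > 0);
    while (grown + step <= limit) {
        if (sbrk((intptr_t)step) == (void *)-1) {
            break;              /* no room left above the break */
        }
        grown += step;
    }
    if (grown > 0) {
        /* Undo the growth so the caller's heap layout is unchanged. */
        sbrk(-(intptr_t)grown);
    }
    return grown;
}
```

Built as a PIE (for example with `-fPIE -pie`) and called as `probe_brk_headroom(4096, 1 << 20)` from `main`, a result of 0 under qemu-user would point at the same condition that makes bash's xmalloc fail, while a native run would normally be expected to obtain the full amount.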

This issue has also been reported to the bash upstream author and he suggested that the issue might be in qemu-user so I'm opening a ticket here. Here's the discussion with the bash upstream author:
https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

You can find the problematic bash binary in that .deb file:
http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

The version of qemu I have been using is 2.11 (Debian package qemu-user-static version 1:2.11+dfsg-1) but I have had reports that the problem is reproducible with older versions (back to 2.8 at least).

Here are the related Debian bug reports:
https://bugs.debian.org/889869
https://bugs.debian.org/865599

It's worth noting that bash used to have this problem (when compiled as a PIE binary) even when run directly but then something got fixed in the kernel and now the problem only appears when run under qemu-user:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483


Revision history for this message
Gérard Vidal (gerard-f-vidal-4) wrote :

Affected by the same bug.
target architecture arm
host architecture amd64

bug message
bash: xmalloc: .././locale.c:****: cannot allocate 2 bytes (0 bytes allocated)

Revision history for this message
Peter Ogden (peterogden) wrote :

This appears to be a problem for all PIE-compiled executables that use sbrk under qemu-user: position-independent code gets mmapped into adjacent ranges, so there is no room for expansion. I've hacked my version of QEMU to force the program binary to mmap in a different range, which allows the region to be resized and fixes this issue. I don't know the most appropriate way to determine what range to use in general, though.

Peter Maydell (pmaydell)
tags: added: arm linux-user
Revision history for this message
Peter Maydell (pmaydell) wrote :

There seem to be two parts to this. Firstly, with a big reserved-region, which is the default for 32-bit-guest-on-64-bit-host, this code in main.c:

        if (reserved_va) {
            mmap_next_start = reserved_va;
        }

says to start trying for the next mmap address at the top of the reserved section, which is typically right at the top of the guest's address space. This means that for a PIE executable we'll try to load it at a very high address, which then means there's no space above the data section for the brk segment.

Secondly, for the no-reserved-region case (-R 0, or 64-on-64), we still fail, but this time because we've chosen to mmap the dynamic interpreter at an address just above the executable. Again, no space to expand the data segment and brk fails.
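
Both cases come down to another mapping sitting directly above the program break. A rough way to see this from inside the guest is to compare sbrk(0) with the start of the next mapping listed in /proc/self/maps. This is a sketch only: `gap_above` and `gap_above_break` are hypothetical helpers, the 64 KiB buffer is an arbitrary limit, and it assumes /proc/self/maps is readable and reflects the guest layout.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define GAP_UNKNOWN UINT64_MAX

/*
 * Scan maps-style text ("start-end perms ..." per line) and return the
 * distance from `addr` to the nearest mapping that starts at or above it.
 * Returns GAP_UNKNOWN if no such mapping is listed.
 */
uint64_t gap_above(const char *maps_text, uint64_t addr)
{
    uint64_t best = GAP_UNKNOWN;
    const char *p = maps_text;

    assert(maps_text != NULL);
    while (*p != '\0') {
        char *end = NULL;
        unsigned long long start = strtoull(p, &end, 16);
        if (end != p && *end == '-' && start >= addr && start - addr < best) {
            best = start - addr;
        }
        p = strchr(p, '\n');
        if (p == NULL) {
            break;
        }
        p++;
    }
    return best;
}

/* Free address space directly above the current program break. */
uint64_t gap_above_break(void)
{
    static char buf[1 << 16];
    size_t used = 0;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0) {
        return GAP_UNKNOWN;
    }
    for (;;) {
        ssize_t n = read(fd, buf + used, sizeof buf - 1 - used);
        if (n <= 0) {
            break;
        }
        used += (size_t)n;
    }
    close(fd);
    buf[used] = '\0';
    /* Read the break only after the file I/O, and avoid stdio, so that no
     * allocation moves the break between the two observations. */
    return gap_above(buf, (uint64_t)(uintptr_t)sbrk(0));
}
```

A gap of 0 would mean that something (for instance the dynamic interpreter) was mapped immediately above the data segment, which is the situation described above.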

The message of Linux kernel commit a87938b2e246b81 says something about there being a guaranteed 128MB "gap" between data segment and stack on x86-64, which we're obviously not honouring; presumably there are similar requirements for other archs. (As an aside, is bash really happy with only having perhaps 128MB of allocatable memory? Otherwise it really ought to use mmap rather than brk for its allocator.)

Revision history for this message
Peter Ogden (peterogden) wrote :

Could we over-allocate the data segment by QEMU_DATA_SIZE/getrlimit(RLIMIT_DATA)/128 MB depending on what's set - similar to how the stack size is managed?
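
One possible reading of that selection rule, written out as a sketch: an explicit setting wins, then a finite RLIMIT_DATA, then a 128 MB default. The priority order is an assumption, `QEMU_DATA_SIZE` is the name proposed in this comment rather than an existing QEMU variable, and `pick_brk_reserve` / `brk_reserve_from_environment` are hypothetical helpers.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/resource.h>

#define DEFAULT_BRK_RESERVE (128ULL * 1024 * 1024)

/* The cast from rlim_t below relies on this. */
static_assert(sizeof(uint64_t) >= sizeof(rlim_t), "rlim_t must fit in uint64_t");

/*
 * Choose how much space to keep free for brk:
 *   1. an explicit size string (decimal, octal or hex), if valid and non-zero;
 *   2. otherwise the data rlimit, if it is finite;
 *   3. otherwise the 128 MB default.
 */
uint64_t pick_brk_reserve(const char *env_value, rlim_t data_limit)
{
    if (env_value != NULL && *env_value != '\0' && *env_value != '-') {
        char *end = NULL;
        errno = 0;
        unsigned long long v = strtoull(env_value, &end, 0);
        if (errno == 0 && end != env_value && *end == '\0' && v > 0) {
            return v;
        }
    }
    if (data_limit != RLIM_INFINITY) {
        return (uint64_t)data_limit;
    }
    return DEFAULT_BRK_RESERVE;
}

/* Same rule, fed from the process environment and its resource limits. */
uint64_t brk_reserve_from_environment(void)
{
    struct rlimit rl;
    rlim_t limit = RLIM_INFINITY;

    if (getrlimit(RLIMIT_DATA, &rl) == 0) {
        limit = rl.rlim_cur;
    }
    return pick_brk_reserve(getenv("QEMU_DATA_SIZE"), limit);
}
```

Keeping the decision in a function that takes its inputs as parameters makes the rule easy to test separately from getenv() and getrlimit().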

My current workaround for aarch64 on x86-64 is to mmap a dynamic main executable in some far-away part of the address space but I'm not sure how to find somewhere suitable on a 32-bit host/guest.

Revision history for this message
Peter Maydell (pmaydell) wrote : Re: [PATCH] linux-user: Allocate extra space for brk in PIE executable

On 16 March 2018 at 10:34, Richard Henderson
<email address hidden> wrote:
> Limit this to 16M; there does not appear to be any special
> support for this in the kernel itself, at least for i686.
>
> Fixes: https://bugs.launchpad.net/bugs/1749393
> Signed-off-by: Richard Henderson <email address hidden>
> ---
>
> Commentary in the launchpad bug suggests 128M gap for x86_64, but that's
> somewhat irrelevant to the given i686 test case. There's certainly nothing
> in the referenced kernel patch that does any more than we had been doing
> without this patch.

I think the 128MB is enforced by mmap_base() in arch/x86/mm/mmap.c:
since x86-64 sets HAVE_ARCH_UNMAPPED_AREA_TOPDOWN, mmap_base is the
highest address in memory where mmap is permitted, and mmap_base()
enforces that it goes at least 128MB below the bottom of the stack
(accounting for rlimit stack size requirements also). Since
binfmt_elf() loads ELF segments via mmap this means that they won't
go too close to the stack. (The commit a87938b2e246 ensures the
gap is honoured by using the full binary size when it does the first
mapping so that mmap picks an address that is sufficiently before the
end of the mmap region for everything to fit.)
The kernel also uses ELF_ET_DYN_BASE to ensure that PIE programs
themselves get loaded clear of the ELF interpreter, which we
don't have any equivalent of (so you can see that different values
of -R result in either the interpreter or the executable getting
loaded at lower addresses.)
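
As a rough model of the gap rule described above (not the kernel code): the gap below the stack is the larger of the stack rlimit and 128 MB, and the top of the mmap area sits that far below the top of the address space. `model_mmap_base` is a hypothetical name, and address randomization and any upper bound on the gap are deliberately left out of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_STACK_GAP   (128ULL * 1024 * 1024)  /* the 128MB from the text */
#define MODEL_PAGE_SIZE 4096ULL                 /* assumed page size */

/*
 * Highest address at which the model allows mmap to place something:
 * top of the address space minus max(stack rlimit, 128MB), rounded down
 * to a page boundary.
 */
uint64_t model_mmap_base(uint64_t task_size, uint64_t stack_rlimit)
{
    uint64_t gap = stack_rlimit < MIN_STACK_GAP ? MIN_STACK_GAP : stack_rlimit;

    assert(task_size > gap);
    return (task_size - gap) & ~(MODEL_PAGE_SIZE - 1);
}
```

With the common 8 MiB stack limit the 128 MB floor applies, so every mapping made through mmap, including the ELF segments, ends up at least 128 MB below the top of the address space in this model.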

PS: do you know what the intention of the
        if (reserved_va) {
            mmap_next_start = reserved_va;
        }
code in linux-user/main.c is? It seems a bit odd to say "ok,
we have reserved a big region. we will start trying to mmap
outside it.", especially when that region covers the full
4G that the guest can access...

thanks
-- PMM

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qemu (Ubuntu):
status: New → Confirmed
Revision history for this message
Matthias Klose (doko) wrote :
Revision history for this message
Richard Henderson (rth) wrote :

Another proposed patch:
https://<email address hidden>/

Changed in qemu (Ubuntu):
assignee: nobody → Richard Henderson (rth)
Revision history for this message
Laurent Vivier (laurent-vivier) wrote :
Changed in qemu:
status: New → Fix Committed
Changed in qemu:
status: Fix Committed → Fix Released
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Will be merged in 20.10 with qemu >=5.0 where this came upstream.

tags: added: qemu-20.10
Changed in qemu (Ubuntu):
status: Confirmed → Triaged
Changed in qemu (Ubuntu):
assignee: Richard Henderson (rth) → Christian Ehrhardt  (paelzer)
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:5.0-5ubuntu3

---------------
qemu (1:5.0-5ubuntu3) groovy; urgency=medium

  * d/p/ubuntu/lp-1887763-*: fix TCG sizing that OOMed many small CI
    environments (LP: #1887763)
  * Pick further changes for groovy from debian/master since 5.0-5
    - ati-vga-check-mm_index-before-recursive-call-CVE-2020-13800.patch
      Closes: CVE-2020-13800, ati-vga allows guest OS users to trigger
      infinite recursion via a crafted mm_index value during
      ati_mm_read or ati_mm_write call.
    - revert-memory-accept-mismatching-sizes-in-memory_region_access_valid...patch
      Closes: CVE-2020-13754, possible OOB memory accesses in a bunch of qemu
      devices which uses min_access_size and max_access_size Memory API fields.
      Also closes: CVE-2020-13791
    - exec-set-map-length-to-zero-when-returning-NULL-CVE-2020-13659.patch
      CVE-2020-13659: address_space_map in exec.c can trigger
      a NULL pointer dereference related to BounceBuffer
    - megasas-use-unsigned-type-for-reply_queue_head-and-check-index...patch
      Closes: #961887, CVE-2020-13362, megasas_lookup_frame in hw/scsi/megasas.c
      has an OOB read via a crafted reply_queue_head field from a guest OS user
    - megasas-use-unsigned-type-for-positive-numeric-fields.patch
      fix other possible cases like in CVE-2020-13362 (#961887)
    - megasas-fix-possible-out-of-bounds-array-access.patch
      Some tracepoints use a guest-controlled value as an index into the
      mfi_frame_desc[] array. Thus a malicious guest could cause a very low
      impact OOB errors here
    - nbd-server-avoid-long-error-message-assertions-CVE-2020-10761.patch
      Closes: CVE-2020-10761, An assertion failure issue in the QEMU NBD Server.
      This flaw occurs when an nbd-client sends a spec-compliant request that is
      near the boundary of maximum permitted request length. A remote nbd-client
      could use this flaw to crash the qemu-nbd server resulting in a DoS.
    - es1370-check-total-frame-count-against-current-frame-CVE-2020-13361.patch
      Closes: CVE-2020-13361, es1370_transfer_audio in hw/audio/es1370.c does not
      properly validate the frame count, which allows guest OS users to trigger
      an out-of-bounds access during an es1370_write() operation
    - a few patches from the stable series:
      - fix-tulip-breakage.patch
        The tulip network driver in a qemu-system-hppa emulation is broken in
        the sense that bigger network packages aren't received any longer and
        thus even running e.g. "apt update" inside the VM fails. Fix this.
      - 9p-lock-directory-streams-with-a-CoMutex.patch
        Prevent deadlocks in 9pfs readdir code
      - net-do-not-include-a-newline-in-the-id-of-nic-device.patch
        Fix newline accidentally sneaked into id string of a nic
      - qemu-nbd-close-inherited-stderr.patch
      - virtio-balloon-fix-free-page-hinting-check-on-unreal.patch
      - virtio-balloon-fix-free-page-hinting-without-an-iothread.patch
      - virtio-balloon-unref-the-iothread-when-unrealizing.patch
    - acpi-tmr-allow-2-byte-reads.patch (Closes: #964247)
    - reapply CVE-2020-13253 fixed from upstre...

Changed in qemu (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Robie Basak (racb) wrote :

There's a request for a backport of this fix to be made to Ubuntu 20.04 in duplicate bug 1924231, so I'm adding a task for that.

Changed in qemu (Ubuntu Focal):
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → Medium
description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

For Focal:
- SRU Template added to the bug
- MP: https://code.launchpad.net/~paelzer/ubuntu/+source/qemu/+git/qemu/+merge/401771
- PPA: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4535/+packages (still building)

I'd ask anyone affected by this on Focal to give it a try on the PPA and let us know if this fix would work for you.

Revision history for this message
Yasuhiro Horimoto (komainu8) wrote :

Thank you for fixing the problem.

I confirmed that https://bugs.launchpad.net/bugs/1924231 is fixed with https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4535/+packages.

Thank you.

Revision history for this message
Sebastian Unger (sebunger44) wrote :

I'm running qemu-arm version 4.2.1 (Debian 1:4.2-3ubuntu6.17) on Ubuntu 20.04.03, but I seem to still be affected by this (or something very much like it). In my case it is armhf exim4 crashing while creating a chroot on an amd64 host. The final command run from deeply within exim4's postinst is:

/usr/sbin/exim4 -C /var/lib/exim4/config.autogenerated.tmp -bV

and produces

Exim version 4.93 #5 built 28-Apr-2021 13:19:17
Copyright (c) University of Cambridge, 1995 - 2018
(c) The Exim Maintainers and contributors in ACKNOWLEDGMENTS file, 2007 - 2018
Berkeley DB: Berkeley DB 5.3.28: (September 9, 2013)
Support for: crypteq iconv() IPv6 GnuTLS move_frozen_messages DANE DKIM DNSSEC Event I18N OCSP PRDR SOCKS TCP_Fast_Open
Lookups (built-in): lsearch wildlsearch nwildlsearch iplsearch cdb dbm dbmjz dbmnz dnsdb dsearch nis nis0 passwd
Authenticators: cram_md5 plaintext
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore autoreply lmtp pipe smtp
Fixed never_users: 0
Configure owner: 0:0
Size of off_t: 8
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

Interestingly, even

/usr/sbin/exim4 -C /dev/null -bV

produces the same result, so it likely doesn't depend on any configuration at my end and should be reproducible.

Please let me know if there is anything I can do to help debug further.

Should I create a separate ticket?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yeah Sebastian, a new ticket (with a reference to this bug as being similar) would be preferred.

Changed in qemu (Ubuntu):
assignee: Christian Ehrhardt  (paelzer) → nobody
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi,
sorry this has fallen through the cracks, but bug 1928075 made me re-discover it and it is time finally to complete that.

tags: added: server-next
description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

SRU template updated, PPA rebuilt, Merge requests updated.
Also bundled another bug fix.

Waiting for MR review now.

Changed in qemu (Ubuntu Focal):
status: Triaged → In Progress
assignee: nobody → Christian Ehrhardt  (paelzer)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Uploaded to F-unapproved, waiting for the SRU team to accept it.

Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Raphaël, or anyone else affected,

Accepted qemu into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:4.2-3ubuntu6.19 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Focal

old

$ sudo apt install --reinstall qemu-user-static=1:4.2-3ubuntu6.18
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 21.3 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 qemu-user-static amd64 1:4.2-3ubuntu6.18 [21.3 MB]
Fetched 21.3 MB in 1s (16.4 MB/s)
(Reading database ... 126154 files and directories currently installed.)
Preparing to unpack .../qemu-user-static_1%3a4.2-3ubuntu6.18_amd64.deb ...
Unpacking qemu-user-static (1:4.2-3ubuntu6.18) over (1:4.2-3ubuntu6.18) ...
Setting up qemu-user-static (1:4.2-3ubuntu6.18) ...
Processing triggers for man-db (2.9.1-1) ...

ubuntu@f-1928075-qemuuserstatic:~$ sudo chroot /home/ubuntu/bullseye-arm64 /bin/sh /debootstrap/debootstrap --second-stage
W: Failure trying to run: /sbin/ldconfig
W: See //debootstrap/debootstrap.log for details
ubuntu@f-1928075-qemuuserstatic:~$ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

Upgrade

ubuntu@f-1928075-qemuuserstatic:~$ apt-cache policy qemu-user-static
qemu-user-static:
  Installed: 1:4.2-3ubuntu6.18
  Candidate: 1:4.2-3ubuntu6.19
  Version table:
     1:4.2-3ubuntu6.19 500
        500 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 Packages
 *** 1:4.2-3ubuntu6.18 500
        500 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages
        100 /var/lib/dpkg/status
     1:4.2-3ubuntu6.17 500
        500 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages
     1:4.2-3ubuntu6 500
        500 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages
ubuntu@f-1928075-qemuuserstatic:~$ sudo apt install qemu-user-static
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  qemu-user-static
1 upgraded, 0 newly installed, 0 to remove and 65 not upgraded.
Need to get 21.3 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 qemu-user-static amd64 1:4.2-3ubuntu6.19 [21.3 MB]
Fetched 21.3 MB in 2s (9092 kB/s)
(Reading database ... 126160 files and directories currently installed.)
Preparing to unpack .../qemu-user-static_1%3a4.2-3ubuntu6.19_amd64.deb ...
Unpacking qemu-user-static (1:4.2-3ubuntu6.19) over (1:4.2-3ubuntu6.18) ...
Setting up qemu-user-static (1:4.2-3ubuntu6.19) ...
Processing triggers for man-db (2.9.1-1) ...
ubuntu@f-1928075-qemuuserstatic:~$ sudo update-binfmts --test --display qemu-aarch64
qemu-aarch64 (enabled):
     package = qemu-user-static
        type = magic
      offset = 0
       magic = \x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00
        mask = \xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff
 interpreter = /usr/bin/qemu-aarch64-static
    detector =

Test with ne...


tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI the release of this is slowed down by the slow verification of bug https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1929926

Revision history for this message
frank (frankwu) wrote :

i can confirm that focal-proposed package fixes problems for arm64 and armhf on hostarch amd64

note: tried ppa listed here which fixes for arm64 but breaks armhf: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1928075/comments/15

steps for installing proposed Package:

cat <<EOF >/etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
# Enable Ubuntu proposed archive

deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed restricted main multiverse universe
EOF

cat <<EOF >/etc/apt/preferences.d/proposed-updates
# Configure apt to allow selective installs of packages from proposed

Package: *
Pin: release a=$(lsb_release -cs)-proposed
Pin-Priority: 400
EOF

apt update
apt install qemu-user-static/focal-proposed

then build 2 bullseye-chroot (arm64 and armhf) including secondstage and no crash happens

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thank you Frank for that extra confirmation,
by now the blockers on the other bug fixed by this upload are resolved as well. I expect this to be released as soon as the SRU Team is back from the Christmas shutdown.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:4.2-3ubuntu6.19

---------------
qemu (1:4.2-3ubuntu6.19) focal; urgency=medium

  * d/p/u/lp-1749393-linux-user-Reserve-space-for-brk.patch: fix static
    use cases needing a lot of brk space (LP: #1749393)
  * d/p/u/lp-1929926-target-s390x-Fix-translation-exception-on-illegal-in.patch:
    fix uretprobe in s390x TCG (LP: #1929926)

 -- Christian Ehrhardt <email address hidden> Mon, 26 Apr 2021 11:11:19 +0200

Changed in qemu (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
