QEMU mishandling of SO_PEERSEC forces systemd into tight loop

Bug #1823790 reported by Matthias Lüscher on 2019-04-08
This bug affects 6 people
Affects: QEMU
Status: Fix Released
Importance: Undecided
Assigned to: Laurent Vivier

Bug Description

While building Debian images for embedded ARM target systems I detected that QEMU seems to force newer systemd daemons into a tight loop.

My setup is the following:

Host machine: Ubuntu 18.04, amd64
LXD container: Debian Buster, arm64, systemd 241
QEMU: qemu-aarch64-static, 4.0.0-rc2 (custom build) and 3.1.0 (Debian 1:3.1+dfsg-7)

To easily reproduce the issue I have created the following repository:
https://github.com/lueschem/edi-qemu

The call on which systemd starts looping is the following:
2837 getsockopt(3,1,31,274891889456,274887218756,274888927920) = -1 errno=34 (Numerical result out of range)

Furthermore I also verified that the issue is not related to LXD.
The same behavior can be reproduced using systemd-nspawn.

This issue reported against systemd seems to be related:
https://github.com/systemd/systemd/issues/11557

Peter Maydell (pmaydell) on 2019-04-09
tags: added: linux-user
tags: added: arm
Peter Maydell (pmaydell) wrote :

As described on the systemd issue, the syscall we're getting wrong here is getsockopt(fd, SOL_SOCKET, SO_PEERSEC, ...). Our linux-user/syscall.c:do_getsockopt() doesn't have any special case code for the payload on this function, so we treat it as if it were just an integer payload, which is not correct here.

Unfortunately I can't find any documentation on exactly what SO_PEERSEC does or what the payload format is, which makes it pretty hard to fix this bug :-( It's not listed in the socket(7) manpage -- https://linux.die.net/man/7/socket -- which is where I'd expect to find it, and the kernel source code is pretty confusing in this area.

summary: - QEMU forces systemd into tight loop
+ QEMU mishandling of SO_PEERSEC forces systemd into tight loop
Matthias Lüscher (m-luescher) wrote :

This is probably the tight loop that gets triggered:
https://github.com/systemd/systemd/commit/217d89678269334f461e9abeeffed57077b21454

It looks like the previous implementation was just a bit more "tolerant".

Matthias Lüscher (m-luescher) wrote :

I have just studied the systemd code a bit, which brought me to the following idea for a temporary workaround: what about returning -1 (error) and setting errno when getsockopt(fd, SOL_SOCKET, SO_PEERSEC, ...) gets called? This would let systemd know that SO_PEERSEC is not (yet) implemented.

Fritz Katze (fritz-the-cat) wrote :

I filed the duplicate #1840252 of this bug.

I think that the options SO_PEERCRED and SO_PEERSEC belong to the context of SELinux. So maybe the format of the payload can be found in the sources of libselinux?

I'd like to compile qemu with a local hack to work around my current problem. Something like Matthias Lüscher suggested.

@Peter Maydell: could you point me to the location in the qemu source where I could apply such a hack?

Fritz Katze (fritz-the-cat) wrote :

I patched linux-user/syscall.c (see below, branch stable-2.11) which works around my problem.
So far so good, but the qemu-arm that I compiled is terribly slow compared to the one that came with Ubuntu 18.04. Any hints?
I configured it like this:
./configure --static --enable-kvm --target-list=arm-linux-user

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 74d56e2ee6..4fa9a09b12 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3185,6 +3185,8 @@ static abi_long do_getsockopt(int sockfd, int level, int optname,
         case TARGET_SO_SNDTIMEO:
         case TARGET_SO_PEERNAME:
             goto unimplemented;
+        case TARGET_SO_PEERSEC: /* added to escape infinite loop */
+            goto unimplemented;
         case TARGET_SO_PEERCRED: {
             struct ucred cr;
             socklen_t crlen;

Tobias Koch (tobijk) wrote :

I'm a bit surprised that this bug doesn't get more attention, as it makes it very hard to run qemu-emulated containers of Bionic hosted on Bionic. Running such containers is a common way to cross-compile packages for foreign architectures in the absence of sufficiently powerful target HW.

The documentation on SO_PEERSEC is indeed sparse, but I do want to second Fritz's approach. I don't see why mishandling the payload and throwing it back at the application is better than handling the option and reporting it as not implemented (which is the case).

Arguably, applications should be fixed to handle the error correctly, but I'm afraid that is a can of worms. I have encountered the same problem with systemd, apt and getent. Would the maintainers be open to an SRU request on QEMU for this?

Tobias Koch (tobijk) on 2020-01-29
Changed in qemu:
status: New → Confirmed
Changed in qemu:
assignee: nobody → Laurent Vivier (laurent-vivier)
Laurent Vivier (laurent-vivier) wrote :

Could you test the attached patch?

Tobias Koch (tobijk) wrote :

Thanks, Laurent! I'll get back to you, asap.

Matthias Lüscher (m-luescher) wrote :

> Could you test the attached patch?
>

Works great!

This is my test setup:

Host machine: Ubuntu 18.04, amd64
LXD container: Debian Buster, arm64, systemd 241
QEMU: qemu-aarch64(-static), compiled from source (4.2.0), patched with your patch.

Many thanks!
Matthias

Tobias Koch (tobijk) wrote :

I carried out the following test:

* fetched the QEMU coming with 18.04,
* added this patch,
* built an LXD container with arch arm64 and the patched qemu-aarch64-static inside,
* launched it on amd64

Previously various systemd services would not come up properly, now they are running like a charm. The only grief I have is that network configuration does not work, but that is due to

    # ip addr
    Unsupported setsockopt level=270 optname=11

which is a different story.

zebul666 (zebul666) wrote :

Well, it's kind of irrelevant, but I am trying this on Arch Linux and it does not work for me.

I am using systemd-244.2-1 and qemu-user-static-4.2, which I built with Laurent's patch. Maybe I have done something wrong?

I still get that error that leads me here:

Failed to enqueue loopback interface start request: Operation not supported
Caught <SEGV>, dumped core as pid 3.
Exiting PID 1...

I am trying to use systemd-nspawn to boot an archlinux-arm image built for an rpi0. It works fine if I don't boot it.

Charlie Sharpsteen (sharpie) wrote :

Laurent's patch worked for me as well.

I grabbed the source for the Debian 10 qemu-user-static package, qemu_3.1+dfsg-8+deb10u3, applied the patch and re-built the qemu-arm-static binary. Copying the new binary into a Docker image based on arm32v7/debian:10-slim allowed /sbin/init to bring up the container with a responsive systemctl command.

Prior to the patch, systemd did not start any services inside the container and systemctl would hang when executed directly.

Thanks!
-Charlie

This seems to be the error reported in https://bugs.launchpad.net/qemu/+bug/1857811

Changed in qemu:
status: Confirmed → Fix Committed
Changed in qemu:
status: Fix Committed → Fix Released