[Ubuntu 21.04] IBM Z go binary crashes under Qemu

Bug #1922010 reported by bugproxy on 2021-03-31
This bug affects 1 person
Affects                   Importance   Assigned to
Ubuntu on IBM z Systems   High         Skipper Bug Screeners
qemu (Ubuntu)             Undecided    Canonical Server Team
  Focal                   Undecided    Unassigned
  Groovy                  Undecided    Unassigned
  Hirsute                 Undecided    Canonical Server Team

Bug Description

Running the IBM Z go binary under qemu segfaults.

---uname output---
Linux 97f388a80d88 5.8.0-45-generic #51-Ubuntu SMP Fri Feb 19 13:24:51 UTC 2021 s390x s390x s390x GNU/Linux

---Steps to Reproduce---
 qemu-s390x-static -E LD_LIBRARY_PATH=./ ./go

Userspace tool common name: qemu

Userspace rpm: qemu-s390x version 5.0.0 (Debian 1:5.0-5ubuntu9.6)

The userspace tool has the following bit modes: 64

Userspace tool obtained from project website: na

A patch by IBM has already been accepted upstream. It has also been tested on the Qemu stable branch (v5.0) and found to fix the problem.

https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg09327.html

focal (20.04LTS) 1:4.2-3ubuntu6.14 [security]: amd64
                 1:4.2-3ubuntu6 [ports]
focal-updates 1:4.2-3ubuntu6.14

groovy (20.10) 1:5.0-5ubuntu9.6 [security]: amd64
               1:5.0-5ubuntu9 [ports]
groovy-updates 1:5.0-5ubuntu9.6

hirsute (metapackages) 1:5.2+dfsg-9ubuntu1

The patch also applies to Qemu 4.2 as shipped in 20.04. However, the problem could not be reproduced with a Qemu 4.2 build.

bugproxy (bugproxy) on 2021-03-31
tags: added: architecture-s39064 bugnameltc-192246 severity-critical targetmilestone-inin2104
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → qemu (Ubuntu)
Frank Heimes (fheimes) on 2021-03-31
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in qemu (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Canonical Server Team (canonical-server)
Changed in ubuntu-z-systems:
importance: Undecided → High
status: New → Triaged
Christian Ehrhardt  (paelzer) wrote :

Thanks for the Report!
That is:

commit 23fff7a17f47420797ac6480147941612152a9ad
Author: Andreas Krebbel <email address hidden>
Date: Wed Mar 24 19:51:28 2021 +0100

    linux-user/s390x: Use the guest pointer for the sigreturn stub

    When setting up the pointer for the sigreturn stub in the return
    address register (r14) we currently use the host frame address instead
    of the guest frame address.

    Note: This only caused problems if Qemu has been built with
    --disable-pie (as it is in distros nowadays). Otherwise guest_base
    defaults to 0 hiding the actual problem.

    Signed-off-by: Andreas Krebbel <email address hidden>
    Reviewed-by: Laurent Vivier <email address hidden>
    Reviewed-by: Richard Henderson <email address hidden>
    Message-Id: <email address hidden>
    Signed-off-by: Laurent Vivier <email address hidden>

$ git tag --contains 23fff7a17f47420797ac6480147941612152a9ad
v6.0.0-rc1
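
The address mix-up described in the commit message can be illustrated with a toy model. This is only a sketch and not QEMU code: it assumes that linux-user maps a guest address to a host address by adding a fixed guest_base offset, and the function names, frame address and stub offset below are made up for illustration.

```python
# Toy model of linux-user address translation (NOT QEMU source code).
# Assumption: host address = guest address + guest_base.


def g2h(guest_addr: int, guest_base: int) -> int:
    """Translate a guest address into the host address that backs it."""
    return guest_addr + guest_base


def h2g(host_addr: int, guest_base: int) -> int:
    """Translate a host address back into the guest's view."""
    return host_addr - guest_base


def sigreturn_stub_address(frame_guest_addr: int, guest_base: int,
                           stub_offset: int, use_host_pointer: bool) -> int:
    """Return the value that would end up in the guest's r14.

    use_host_pointer=True models the bug: the host view of the frame
    leaks into a guest register. False models the fixed behaviour.
    """
    if use_host_pointer:
        return g2h(frame_guest_addr, guest_base) + stub_offset
    return frame_guest_addr + stub_offset


FRAME_GUEST_ADDR = 0x4000800000  # hypothetical guest address of the signal frame
STUB_OFFSET = 0x120              # hypothetical offset of the stub inside the frame

for guest_base in (0x0, 0x10000):
    buggy = sigreturn_stub_address(FRAME_GUEST_ADDR, guest_base, STUB_OFFSET, True)
    fixed = sigreturn_stub_address(FRAME_GUEST_ADDR, guest_base, STUB_OFFSET, False)
    print(f"guest_base={guest_base:#x}: buggy r14={buggy:#x}, "
          f"fixed r14={fixed:#x}, match={buggy == fixed}")
```

In this model the two values only agree when guest_base is 0, which matches the note in the commit message that a guest_base of 0 hides the problem.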

There are currently a bunch of qemu SRUs in flight which have to "clear the queue" first.
If by then you find that it also affects Focal, we can fix that as well; let us know in that case.

Changed in qemu (Ubuntu Hirsute):
status: New → Triaged
Changed in qemu (Ubuntu Groovy):
status: New → Triaged
Changed in qemu (Ubuntu Focal):
status: New → Incomplete
Christian Ehrhardt  (paelzer) wrote :

Reproduced using go/qemu from the Archive in 21.04.
(I hope that is the error you meant)

$ apt install qemu-user-static
$ apt install golang-go
$ qemu-s390x-static /usr/bin/go
fatal error: cas5
runtime: panic before malloc heap initialized

runtime stack:
runtime.throw(0x61427a, 0x4)
 /usr/lib/go-1.16/src/runtime/panic.go:1117 +0x70 fp=0x4000800690 sp=0x4000800668 pc=0x4a5d0
runtime.check()
 /usr/lib/go-1.16/src/runtime/runtime1.go:232 +0x3ec fp=0x40008006e0 sp=0x4000800690 pc=0x5d25c
runtime.rt0_go(0x38, 0x5, 0xa, 0x6, 0x1000, 0x7, 0x4000801000, 0x8, 0x0, 0x9, ...)
 /usr/lib/go-1.16/src/runtime/asm_s390x.s:141 +0xb2 fp=0x40008007a0 sp=0x40008006e0 pc=0x7ef32

I have a PPA at https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4514/+packages
And with that I'll see if this behavior changed ...

Christian Ehrhardt  (paelzer) wrote :

Hmm, no, I get the same output with the new qemu version.
Running /usr/bin/go directly works fine.
I used qemu-s390x-static on s390x, so it didn't have to do a lot of emulation.

Is this only a problem for "real" emulation through qemu-user-static, e.g. when using qemu-s390x-static on x86?

If so, which environment/debootstrap/... did you use to get the s390x go that you execute?

Christian Ehrhardt  (paelzer) wrote :

Finally - what exactly would the crash look like in the bad case?

Christian Ehrhardt  (paelzer) wrote :

I tried to use it the same way I'd do arm@x86, but that failed.
Maybe you can share what environment you used before I start wasting time trying to reproduce this.
And OTOH, if this really only hits such a case, it might be too much of an edge case for an SRU anyway.
In that case I'd appreciate a fast response even more, to get it at least into Hirsute before it fully closes.

# not working WIP repro state
$ sudo su -
$ apt install qemu-user-static debootstrap schroot
$ sudo qemu-debootstrap --arch=s390x hirsute s390x-ubuntu
$ cat > /etc/schroot/chroot.d/s390x-ubuntu << EOF
[s390x-ubuntu]
type=directory
description=Ubuntu 21.04 Hirsute (s390x)
directory=$(pwd)/s390x-ubuntu
root-users=$(whoami)
root-groups=$(whoami)
users=$(whoami)
groups=$(whoami)
EOF
$ schroot -c s390x-ubuntu

Obviously, if you had any static executable that we could throw at qemu-s390x-static, that would simplify things a lot.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2021-04-01 06:47 EDT-------
I've debugged the problem using a go binary + libs from an IBM Z Fedora, running it on a 20.10 x86 system.

Here is the original bug report I got. It used the experimental cross-platform feature of Docker. That is easy to do but leaves you in a rather debug-hostile environment.

Clean Ubuntu20.10, x86_64 hardware.

# apt-get update
# apt-get install docker.io qemu-user-static -y
# qemu-s390x-static --version
qemu-s390x version 5.0.0 (Debian 1:5.0-5ubuntu9.6)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers

Then just mimic the image build steps, which are taken from https://github.com/containers/buildah/blob/master/contrib/buildahimage/upstream/Dockerfile#L20

Enable experimental feature for docker.
# vi /etc/docker/daemon.json
add
{
"experimental": true
}
# systemctl restart docker
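
The manual daemon.json edit above could also be scripted. The following is only a sketch: the helper name enable_experimental is made up for illustration, and it assumes the file is either missing, empty, or a valid JSON object. It keeps any keys that are already present instead of overwriting the file.

```python
import json
from pathlib import Path


def enable_experimental(config_path: str) -> dict:
    """Set "experimental": true in a Docker daemon.json, keeping other keys.

    Hypothetical helper for illustration; returns the resulting config.
    """
    path = Path(config_path)
    config = {}
    # Start from the existing settings if the file exists and is not empty.
    if path.exists() and path.read_text().strip():
        config = json.loads(path.read_text())
    config["experimental"] = True
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

It would be called as root with enable_experimental("/etc/docker/daemon.json"), followed by the same docker restart as in the steps above.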

Run the container:
# docker run -ti --platform linux/s390x registry.fedoraproject.org/fedora:latest bash

And inside the container:
# uname -a
Linux 97f388a80d88 5.8.0-45-generic #51-Ubuntu SMP Fri Feb 19 13:24:51 UTC 2021 s390x s390x s390x GNU/Linux

# export GOPATH=/root/buildah
# useradd build; yum -y update; yum -y reinstall shadow-utils; yum -y install --enablerepo=updates-testing \
make \
golang \
bats \
btrfs-progs-devel \
device-mapper-devel \
glib2-devel \
gpgme-devel \
libassuan-devel \
libseccomp-devel \
git \
bzip2 \
xz \
go-md2man \
runc \
fuse-overlayfs \
fuse3 \
containers-common; \
mkdir /root/buildah; \
git clone https://github.com/containers/buildah /root/buildah/src/github.com/containers/buildah; \
cd /root/buildah/src/github.com/containers/buildah
# make

The result is:

/bin/sh: line 1: 956 Segmentation fault (core dumped) go help mod > /dev/null 2>&1
go build -ldflags '-X main.GitCommit=915de2e2 -X main.buildInfo=1616142176 -X main.cniVersion=v0.8.1 ' -gcflags "" -o bin/buildah -tags "seccomp " ./cmd/buildah
make: *** [Makefile:66: bin/buildah] Illegal instruction (core dumped)

Christian Ehrhardt  (paelzer) wrote :

Thanks Andreas, not easy to use on a system not allowed to reach most of the outside (like registry.fedoraproject.org), but convincing and complete - thanks!

Christian Ehrhardt  (paelzer) wrote :

Since it is upstream and can be reproduced (by you more easily than by me), it makes sense.
I found no issues with the PPA when running other things, and I got a review on the MP.
So I uploaded this to Hirsute, hoping that it will still slip in before release.

But regarding an SRU, I wonder how real the case is if it is this complex to trigger.
It only affects groovy users (not happening on Focal as you said) - so the "can't upgrade from an LTS" argument does not apply here either.

So for now I'd think the SRUs are a "Won't Fix", but I can be convinced if you think it really is important. In that case, simpler or additional use cases would help make that argument and later convince the SRU Team as well.

Changed in qemu (Ubuntu Groovy):
status: Triaged → Won't Fix
Changed in qemu (Ubuntu Hirsute):
status: Triaged → In Progress
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:5.2+dfsg-9ubuntu2

---------------
qemu (1:5.2+dfsg-9ubuntu2) hirsute; urgency=medium

  * d/p/u/lp-1922010-linux-user-s390x-Use-the-guest-pointer-for-the-sigre.patch:
    fix go in qemu-s390x-static (LP: #1922010)

 -- Christian Ehrhardt <email address hidden> Wed, 31 Mar 2021 10:01:40 +0200

Changed in qemu (Ubuntu Hirsute):
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2021-06-14 10:39 EDT-------
Patch is not needed on focal I think.

Frank Heimes (fheimes) wrote :

Hi Andreas, yes, that is what we believe, too - thx for confirming.
I'll change the focal entry to Invalid and overall close this bug as Fix Released.

Changed in qemu (Ubuntu Focal):
status: Incomplete → Invalid
Changed in ubuntu-z-systems:
status: Triaged → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2021-06-15 02:44 EDT-------
IBM Bugzilla status->closed, Fix Released
