QEMU coroutines fail with LTO on non-x86_64 architectures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
qemu (Fedora) | Confirmed | Medium | |
qemu (Ubuntu) | Fix Released | Medium | Paride Legovini |
Jammy | Fix Released | Undecided | Michał Małoszewski |
Bug Description
[Impact]
- QEMU on Jammy (22.04) is affected.
- Emulation of riscv64 on arm64 fails.
- Emulation of arm64/armhf on arm64, ppc64el, and s390x fails.
- Installing the Ubuntu arm64 ISO image in a VM fails.
[Fix]
- debian/rules has no entry that disables LTO when the Debian architecture of the host machine is not amd64. LTO should be disabled in that case to prevent QEMU coroutines from failing, and the setting should be exported so that all child processes created from that shell inherit it.
- Adding DEB_BUILD_
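The conditional described above can be sketched in shell. This is only an illustration, not the actual debian/rules change (debian/rules is a Makefile, and the description truncates the real variable name). The function name `lto_maint_option` is hypothetical. `DEB_BUILD_MAINT_OPTIONS` with the `optimize=-lto` feature string is the standard dpkg-buildflags way to turn LTO off, and is assumed here to be the relevant setting.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the actual fix): print the
# dpkg-buildflags feature string that disables LTO on every Debian
# architecture except amd64; print nothing on amd64.
lto_maint_option() {
    arch="$1"
    if [ "$arch" != "amd64" ]; then
        # Assumption: "optimize=-lto" is the setting the fix relies on.
        printf '%s\n' 'optimize=-lto'
    fi
}

# Usage sketch. The architecture is hard-coded for illustration; in a
# real build it would typically come from `dpkg-architecture -qDEB_HOST_ARCH`.
opt="$(lto_maint_option arm64)"
if [ -n "$opt" ]; then
    # Export so that child processes of this shell inherit the setting.
    DEB_BUILD_MAINT_OPTIONS="$opt"
    export DEB_BUILD_MAINT_OPTIONS
fi
```

Keeping the decision in a small function that takes the architecture as an argument makes it easy to check both branches without access to a non-amd64 machine.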
[Test Plan]
** Reproduction **
- Detailed test steps on AWS in comment #87.
arm64 on arm64, ubuntu cloud image
wget https:/
sudo apt install --yes --no-install-
cp /usr/share/
cp /usr/share/
qemu-system-
-machine virt -nographic \
-smp 4 -m 4G \
-cpu cortex-a57 \
-pflash flash0.img -pflash flash1.img \
-drive file=jammy-
-device virtio-
...
BdsDxe: failed to load Boot0001 "UEFI Misc Device" from VenHw(93E34C7E-
qemu-system-
Segmentation fault (core dumped)
As mentioned, this only happens on specific systems. Paride has a reliable reproducer (see comment 39) and will perform the SRU validation for arm64.
https:/
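When scripting the reproduction, the two crash types seen in this report ("Segmentation fault (core dumped)" and "Aborted (core dumped)") can be told apart by the exit status the shell reports. The sketch below assumes the usual Linux convention that a process killed by signal N yields status 128+N (SIGABRT is 6, SIGSEGV is 11). The function name `classify_exit_status` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map a shell exit status to the crash types
# that appear in this bug report.
classify_exit_status() {
    case "$1" in
        0)   echo "ok" ;;
        134) echo "abort" ;;     # 128 + SIGABRT (6), e.g. a failed assertion
        139) echo "segfault" ;;  # 128 + SIGSEGV (11)
        *)   echo "other" ;;
    esac
}

# Usage sketch: run the QEMU command from the test plan, then
#   classify_exit_status "$?"
```

A status of `ok` after a completed boot would indicate that the run did not hit the crash; it does not by itself prove the fix works, since the failure is intermittent on some systems.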
[Where problems could occur]
Any code change might alter the behavior of the package in a specific situation and cause other errors.
A possible, but rather unlikely, source of regressions is that QEMU will be rebuilt against newer versions of its build dependencies on Jammy. There might also be other warnings in the code, fixed in later versions, that we have not identified. It is unlikely, but there might be an architecture where the test plan fails.
Later updates could also break the functionality of the introduced fix.
That said, the fix is simple, and problems can be detected easily.
[Other Info]
This change in itself does not change the QEMU source code (i.e., no functional change), but it does change the object code, since the compiler build options are now different (in addition to the versions of the build dependencies).
This change has been in Kinetic since September, 2022 (~6 months).
[Original Bug Description]
Note: this could as well be "riscv64 on arm64" for being slow@slow and affect
other architectures as well.
The following case triggers on a Raspberry Pi 4 running arm64
Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
but that is where we have seen it so far.
$ wget https:/
$ tar xzf UbuntuFocal-
$ ./run_riscvVM.sh
(wait ~2 minutes)
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Reached target Local File Systems.
qemu-system-
This is often, but not 100%, reproducible, and the cases differ slightly. We
see either of:
- qemu-system-
- qemu-system-
Rebuilding working cases has been shown to make them fail, and rebuilding
(or even reinstalling) bad cases has made them work. Also, the same builds on
different arm64 CPUs behave differently. TL;DR: the full list of conditions
that separate good from bad cases is not yet known.
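Because the failure is intermittent, a single run says little. A minimal sketch of a repeat-and-count loop follows. The function name `count_failures` is hypothetical, and it assumes the reproducer command exits on its own with a non-zero status when QEMU crashes.

```shell
#!/bin/sh
# Hypothetical helper: run a command a given number of times and print
# how many of those runs exited with a non-zero status.
count_failures() {
    runs="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Output is discarded; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage sketch (assumes the script terminates without manual input):
#   count_failures 10 ./run_riscvVM.sh
```

Comparing the failure count across builds or hosts gives a rough measure of how often each configuration hits the problem.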
[1]: https:/
[2]: http://
--- --- original report --- ---
I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days ago broke it. Now when I launch it, it hits an assertion:
OpenSBI v0.6
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|____
| |
|_|
...
Found /boot/extlinux/
Retrieving file: /boot/extlinux/
618 bytes read in 2 ms (301.8 KiB/s)
RISC-V Qemu Boot Options
1: Linux kernel-5.5.0-dirty
2: Linux kernel-5.5.0-dirty (recovery mode)
Enter choice: 1: Linux kernel-5.5.0-dirty
Retrieving file: /boot/initrd.
qemu-system-
./run.sh: line 31: 1604 Aborted (core dumped) qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin -device virtio-blk-devi
ce,drive=hd0 -object rng-random,
ce virtio-
Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04 (fully updated).
Think you have everything already, but just in case:
$ lsb_release -rd
Description: Ubuntu Hirsute Hippo (development branch)
Release: 21.04
$ uname -a
Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
(note this is a VM running on macOS/M1)
$ apt-cache policy qemu
qemu:
Installed: 1:5.2+dfsg-9ubuntu1
Candidate: 1:5.2+dfsg-9ubuntu1
Version table:
*** 1:5.2+dfsg-9ubuntu1 500
500 http://
100 /var/lib/
ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: qemu 1:5.2+dfsg-9ubuntu1
ProcVersionSign
Uname: Linux 5.11.0-11-generic aarch64
ApportVersion: 2.20.11-0ubuntu61
Architecture: arm64
CasperMD5CheckR
CurrentDmesg:
Error: command ['pkexec', 'dmesg'] failed with exit code 127: polkit-
Error executing command as another user: Not authorized
This incident has been reported.
Date: Mon Mar 29 02:33:25 2021
Dependencies:
KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
Lspci-vt:
-[0000:00]-+-00.0 Apple Inc. Device f020
+-01.0 Red Hat, Inc. Virtio network device
+-05.0 Red Hat, Inc. Virtio console
+-06.0 Red Hat, Inc. Virtio block device
\-07.0 Red Hat, Inc. Virtio RNG
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_
LANG=C.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: console=hvc0 root=/dev/vda
SourcePackage: qemu
UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
acpidump:
Error: command ['pkexec', '/usr/share/
Error executing command as another user: Not authorized
This incident has been reported.
Related branches
- Paride Legovini (community): Approve
- git-ubuntu import: Pending requested
- Diff: 33 lines (+14/-0), 2 files modified:
  debian/changelog (+7/-0)
  debian/rules (+7/-0)
- git-ubuntu bot: Approve
- Paride Legovini (community): Approve
- Canonical Server Reporter: Pending requested
- Diff: 158 lines (+111/-1), 5 files modified:
  debian/changelog (+16/-0)
  debian/patches/series (+2/-0)
  debian/patches/ubuntu/lp1988710-opensbi-Makefile-fix-build-with-binutils-2.38.patch (+62/-0)
  debian/patches/ubuntu/lp1988710-silence-openbios-array-bounds-false-positive.patch (+23/-0)
  debian/rules (+8/-1)
Changed in qemu (Ubuntu):
status: Incomplete → New
Changed in qemu (Ubuntu):
importance: Low → Medium
tags: added: lto server-todo
Changed in qemu (Ubuntu):
status: Triaged → Fix Released
Changed in qemu (Fedora):
importance: Unknown → Medium
status: Unknown → In Progress
Changed in qemu (Fedora):
status: In Progress → Confirmed
summary:
- Coroutines are racy for risc64 emu on arm64 - crash on Assertion
+ QEMU coroutines fail with LTO on non-x86_64 architectures
description: updated
description: updated
description: updated
tags: added: verification-needed
tags: added: verification-needed-jammy
tags: removed: verification-needed verification-needed-jammy
description: updated
FWIW, I just now built qemu-system-riscv64 from git ToT and that works fine.