allow repeating hot-unplug requests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
qemu (Ubuntu) | Fix Released | Undecided | Sergio Durigan Junior |
Jammy | Fix Released | Undecided | Sergio Durigan Junior |
Kinetic | Fix Released | Undecided | Sergio Durigan Junior |
Lunar | Fix Released | Undecided | Sergio Durigan Junior |
Mantic | Fix Released | Undecided | Sergio Durigan Junior |
Bug Description
[ Impact ]
* In the past, if a device unplug request did not work, it could
simply be tried again. Changes in the q35 backend for hotplug
mean that only one unplug request is now queued. If that request
arrives very early, the guest clears GPEx.status and thereby
never sees the event.
* The fix makes ACPI PCI hotplug behave the same as PCIe, which
means unplug requests can be requeued, subject to a rate limit.
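The "requeue, but rate-limited" behaviour can be pictured with a small shell sketch. This is not QEMU's implementation: the function name `unplug_request_allowed` and the 500 ms interval are assumptions chosen purely for illustration. The idea is that a repeated unplug request is accepted again only once a minimum interval has passed since the previous one.

```shell
# Illustrative model only; MIN_INTERVAL_MS is a hypothetical value,
# not taken from the QEMU patch.
MIN_INTERVAL_MS=500

# unplug_request_allowed LAST_MS NOW_MS
# Succeeds (exit 0) if a new unplug request may be queued.
# LAST_MS is the time of the previous request, or empty if there was none.
unplug_request_allowed() {
    last_ms=$1
    now_ms=$2
    # No earlier request: always allowed.
    if [ -z "$last_ms" ]; then
        return 0
    fi
    # Earlier request: allowed only after the minimum interval has elapsed.
    if [ $((now_ms - last_ms)) -ge "$MIN_INTERVAL_MS" ]; then
        return 0
    fi
    return 1
}
```

In this model a request 200 ms after the previous one is ignored, while one 500 ms later is accepted and delivered to the guest again.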
[ Test Plan ]
First, let's prepare an LXD VM to serve as our testbed. In this Test Plan we'll be using a Jammy VM.
physical-machine$ lxc launch ubuntu:jammy qemu-bug2018733-jammy --vm -c limits.memory=8GB
physical-machine$ lxc shell qemu-bug2018733-jammy
host-vm# apt update
host-vm# apt install -y libvirt-
host-vm# usermod -a -G libvirt,kvm ubuntu
host-vm# su - ubuntu
In order to reproduce the issue, we will need to quickly attach and detach a disk into/from a VM. To do that, let's use an Ubuntu Cloud image and adjust its kernel's "boot_delay" parameter to give us time to perform the necessary operations.
host-vm$ wget https:/
host-vm$ qemu-img create disk.img 1G
host-vm$ sudo chown libvirt-qemu:kvm lunar-server-
host-vm$ sudo chmod +x /home/ubuntu
host-vm$ sudo virt-customize -a lunar-server-
host-vm$ cat > test-vm.xml << __EOF__
<domain type='kvm' id='3'>
<name>
<memory unit='GiB'
<currentMemory unit='GiB'
<vcpu placement=
<resource>
<partition>
</resource>
<os>
<type arch='x86_64' machine=
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
<vmcoreinfo state='on'/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback=
<topology sockets='1' cores='1' threads='1'/>
<feature policy='require' name='vme'/>
<feature policy='require' name='x2apic'/>
<feature policy='require' name='hypervisor'/>
</cpu>
<clock offset='utc'>
<timer name='pit' tickpolicy=
<timer name='rtc' tickpolicy=
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>
<on_reboot>
<on_crash>
<devices>
<emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/
<target dev='vda' bus='virtio'/>
<alias name='virtio-
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</disk>
<controller type='usb' index='0' model='none'>
<alias name='usb'/>
</controller>
<controller type='pci' index='0' model='pci-root'>
<alias name='pci.0'/>
</controller>
<controller type='ide' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<input type='mouse' bus='ps2'>
<alias name='input0'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input1'/>
</input>
<serial type='pty'>
<source path='/dev/pts/0'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/0'>
<source path='/dev/pts/0'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<memballoon model='virtio'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memballoon>
<rng model='virtio'>
<backend model='
<alias name='rng0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</rng>
</devices>
</domain>
__EOF__
host-vm$ virsh define test-vm.xml
host-vm$ virsh start test-vm
host-vm$ virsh console test-vm
Wait for the VM to boot, log into it (user is "root", password is "1234"), and execute:
nested-vm# sed -i 's/^GRUB_
nested-vm# update-grub
nested-vm# reboot
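Once the nested VM is back up, it may help to confirm that the kernel actually picked up the parameter. A minimal sketch, assuming the parameter appears on the kernel command line as `boot_delay=<number>`; the helper name `get_boot_delay` is hypothetical, and it takes the command-line file as an optional argument (defaulting to `/proc/cmdline`) so it can be tried against a sample file.

```shell
# get_boot_delay [FILE]
# Prints the boot_delay value found in FILE (default: /proc/cmdline).
# Returns non-zero if the parameter is absent.
get_boot_delay() {
    cmdline_file=${1:-/proc/cmdline}
    # Put each kernel parameter on its own line, then extract boot_delay.
    value=$(tr ' ' '\n' < "$cmdline_file" |
        sed -n 's/^boot_delay=\([0-9][0-9]*\)$/\1/p' |
        tail -n 1)
    if [ -z "$value" ]; then
        return 1
    fi
    printf '%s\n' "$value"
}
```

Running `get_boot_delay` with no argument inside the nested VM prints the active value, or fails if the GRUB change did not take effect.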
Keep this terminal open, quickly open another terminal, and log into the host VM (running on LXD, named qemu-bug2018733-jammy in this Test Plan):
physical-machine$ lxc shell qemu-bug2018733-jammy
host-vm# su - ubuntu
Closely monitor the reboot process of the nested VM on the first terminal. When the VM starts booting again, switch to the second terminal (inside the host VM) and issue:
host-vm$ virsh attach-disk test-vm /home/ubuntu/
You will notice that the detach operation apparently succeeded, but you can confirm that it did not by doing:
host-vm$ virsh domblklist test-vm
You will notice that the new disk (named "vdx") is still attached to the VM. If you try to detach it again, you will get an error:
host-vm$ virsh detach-disk test-vm --live vdx
error: Failed to detach disk
error: internal error: unable to execute QEMU command 'device_del': Device virtio-disk23 is already in the process of unplug
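With a fixed QEMU, a repeated detach request is expected to be accepted rather than rejected. A minimal sketch of a retry loop follows, assuming the usual `virsh domblklist` table layout (two header lines, target name in the first column). The helper names `disk_attached` and `detach_with_retry`, the attempt count, and the delay are all hypothetical.

```shell
# disk_attached DOMAIN TARGET
# Succeeds if TARGET (e.g. vdx) is listed by "virsh domblklist DOMAIN".
disk_attached() {
    # Skip the two header lines, then compare the first column.
    virsh domblklist "$1" |
        awk -v t="$2" 'NR > 2 && $1 == t { found = 1 } END { exit !found }'
}

# detach_with_retry DOMAIN TARGET [ATTEMPTS] [DELAY_SECONDS]
# Re-issues the detach request until the disk is gone or attempts run out.
detach_with_retry() {
    domain=$1
    target=$2
    attempts=${3:-5}
    delay=${4:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Stop as soon as the disk no longer shows up.
        disk_attached "$domain" "$target" || return 0
        # A request may be rejected while an earlier one is pending; keep going.
        virsh detach-disk "$domain" "$target" --live || true
        sleep "$delay"
        i=$((i + 1))
    done
    # Final check after the last attempt.
    if disk_attached "$domain" "$target"; then
        return 1
    fi
    return 0
}
```

For the scenario above this would be called as `detach_with_retry test-vm vdx`. On an unfixed QEMU the loop is expected to run out of attempts and return non-zero.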
[ Previous Test Plan ]
1. Modify the Ubuntu cloud guest image to add boot_delay=100 to the kernel args, to simulate a slow host
2. Start the Ubuntu domain and connect to the serial console to see it boot
3. Wait until the first messages appear in the console. This is around T+50sec from the virsh start. But note that the guest boot is slowed down with the boot_delay=100 kernel arg.
4. From a second terminal attach an additional disk to the guest. It succeeds.
5. Wait a second
6. Detach the additional disk from the guest. The virsh command hangs for a couple of seconds, but then succeeds.
7. Check the domain XML, the disk is still attached
8. Check the lsblk command from the guest (after it is fully booted). The disk is still attached.
9. Check the virsh domblklist output. The disk is still attached.
10. Try to detach the disk again. It fails with "error: device not found: no target device"
A walkthrough of these steps, with commands and example output, can be seen at:
https:/
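Steps 4 to 7 above can be strung together in a short function. This is a sketch only: the function name `reproduce_unplug_race` and its argument order are hypothetical, and it assumes the live domain XML lists the disk as `<target dev='...'`.

```shell
# reproduce_unplug_race DOMAIN IMAGE TARGET
# Attaches IMAGE as TARGET, waits a second, requests the detach, then reports
# whether the domain XML still lists the disk.
reproduce_unplug_race() {
    domain=$1
    image=$2
    target=$3
    # Step 4: attach the additional disk.
    virsh attach-disk "$domain" "$image" "$target" --live || return 2
    # Step 5: wait a second.
    sleep 1
    # Step 6: request the detach; a rejection is reported but not fatal here.
    virsh detach-disk "$domain" "$target" --live ||
        echo "detach request was rejected" >&2
    # Step 7: look for the target device in the live domain XML.
    if virsh dumpxml "$domain" | grep -q "<target dev='$target'"; then
        echo "still attached: $target"
        return 1
    fi
    echo "detached: $target"
}
```

Call it from the second terminal while the guest is still printing its first boot messages, passing the domain name, the absolute path of the extra disk image, and the target name. A "still attached" result corresponds to the buggy behaviour described in steps 7 to 9.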
[ Where problems could occur ]
* Depending on how far we backport this (at least to v6.2 in Jammy),
we need to double-check that the callbacks and settings used
behaved the same way back then. While this can simply be tested,
related changes should also be reviewed to be sure.
* The change, and thereby any regression, is limited to ACPI PCI
hotplug. For software as complex as QEMU, it is always good
to be able to point clearly to a small subset of use cases
to watch.
[ Other Info ]
* n/a
-----------
This was kindly reported by ~kashyapc
I am only converting this into a bug for tracking.
---
Report:
This [1] QEMU patch solves a genuine bug [2] involving disk hot-unplug.
More details are in the commit message, and also in the bug that is
linked here [2].
I have also flagged the fix for QEMU 8.0 stable[3], and tested that the
fix itself works[4].
Please pick up the fix[1] once it merges.
[1] https:/
— acpi: pcihp: allow repeating hot-unplug requests
[2] https:/
[3] https:/
[4] https:/
--- ^^ report
--- vv extra context
Note:
- [2] + [4] have test steps we can use for SRU verification.
- This has landed upstream by now
https:/
- Also landed in 8.0 stable staging as
https:/
Related branches
- git-ubuntu bot: Approve
- Lena Voytek (community): Approve
- Canonical Server Core Reviewers: Pending requested
- Canonical Server Reporter: Pending requested
Diff: 105 lines (+83/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/allow-repeating-hot-unplug-requests.patch (+74/-0)
- git-ubuntu bot: Approve
- Lena Voytek (community): Approve
- Canonical Server Core Reviewers: Pending requested
- Canonical Server Reporter: Pending requested
Diff: 105 lines (+83/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/allow-repeating-hot-unplug-requests.patch (+74/-0)
- git-ubuntu bot: Approve
- Lena Voytek (community): Approve
- Canonical Server Reporter: Pending requested
Diff: 105 lines (+83/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/allow-repeating-hot-unplug-requests.patch (+74/-0)
- git-ubuntu bot: Approve
- Lena Voytek (community): Approve
- Christian Ehrhardt : Pending requested
- Canonical Server Reporter: Pending requested
Diff: 105 lines (+83/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/allow-repeating-hot-unplug-requests.patch (+74/-0)
tags: added: server-todo
description: updated
description: updated
Changed in qemu (Ubuntu Mantic):
assignee: nobody → Sergio Durigan Junior (sergiodj)
Changed in qemu (Ubuntu Lunar):
assignee: nobody → Sergio Durigan Junior (sergiodj)
Changed in qemu (Ubuntu Jammy):
assignee: nobody → Sergio Durigan Junior (sergiodj)
Changed in qemu (Ubuntu Kinetic):
assignee: nobody → Sergio Durigan Junior (sergiodj)
description: updated
description: updated
Changed in qemu (Ubuntu Jammy):
status: Confirmed → In Progress
Changed in qemu (Ubuntu Kinetic):
status: Confirmed → In Progress
Changed in qemu (Ubuntu Lunar):
status: Confirmed → In Progress
(Thanks for filing this, Christian!)
My expanded version of a successful test result with the patch is here:
https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg01070.html