[UBUNTU 21.04] s390x/s390-virtio-ccw: Reset PCI devices during subsystem reset
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Skipper Bug Screeners |
qemu (Ubuntu) | Fix Released | Undecided | Canonical Server |
Focal | Fix Released | Undecided | Unassigned |
Groovy | Fix Released | Undecided | Unassigned |
Hirsute | Fix Released | Undecided | Canonical Server |
Bug Description
[Impact]
Symptom: PCI devices are unavailable after a subsystem reset
Problem: When a subsystem reset event occurs (e.g. via kexec) PCI
[Test Case]
# Prep a guest and wait until it has booted
$ apt install uvtool-libvirt
$ uvt-simplestrea
$ uvt-kvm create --disk 5 --password=ubuntu testguest release=focal arch=s390x label=daily
$ virsh console testguest
# lspci in guest shows nothing yet (expected)
# Add virtio device
$ cat > virtio-pci.xml << EOF
<interface type='network'>
<source network='default'/>
<model type='virtio'/>
<address type='pci'/>
<rom bar='off' file=''/>
</interface>
EOF
$ virsh attach-device testguest virtio-pci.xml
# lspci in guest now shows the device
ubuntu@testguest:~$ lspci
0001:00:00.0 Ethernet controller: Red Hat, Inc. Virtio network device
# Verify that a "normal" reboot does not lose the device
ubuntu@testguest:~$ sudo reboot
...
ubuntu@testguest:~$ lspci
0001:00:00.0 Ethernet controller: Red Hat, Inc. Virtio network device
# Kexec into a kernel (can be the same)
ubuntu@testguest:~$ sudo apt install kexec-tools
ubuntu@testguest:~$ sudo kexec --load /boot/vmlinuz --initrd=
ubuntu@testguest:~$ sudo kexec --exec
# Log in and recheck lspci - without the fix it will be empty (wrong)
# With the fix it will show the PCI device again
ubuntu@testguest:~$ lspci
# Note: A reboot will get the device back (in both the old and new case)
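The before/after comparison in the test case can be scripted inside the guest. The following is only a minimal sketch: the helper names `count_pci_devices` and `check_after_kexec` are made up for illustration, and the pattern assumes the `domain:bus:device.function` address format shown in the lspci output above.

```shell
#!/bin/sh
# Hypothetical helper: count PCI functions in lspci-style output read from stdin.
# Assumes addresses of the form 0001:00:00.0 as in the output shown above.
count_pci_devices() {
    # grep -c prints 0 and exits non-zero when nothing matches; keep the count anyway.
    grep -c '^[0-9a-fA-F]\{4\}:[0-9a-fA-F]\{2\}:[0-9a-fA-F]\{2\}\.[0-7] ' || true
}

# Hypothetical helper: compare the count recorded before kexec with the current one.
# $1 = count before kexec, $2 = count after kexec
check_after_kexec() {
    before=$1
    after=$2
    if [ "$after" -lt "$before" ]; then
        echo "FAIL: $before PCI device(s) before kexec, $after after"
        return 1
    fi
    echo "OK: $after PCI device(s) still present"
}
```

One possible way to use it: before `kexec --exec`, run `lspci | count_pci_devices > ~/pci-before; sync` (the sync matters because `kexec --exec` does not go through a regular shutdown). After logging in again, run `check_after_kexec "$(cat ~/pci-before)" "$(lspci | count_pci_devices)"`. The file name `~/pci-before` is an arbitrary choice.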
[Where problems could occur]
* The patch is fortunately small - it affects the list of devices that will
be reset. By extending this list, obviously more devices will be
reset - therefore the activity of a "subsystem_reset" will cover more
devices.
Regressions (let us hope not) would happen there. For example, think
of a buggy PCI device that no one cared about before. Formerly it
would not have been reset, but now it is. If that reset fails badly, you
have a regression.
Fortunately PCI devices are still uncommon on s390x, so even if (I
doubt) there is a regression, it would affect only a small fraction of
users.
These kinds of resets happen on load (kexec, reboot, start) and that is
the place to look out for regressions.
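To watch for regressions across the reset paths named above (kexec, reboot, start), the same lspci check can be repeated after each one. This is a rough sketch only: `report_reset_path` is a hypothetical helper, and how the guest's lspci output is fetched (for example over ssh) is left to the caller and is an assumption about the test setup.

```shell
#!/bin/sh
# Hypothetical helper: report how many PCI devices the guest lists after a reset.
# $1 = label for the reset path (e.g. kexec, reboot, start)
# remaining arguments = command that prints the guest's lspci output
report_reset_path() {
    label=$1
    shift
    # Count non-empty output lines; grep -c exits non-zero on zero matches.
    count=$("$@" | grep -c . || true)
    if [ "$count" -eq 0 ]; then
        echo "$label: no PCI devices visible (possible regression)"
        return 1
    fi
    echo "$label: $count PCI device(s) visible"
}
```

A call might look like `report_reset_path kexec ssh ubuntu@testguest lspci`, run once the guest is reachable again after each kind of reset. The ssh target is a placeholder, not something taken from this report.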
[Other Info]
* n/a
---
Description: s390x/s390-
Symptom: PCI devices are unavailable after a subsystem reset
Problem: When a subsystem reset event occurs (e.g. via kexec) PCI
Solution: Add the s390 PCI host bridge to the list of devices to be
Reproduction: kexec on an s390x guest with PCI devices
db08244a3a7e s390x/s390-
This fix needs to be applied to qemu for focal (20.04) and groovy (20.10).
Related branches
- Sergio Durigan Junior (community): Approve
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
Diff: 73 lines (+51/-0), 3 files modified
debian/changelog (+7/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/lp-1907656-s390x-s390-virtio-ccw-Reset-PCI-devices-during-subsy.patch (+43/-0)
- Sergio Durigan Junior (community): Needs Fixing
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
Diff: 115 lines (+57/-15), 4 files modified
debian/changelog (+8/-0)
debian/patches/series (+1/-0)
debian/patches/ubuntu/lp-1907656-s390x-s390-virtio-ccw-Reset-PCI-devices-during-subsy.patch (+43/-0)
debian/rules (+5/-15)
tags: | added: architecture-s39064 bugnameltc-190224 severity-high targetmilestone-inin--- |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → qemu (Ubuntu) |
Changed in qemu (Ubuntu): | |
assignee: | Skipper Bug Screeners (skipper-screen-team) → Canonical Server Team (canonical-server) |
Changed in ubuntu-z-systems: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
importance: | Undecided → High |
status: | New → Triaged |
tags: | added: qemu-21.04 |
Changed in ubuntu-z-systems: | |
status: | Triaged → In Progress |
Changed in qemu (Ubuntu Groovy): | |
status: | Triaged → In Progress |
Changed in qemu (Ubuntu Focal): | |
status: | Triaged → In Progress |
Changed in ubuntu-z-systems: | |
status: | In Progress → Fix Committed |
Changed in ubuntu-z-systems: | |
status: | Fix Committed → Fix Released |
This is in qemu 5.2 which I'm already working on for hirsute.
So -devel should be fixed soon (although testing on 5.2 will consume a few days).
Three questions for the following SRU as this was flagged for Focal and Groovy as well.
1. How urgent/severe is it, do we need to move heaven and earth to get this completed before the Christmas downtime or can this be SRU released in January?
2. I see you said for repro "kexec on an s390x guest with PCI devices". But I'm sure you already have a script and/or guest XMLs and whatever else is related. Anything I don't have to come up with from scratch will make handling this faster.
3. Do I need any special HW and/or configuration to achieve "s390x guest with PCI devices", like real PCI? Or is it enough to try to force e.g. virtio-net-pci in? Again, sample XMLs and commands will help.