Activity log for bug #1893778

Date Who What changed Old value New value Message
2020-09-01 12:39:38 bugproxy bug added bug
2020-09-01 12:39:40 bugproxy tags architecture-s39064 bugnameltc-187974 severity-high targetmilestone-inin2004
2020-09-01 12:39:41 bugproxy ubuntu: assignee Skipper Bug Screeners (skipper-screen-team)
2020-09-01 12:39:44 bugproxy affects ubuntu linux (Ubuntu)
2020-09-01 13:07:57 Frank Heimes bug task added ubuntu-z-systems
2020-09-01 13:12:15 Frank Heimes nominated for series Ubuntu Focal
2020-09-01 13:12:15 Frank Heimes bug task added linux (Ubuntu Focal)
2020-09-01 13:12:15 Frank Heimes nominated for series Ubuntu Groovy
2020-09-01 13:12:15 Frank Heimes bug task added linux (Ubuntu Groovy)
2020-09-01 13:13:42 Frank Heimes linux (Ubuntu Groovy): status New Fix Released
2020-09-01 13:29:36 Frank Heimes linux (Ubuntu Focal): assignee Frank Heimes (fheimes)
2020-09-01 13:29:45 Frank Heimes ubuntu-z-systems: assignee Skipper Bug Screeners (skipper-screen-team)
2020-09-01 13:29:54 Frank Heimes ubuntu-z-systems: importance Undecided High
2020-09-03 10:01:04 Frank Heimes linux (Ubuntu Focal): status New In Progress
2020-09-03 10:01:09 Frank Heimes ubuntu-z-systems: status New In Progress
2020-09-03 10:02:15 Frank Heimes description
Old value:
When a PCI device (including virtio-pci, for which this is easiest to test) is hot-plugged while Linux is still booting, it can be detected both as an entry in CLP List PCI Functions (basically equivalent to boot time probing on other architectures) and with the hot-plug event. In this case the hot-plug event will be stale, but Linux still tried to add and enable the device, leading to
a) a duplicate entry in the zPCI internal device list
b) an attempt to enable the device with a stale function handle
Part b) would lead to the device being placed in the error state, making it unusable.

This can most easily be reproduced using KVM and doing

# sudo virsh start myguest && sudo virsh attach-device myguest hotplug_pci_block.xml

Where hotplug_pci_block.xml looks like the following:

<disk device="disk" type="file">
        <driver name="qemu" type="raw" />
        <address type="pci">
                <zpci fid="4660" uid="4660" />
        </address>
        <source file="testdisk.img" />
        <target bus="virtio" dev="vdt" />
</disk>

The problem is fixed with the 3-line upstream commit

b76fee1bc56c31a9d2a49592810eba30cc06d61a s390/pci: ignore stale configuration request event

I also confirmed that as of the focal tag Ubuntu-5.4.0-46.50 this cherry-picks cleanly.
New value:
SRU Justification:
==================

[Impact]

* If a PCI device (incl. virtio-pci) is hot-plugged during boot-up on s390x, it can be detected both as an entry in CLP List PCI Functions and via the hot-plug event.
* (This is basically equivalent to boot time probing on other architectures.)
* In such a case the hot-plug event will be stale, but Linux still tries to add and enable the device, which leads to:
* a) a duplicate entry in the zPCI internal device list
* b) an attempt to enable the device with a stale function handle
* In case b) the device will be placed in the error state, which makes it unusable.

[Fix]

* b76fee1bc56c31a9d2a49592810eba30cc06d61a b76fee1bc56c "s390/pci: ignore stale configuration request event"

[Test Case]

* Set up an Ubuntu Server 20.04 (focal) Linux operating system on an IBM Z or LinuxONE III LPAR.
* It's now easiest to test on KVM using virtio-pci (on s390x).
* Start a test virtual machine: sudo virsh start <test-guest>
* Attach and hotplug a virtio-pci device: sudo virsh attach-device <test-guest> hotplug_pci_block.xml
* Where hotplug_pci_block.xml looks like:
  <disk device="disk" type="file">
          <driver name="qemu" type="raw" />
          <address type="pci">
                  <zpci fid="4660" uid="4660" />
          </address>
          <source file="testdisk.img" />
          <target bus="virtio" dev="vdt" />
  </disk>

[Regression Potential]

* The regression risk is moderate, since the modification is very limited and therefore manageable (additional if statement - two lines of code) and easily testable on KVM using virtio-pci.
* The changes are in the zPCI event code, so in the worst case the event handling gets harmed, which may break zPCI entirely, affecting all PCI devices incl. virtio-pci (on s390x).
* A bug in PCI 'availability' handling can also just lead to wrong states of PCI devices, which makes them unavailable and hence unusable.
* Notice that zPCI is the s390x-specific PCI implementation; modifications here do not affect any other architecture.
* And zPCI devices are less widespread than ccw devices on s390x.
* On top of that, a test kernel was built and made available for further testing; testing can easily be done with virtio-pci on KVM.

[Other]

* The fix/patch was accepted upstream with kernel v5.9-rc2.
* But it already landed in groovy's proposed kernel 5.8 (Ubuntu-5.8.0-18.19), due to 'Groovy update: v5.8.4 upstream stable release' that is handled in LP 1893048.
* Hence this fix/patch needs to be applied to focal only.
__________

When a PCI device (including virtio-pci, for which this is easiest to test) is hot-plugged while Linux is still booting, it can be detected both as an entry in CLP List PCI Functions (basically equivalent to boot time probing on other architectures) and with the hot-plug event. In this case the hot-plug event will be stale, but Linux still tried to add and enable the device, leading to
a) a duplicate entry in the zPCI internal device list
b) an attempt to enable the device with a stale function handle
Part b) would lead to the device being placed in the error state, making it unusable.

This can most easily be reproduced using KVM and doing

# sudo virsh start myguest && sudo virsh attach-device myguest hotplug_pci_block.xml

Where hotplug_pci_block.xml looks like the following:

<disk device="disk" type="file">
        <driver name="qemu" type="raw" />
        <address type="pci">
                <zpci fid="4660" uid="4660" />
        </address>
        <source file="testdisk.img" />
        <target bus="virtio" dev="vdt" />
</disk>

The problem is fixed with the 3-line upstream commit

b76fee1bc56c31a9d2a49592810eba30cc06d61a s390/pci: ignore stale configuration request event

I also confirmed that as of the focal tag Ubuntu-5.4.0-46.50 this cherry-picks cleanly.
2020-09-16 04:01:46 Khaled El Mously linux (Ubuntu Focal): status In Progress Fix Committed
2020-09-16 05:18:06 Frank Heimes ubuntu-z-systems: status In Progress Fix Committed
2020-09-21 18:12:42 Ubuntu Kernel Bot tags architecture-s39064 bugnameltc-187974 severity-high targetmilestone-inin2004 architecture-s39064 bugnameltc-187974 severity-high targetmilestone-inin2004 verification-needed-focal
2020-09-22 08:30:53 Frank Heimes tags architecture-s39064 bugnameltc-187974 severity-high targetmilestone-inin2004 verification-needed-focal architecture-s39064 bugnameltc-187974 severity-high targetmilestone-inin2004 verification-done-focal
2020-10-13 22:38:19 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2020-10-13 22:38:19 Launchpad Janitor cve linked 2020-16119
2020-10-13 22:38:19 Launchpad Janitor cve linked 2020-16120
2020-10-14 04:29:51 Frank Heimes ubuntu-z-systems: status Fix Committed Fix Released
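
The reproduction recipe recorded in the 2020-09-03 description change (start the guest, then immediately attach a virtio-pci disk) could be scripted. The following is a minimal Python sketch and not part of the bug report: it only rebuilds the hotplug_pci_block.xml definition and the virsh command line quoted in the description, and executes nothing. The function names are hypothetical helpers introduced for illustration; the guest name myguest, the file names, and the fid/uid values are taken from the description.

```python
import shlex
import xml.etree.ElementTree as ET


def build_disk_xml(image="testdisk.img", fid="4660", uid="4660", dev="vdt"):
    """Return the <disk> definition from the bug description as a string.

    Hypothetical helper; the attribute values mirror the XML quoted in the bug.
    """
    disk = ET.Element("disk", {"device": "disk", "type": "file"})
    ET.SubElement(disk, "driver", {"name": "qemu", "type": "raw"})
    address = ET.SubElement(disk, "address", {"type": "pci"})
    ET.SubElement(address, "zpci", {"fid": fid, "uid": uid})
    ET.SubElement(disk, "source", {"file": image})
    ET.SubElement(disk, "target", {"bus": "virtio", "dev": dev})
    return ET.tostring(disk, encoding="unicode")


def build_commands(guest="myguest", xml_path="hotplug_pci_block.xml"):
    """Return the two virsh invocations from the description as argument lists."""
    start = ["sudo", "virsh", "start", guest]
    attach = ["sudo", "virsh", "attach-device", guest, xml_path]
    return start, attach


def reproduce_command_line(guest="myguest", xml_path="hotplug_pci_block.xml"):
    """Join both commands with '&&' so the attach runs right after the start
    returns, i.e. while the guest kernel is still booting."""
    start, attach = build_commands(guest, xml_path)
    return shlex.join(start) + " && " + shlex.join(attach)


if __name__ == "__main__":
    # Print the XML and the command line for review; nothing is executed.
    print(build_disk_xml())
    print(reproduce_command_line())
```

Building the commands as argument lists keeps guest names with unusual characters safely quoted; writing the XML to disk and running the commands is deliberately left to the tester.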