Activity log for bug #1606940

Date Who What changed Old value New value Message
2016-07-27 14:08:00 Ryan Harper bug added bug
2016-07-27 14:08:23 Ryan Harper nominated for series Ubuntu Trusty
2016-07-27 14:09:26 Ryan Harper qemu (Ubuntu): status New Fix Released
2016-07-27 14:12:08 Ryan Harper bug added subscriber Mark W Wenning
2016-07-27 14:12:16 Ryan Harper bug added subscriber Jon Grimm
2016-07-27 14:14:55 Ryan Harper attachment added qemu-bug-1606940.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1606940/+attachment/4708134/+files/qemu-bug-1606940.debdiff
2016-07-27 14:15:23 Ryan Harper description

Old value:

[Impact]

 * Users of SRIOV devices in qemu on Trusty may encounter unstable behavior on pass-through PCI devices due to a bug in qemu's MMIO mapping to overlapping ram slots. When memory is accessed in subpage granularity where slots have overlapping regions, multiple invocations of the handler occur, which results in multiple PCI writes. This affects the qemu releases prior to qemu 2.5; it has been fixed in newer releases.
 * Backporting fixes from upstream release is required to allow certain PCI devices under SRIOV to function properly.
 * All patches applied are already accepted upstream. Xenial, Yakkety are OK, Wily -> Trusty are affected.

[Test Case]

 * On a Trusty 14.04 system with affected SRIOV device.
   - boot system with sriov enabled
   - launch vm with sriov device passed through using guest XML attached (bug-1563375-trusty-guest.xml)
   - unpack pcimem tarball inside vm (pcimem.tar attached)
   - Read (note the pci path should point to the SRIOV device)
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d
   - Write
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048
   - Read again
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

   The value of 0x10080 should be the same for the first read and the second read, after the write. If the bug is hit, the second read will report a value of double instead of the same.

[Regression Potential]

 * SR-IOV device drivers may have unknowingly relied on KVM multi-write behavior prior to this patch; that's highly unlikely since it would fail on physical hardware (which does not produce this effect). But there is a chance that devices only passed into the guest via SRIOV might break.

[Original Description]

Customer engineers are testing the SR-IOV feature with a new network card on x86 servers and ran into the issue described below. They are *not* seeing this issue on Intel 82599 NIC.

We are testing a new device in EP mode with SRIOV. With a CentOS7 VM running on the Ubuntu 14.04.2 host (using VFIO) we see that a single PCI read or write transaction targeting the device’s BAR0 issued from the VM appears twice on the PCIe bus. The same accesses work fine when the VF is accessed directly from the Ubuntu 14.04.2 host. These BAR0 PCI accesses do not require a driver on the VM side. We can reproduce the problem using a simple user-space application to access the VF’s BAR0 registers.

We do not see this problem when the VM runs within a CentOS 7 host or under a Ubuntu 12.04 host. This appears specific to Ubuntu 14.04 release. Appreciate your help in any clues or pointers to this behavior. This issue is also not happening with 16.04 beta.

Steps to reproduce the bug with pcimem:

Read:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d
Write:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048
Read again:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

The value of 0x10080 should be the same for the first read and the second read, after the write. If the bug is hit, the second read will report a value of double instead of the same.

The register should have read back the same value that was written. The register acts like an adder in that every write adds to the previously written value minus anything the device has consumed. We see that the second read returns double the value written in the single write. We captured a PCIe trace and found that each of the PCI operations accessing this register is seen twice on the PCI bus. The 2 writes cause the register value to double which has implications for normal operation. The PCIe trace is attached and has markers to identify the relevant transactions.

New value:

[Impact]

 * Users of SRIOV devices in qemu on Trusty may encounter unstable behavior on pass-through PCI devices due to a bug in qemu's MMIO mapping to overlapping ram slots. When memory is accessed in subpage granularity where slots have overlapping regions, multiple invocations of the handler occur, which results in multiple PCI writes. This affects the qemu releases prior to qemu 2.5; it has been fixed in newer releases.
 * Backporting fixes from upstream release is required to allow certain PCI devices under SRIOV to function properly.
 * All patches applied are already accepted upstream. Xenial, Yakkety are OK, Wily -> Trusty are affected.

[Test Case]

 * On a Trusty 14.04 system with affected SRIOV device.
   - boot system with sriov enabled
   - launch vm with sriov device passed through using guest XML attached (bug-1606940-trusty-guest.xml)
   - unpack pcimem tarball inside vm (pcimem.tar attached)
   - Read (note the pci path should point to the SRIOV device)
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d
   - Write
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048
   - Read again
     ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

   The value of 0x10080 should be the same for the first read and the second read, after the write. If the bug is hit, the second read will report a value of double instead of the same.

[Regression Potential]

 * SR-IOV device drivers may have unknowingly relied on KVM multi-write behavior prior to this patch; that's highly unlikely since it would fail on physical hardware (which does not produce this effect). But there is a chance that devices only passed into the guest via SRIOV might break.

[Original Description]

Customer engineers are testing the SR-IOV feature with a new network card on x86 servers and ran into the issue described below. They are *not* seeing this issue on Intel 82599 NIC.

We are testing a new device in EP mode with SRIOV. With a CentOS7 VM running on the Ubuntu 14.04.2 host (using VFIO) we see that a single PCI read or write transaction targeting the device’s BAR0 issued from the VM appears twice on the PCIe bus. The same accesses work fine when the VF is accessed directly from the Ubuntu 14.04.2 host. These BAR0 PCI accesses do not require a driver on the VM side. We can reproduce the problem using a simple user-space application to access the VF’s BAR0 registers.

We do not see this problem when the VM runs within a CentOS 7 host or under a Ubuntu 12.04 host. This appears specific to Ubuntu 14.04 release. Appreciate your help in any clues or pointers to this behavior. This issue is also not happening with 16.04 beta.

Steps to reproduce the bug with pcimem:

Read:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d
Write:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048
Read again:
./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

The value of 0x10080 should be the same for the first read and the second read, after the write. If the bug is hit, the second read will report a value of double instead of the same.

The register should have read back the same value that was written. The register acts like an adder in that every write adds to the previously written value minus anything the device has consumed. We see that the second read returns double the value written in the single write. We captured a PCIe trace and found that each of the PCI operations accessing this register is seen twice on the PCI bus. The 2 writes cause the register value to double which has implications for normal operation. The PCIe trace is attached and has markers to identify the relevant transactions.

2016-07-27 14:17:59 Ryan Harper attachment added bug-1606940-trusty-guest.xml https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1606940/+attachment/4708135/+files/bug-1606940-trusty-guest.xml
2016-07-27 14:18:31 Ryan Harper attachment added pcimem.tar https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1606940/+attachment/4708136/+files/pcimem.tar
2016-07-27 14:20:19 Ryan Harper bug added subscriber Ubuntu Sponsors Team
2016-09-15 14:47:51 Chris J Arges bug task added qemu (Ubuntu Trusty)
2016-09-15 18:25:02 Brian Murray removed subscriber Ubuntu Sponsors Team
2016-09-15 18:25:19 Brian Murray qemu (Ubuntu Trusty): status New Fix Committed
2016-09-15 18:25:22 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2016-09-15 18:25:26 Brian Murray bug added subscriber SRU Verification
2016-09-15 18:25:35 Brian Murray tags verification-needed
2016-09-16 06:10:16 Mathew Hodson qemu (Ubuntu): importance Undecided Medium
2016-09-16 06:10:21 Mathew Hodson qemu (Ubuntu Trusty): importance Undecided Medium
2016-10-18 13:01:47 Christian Ehrhardt attachment added collection of test logs https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1606940/+attachment/4763226/+files/bug_1606940-verification.tgz
2016-10-18 13:02:12 Christian Ehrhardt tags verification-needed verification-done
2016-10-19 13:14:09 Launchpad Janitor qemu (Ubuntu Trusty): status Fix Committed Fix Released
2016-10-19 13:14:20 Chris J Arges removed subscriber Ubuntu Stable Release Updates Team
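
Note on the test case: the read / write / read-again sequence in the 2016-07-27 14:15:23 description can be sketched in Python roughly as below. This is only an illustration, not the attached pcimem tool: the names read_write_read and classify are hypothetical, the 4-byte little-endian access width is an assumption (the log does not state what width pcimem's "d" type uses), and Python's mmap does not guarantee that each access becomes a single bus transaction. The comparison also assumes the register held no unconsumed value before the write.

```python
import mmap
import os
import struct

# Mapping offsets must be aligned to the allocation granularity (page size).
PAGE = mmap.ALLOCATIONGRANULARITY


def read_write_read(path, offset, value, fmt="<I"):
    """Read, write `value`, then read again at `offset` in `path`.

    `path` would be something like a PCI resource0 file; `fmt` is a struct
    format string (assumed 32-bit little-endian here).
    Returns (value_before_write, value_after_write).
    """
    size = struct.calcsize(fmt)
    base = offset - (offset % PAGE)   # page-aligned start of the mapping
    delta = offset - base             # position of the register in the mapping
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, delta + size, offset=base) as mem:
            (before,) = struct.unpack_from(fmt, mem, delta)
            struct.pack_into(fmt, mem, delta, value)
            (after,) = struct.unpack_from(fmt, mem, delta)
    finally:
        os.close(fd)
    return before, after


def classify(written, after):
    """Label the second read relative to the value that was written."""
    if after == written:
        return "ok"        # read back exactly what was written
    if after == 2 * written:
        return "doubled"   # symptom described in this bug
    return "other"
```

Inside the guest this would be called as read_write_read("/sys/bus/pci/devices/0000:04:00.0/resource0", 0x10080, 2048), with the PCI path adjusted to point at the SRIOV device, and the second element of the result passed to classify(2048, ...).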