[UBUNTU 21.04] qemu s390x/pci: Honor vfio DMA limiting
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Skipper Bug Screeners |
qemu (Ubuntu) | Fix Released | High | Canonical Server |
Focal | Fix Released | High | Unassigned |
Groovy | Fix Released | High | Unassigned |
Bug Description
[Impact]
* When a vfio-pci device on s390x is under I/O load, the device
may end up in an error state.
* The kernel limits the number of outstanding DMA requests, but lazy
unmapping on s390x can let a large number of outstanding DMA requests
build up before they are purged - potentially the entire guest DMA space.
* This results in unexpected errors in qemu such as 'VFIO_MAP_DMA
failed: No space left on device'.
* The solution requires a change to both kernel and qemu.
* The qemu side of things is addressed by this SRU.
[Fix]
* A patch series that uses the recent kernel additions: qemu checks the DMA limit and has the guest refresh its mappings before the limit is exceeded.
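The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the QEMU implementation: the names `DmaBudget`, `run_io` and `honor_limit` are hypothetical, and the limit of 4 used in the usage note is an arbitrary illustration value.

```python
class DmaBudget:
    """Toy model of a limit on outstanding DMA mappings (hypothetical names)."""

    def __init__(self, limit):
        self.limit = limit        # assumed limit reported by the host kernel
        self.avail = limit        # mappings that may still be created
        self.lazy_unmapped = 0    # entries the guest dropped but has not purged

    def map_one(self):
        """Create one mapping; return False when the budget is exhausted."""
        if self.avail == 0:
            return False          # stands in for 'No space left on device'
        self.avail -= 1
        return True

    def guest_unmap_lazily(self):
        """Guest stops using a mapping but does not purge it yet."""
        self.lazy_unmapped += 1

    def refresh(self):
        """Guest purges stale entries, returning their slots to the budget."""
        self.avail += self.lazy_unmapped
        self.lazy_unmapped = 0


def run_io(budget, requests, honor_limit):
    """Map and lazily unmap `requests` buffers; return the number of failures."""
    failures = 0
    for _ in range(requests):
        if honor_limit and budget.avail == 0:
            budget.refresh()      # refresh mappings before exceeding the limit
        if budget.map_one():
            budget.guest_unmap_lazily()
        else:
            failures += 1
    return failures
```

With a limit of 4 and 10 requests, `run_io(DmaBudget(4), 10, honor_limit=False)` reports 6 failed mappings in this model, while `honor_limit=True` reports none because stale entries are purged before the budget runs out.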
[Test Case]
* IBM Z or LinuxONE hardware with Ubuntu Server 20.10 installed.
* PCIe adapters in place that provide vfio, like RoCE Express 2.
* A KVM host needs to be set up, along with a KVM guest (again on 20.10) that uses vfio.
* Generate I/O that flows through the VF and watch for errors like 'VFIO_MAP_DMA failed: No space left on device' in the log.
* We don't have all of that in place, so IBM will run these tests (as they did for the related bug).
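Checking the log for the error named in the test case can be scripted. The following is a minimal sketch; the function name `count_dma_map_failures` is hypothetical, and the location of the qemu log depends on the setup, so the caller supplies the lines.

```python
import re

# Error text quoted in the test case above.
PATTERN = re.compile(r"VFIO_MAP_DMA failed: No space left on device")


def count_dma_map_failures(lines):
    """Return how many of the given log lines contain the VFIO_MAP_DMA error."""
    return sum(1 for line in lines if PATTERN.search(line))
```

It can be applied to any iterable of strings, for example an open log file: `with open(path) as f: print(count_dma_map_failures(f))`, where `path` is wherever the guest's qemu log lives on the host.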
[Regression Potential]
* This is split in two.
- In general, the vfio reworks - albeit small - could affect all
platforms, so any issues would be expected in vfio use cases like
device pass-through.
- On s390x more was changed, but the regressions we need to look
out for would still be in the same "vfio used for pass-through"
use-case area.
[Other]
* The kernel portion got accepted in bug 1907421
---
Description: s390x/pci: Honor vfio DMA limiting
Symptom: vfio-pci device on s390 enters error state
Problem: Kernel commit 492855939bdb added a limit to the number of
No space left on device'
Solution: The solution requires a change to both kernel and qemu - For
DMA requests via the VFIO_IOMMU_GET_INFO ioctl and then ensure
that the guest is told to refresh mappings before exceeding
the vfio limit.
Reproduction: Put a vfio-pci device on s390 under I/O load
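The Solution field mentions the VFIO_IOMMU_GET_INFO ioctl. As an illustration of how a DMA-available value could be located in the data that ioctl returns, here is a sketch that walks a capability chain in a byte buffer. The field layout and the constants are my reading of the Linux UAPI header `linux/vfio.h` and should be treated as assumptions; `find_dma_avail` and `build_example` are hypothetical helpers, and the buffer is synthetic rather than obtained from a real ioctl call.

```python
import struct

# Assumed UAPI values (from linux/vfio.h as I understand it).
VFIO_IOMMU_INFO_CAPS = 1 << 1          # flag: a capability chain is present
VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL = 3    # capability id for "DMA available"

# "=" means native byte order with standard sizes and no padding.
INFO_HDR = struct.Struct("=IIQI")      # argsz, flags, iova_pgsizes, cap_offset
CAP_HDR = struct.Struct("=HHI")        # id, version, next
AVAIL = struct.Struct("=I")            # payload of the DMA-available capability


def find_dma_avail(buf):
    """Return the 'avail' value from the capability chain, or None if absent."""
    if len(buf) < INFO_HDR.size:
        return None
    _argsz, flags, _pgsizes, offset = INFO_HDR.unpack_from(buf, 0)
    if not flags & VFIO_IOMMU_INFO_CAPS:
        return None
    while offset:
        if offset + CAP_HDR.size > len(buf):
            return None                # malformed chain
        cap_id, _version, next_offset = CAP_HDR.unpack_from(buf, offset)
        if cap_id == VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL:
            if offset + CAP_HDR.size + AVAIL.size > len(buf):
                return None
            return AVAIL.unpack_from(buf, offset + CAP_HDR.size)[0]
        if next_offset <= offset:      # refuse to loop on a broken chain
            return None
        offset = next_offset
    return None


def build_example(avail):
    """Build a synthetic info buffer: one unrelated capability, then DMA avail."""
    buf = bytearray(44)
    INFO_HDR.pack_into(buf, 0, len(buf), VFIO_IOMMU_INFO_CAPS, 4096, 24)
    CAP_HDR.pack_into(buf, 24, 1, 1, 32)   # unrelated capability, next at 32
    CAP_HDR.pack_into(buf, 32, VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL, 1, 0)
    AVAIL.pack_into(buf, 40, avail)
    return bytes(buf)
```

Calling `find_dma_avail(build_example(n))` returns `n`; a buffer without the capability flag yields `None`, which a caller could treat as "no limit information available".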
This QEMU issue is related to the kernel issue in launchpad bug #1907421. Backport patches are attached for a subset of the patches required for this fix. Backports were needed for 3 main reasons:
1) For the header sync, I suspect you only want the minimal set of changes needed
2) There is a missing upstream commit (408b55db8be3) that re-organizes the location of 2 s390-pci header files, causing conflicts
3) Adjustments had to be made due to the QEMU build system change (meson)
I initially performed the backport against 4.2/focal-devel; the same patches and process will also apply cleanly to 5.0/groovy-devel. There should be nothing required for hirsute as everything is already in upstream QEMU 5.2.
In summary:
53ba2eee52bf: Backport as patch 0001. Rather than doing a full header sync, update ONLY the header change needed for the DMA fix. See attached patch 0001.
3ab7a0b40d4b: cherry-pick works
7486a62845b1: cherry-pick works
cd7498d07fbb: Backport as patch 0004. This upstream commit added a new part using meson, which does not exist in 5.0.
37fa32de7073: Backport as patch 0005. This was mainly due to conflicts with a missing patch that relocated some include files.
77280d33bc9c: Backport as patch 0006. This was due to different build system + CONFIG_DEVICES doesn't exist.
As such, I have attached patches 0001, 0004, 0005 and 0006. Please cherry-pick the upstream commits for patches 0002 and 0003.
To verify, I applied the patches provided and cherry-picks against both focal-devel and groovy-devel. In each case, for the host system I used the groovy kernel Frank provided in launchpad bug #1907421 which includes the kernel portion of this fix -- using these together, I verified that the DMA limit is being read in and honored appropriately by QEMU, and I can no longer trigger an overrun of the DMA space when a guest pushes heavy data transfer via PCI (no errors in log, no transfer stalls).
Also, as related to the last patch of the set, I further verified that no build errors are encountered when configured with --without-
Related branches
- Robie Basak: Approve (sru)
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
- Diff: 745 lines (+693/-0), 8 files modified
debian/changelog (+6/-0)
debian/patches/series (+6/-0)
debian/patches/ubuntu/lp-1913395-1-linux-headers-update-against-5.10-rc1.patch (+45/-0)
debian/patches/ubuntu/lp-1913395-2-vfio-Create-shared-routine-for-scanning-info-capabil.patch (+64/-0)
debian/patches/ubuntu/lp-1913395-3-vfio-Find-DMA-available-capability.patch (+75/-0)
debian/patches/ubuntu/lp-1913395-4-0004-s390x-pci-Add-routine-to-get-the-vfio-dma-available.patch (+120/-0)
debian/patches/ubuntu/lp-1913395-5-s390x-pci-Honor-DMA-limits-set-by-vfio.patch (+324/-0)
debian/patches/ubuntu/lp-1913395-6-s390x-fix-build-for-without-default-devices.patch (+53/-0)
Changed in ubuntu-z-systems: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
Changed in linux (Ubuntu): | |
assignee: | Skipper Bug Screeners (skipper-screen-team) → nobody |
Changed in ubuntu-z-systems: | |
importance: | Undecided → High |
Changed in qemu (Ubuntu): | |
status: | New → Fix Released |
Changed in qemu (Ubuntu Focal): | |
status: | New → Triaged |
Changed in qemu (Ubuntu Groovy): | |
status: | New → Triaged |
Changed in ubuntu-z-systems: | |
status: | New → Triaged |
tags: | added: server-next |
Changed in ubuntu-z-systems: | |
status: | Triaged → In Progress |
Changed in ubuntu-z-systems: | |
status: | In Progress → Fix Committed |
tags: | added: targetmilestone-inin2104 removed: targetmilestone-inin--- |
Changed in qemu (Ubuntu): | |
importance: | Undecided → High |
Changed in qemu (Ubuntu Focal): | |
importance: | Undecided → High |
Changed in qemu (Ubuntu Groovy): | |
importance: | Undecided → High |
Changed in ubuntu-z-systems: | |
status: | Fix Committed → Fix Released |