[Ubuntu 20.10] zPCI DMA tables and bitmap leak on hard unplug (PCI Event 0x0304)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Skipper Bug Screeners |
linux (Ubuntu) | Fix Released | Undecided | Skipper Bug Screeners |
Focal | Fix Released | Medium | Unassigned |
Groovy | Fix Released | Undecided | Skipper Bug Screeners |
Bug Description
SRU Justification:
==================
[Impact]
* Since zpci_dma_
* This is because commit "s390/pci: adapt events for zbus" removed the zpci_disable_
* It is skipped only on hot unplug with event type PEC 0x0304, the event where Linux is informed that the device is gone instead of being asked to deconfigure it.
* The leak should therefore occur every time that event type arrives for an enabled device.
[Fix]
* afdf9550e54627f
[Test Case]
* Have an IBM Z LPAR with PCIe devices (such as RoCE adapters) assigned and Ubuntu Server 20.04 installed.
* Disable and re-enable one (or more) of the assigned PCIe cards (using hotplug). On an LPAR this can be triggered using the 'Reassign I/O Path' function at the HMC/SE.
* Monitor DMA tables and bitmaps for any kind of leaking.
* Since these tables are vmalloc-ed memory, it is sufficient to monitor /proc/meminfo and check whether reassigning a device back and forth makes the memory usage grow continuously.
* The test and verification need to be conducted by IBM.
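The monitoring step can be sketched in Python as follows. This is only an illustration, not part of the official test procedure: it assumes the leaked tables show up in the `VmallocUsed` field of /proc/meminfo, and the helper names (`vmalloc_used_kb`, `grows_continuously`, `read_vmalloc_used_kb`) are made up for this example.

```python
import re


def vmalloc_used_kb(meminfo_text):
    """Return the VmallocUsed value in kB from /proc/meminfo-style text, or None."""
    match = re.search(r"^VmallocUsed:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def grows_continuously(samples):
    """True if every sample is strictly larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def read_vmalloc_used_kb(path="/proc/meminfo"):
    """Read the current VmallocUsed value from the running system."""
    with open(path) as f:
        return vmalloc_used_kb(f.read())
```

One sample would be taken with `read_vmalloc_used_kb()` after each reassign cycle. A series for which `grows_continuously` returns True points to a leak, while a series that levels off does not. Other vmalloc users on the system add noise, so several cycles are needed before drawing a conclusion.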
[Regression Potential]
* The regression risk can be considered moderate, because:
* only a call of zpci_disable_
* Since __zpci_
* which could cause issues regarding the availability of zPCI devices
* and in worst case make zPCI devices unusable.
* But only one switch case of the function is modified and all cases break, so only PEC 0x0304 should be affected.
* And the code changes themselves are minimal, and the zPCI code is limited to the s390x architecture.
* On top of that, test kernels were built and shared for further testing.
[Other]
* Since this commit needs to land in groovy too, but groovy is still in development (hence the SRU process does not apply for groovy yet), I've sent a separate Patch request for groovy.
__________
Commit "s390/pci: adapt events for zbus" removed the zpci_disable_
This, however, skips the Linux side of the disable; in particular, it leads
to leaking the DMA tables and bitmaps because zpci_dma_
never called on the device.
This has been fixed in the following commit (currently in linux-next):
afdf9550e54627f
The commit re-introduces the zpci_disable_
to call out the fact that it may be called with the device disabled
already.
As the commit was introduced with the multi-function support,
this should of course go into both 20.10 and 20.04.
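The leak pattern described above can be illustrated with a toy model in Python. This is not kernel code and does not reproduce the actual fix: `ToyAllocator`, `ToyDevice`, and `replug_cycles` are hypothetical names. The model only shows why skipping the disable step on unplug leaves one allocation behind per cycle, and why a disable step that tolerates an already disabled device is safe to call unconditionally.

```python
class ToyAllocator:
    """Counts allocations that have not been freed yet (hypothetical)."""

    def __init__(self):
        self.outstanding = 0


class ToyDevice:
    """Toy stand-in for a device that owns DMA tables while enabled."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.has_tables = False

    def enable(self):
        # Stands in for allocating the DMA tables and bitmaps.
        self.allocator.outstanding += 1
        self.has_tables = True

    def disable(self):
        # Tolerates a device that is already disabled: frees at most once.
        if self.has_tables:
            self.allocator.outstanding -= 1
            self.has_tables = False


def replug_cycles(allocator, cycles, disable_on_unplug):
    """Plug and hard-unplug a device repeatedly; return the leaked count."""
    for _ in range(cycles):
        dev = ToyDevice(allocator)
        dev.enable()
        if disable_on_unplug:
            dev.disable()
        # The device object is dropped here, as on a hard unplug.
    return allocator.outstanding
```

With `disable_on_unplug=False` the outstanding count grows by one per cycle, which corresponds to the continuous growth in memory usage the test case looks for. With `disable_on_unplug=True` it stays at zero.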
tags: | added: architecture-s39064 bugnameltc-188134 severity-high targetmilestone-inin2010 |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → linux (Ubuntu) |
Changed in ubuntu-z-systems: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
importance: | Undecided → High |
description: | updated |
Changed in linux (Ubuntu Focal): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Focal): | |
status: | In Progress → Fix Committed |
Changed in ubuntu-z-systems: | |
status: | In Progress → Fix Committed |
Changed in ubuntu-z-systems: | |
status: | Fix Committed → Fix Released |
------- Comment From <email address hidden> 2020-09-18 06:41 EDT-------
Note: I just checked, and this cherry-picks cleanly onto focal/master-next.