This issue no longer exists in Ubuntu. Kernel bisection shows that it impacted upstream kernels between v4.20 and v5.3.
Bisection was a little complicated because there are two overlapping issues. There's the reboot hang, but there's also an issue that causes the host Mellanox driver to crash when you pass through a VF. So I bisected the Mellanox driver crash first, then manually applied that fix while bisecting the reboot hang.
Here's a chronological set of the relevant commits (annotation will
require a fixed-width font):
v4.19
975bb8b4dc93 PCI/IOV: Use VF0 cached config space size for other VFs ------------------+
v4.20-rc1                                                                              |
b61d271e59d7 iommu/dma: Move domain lookup into __iommu_dma_{map,unmap} --------+      +---- Reboot hangs
76bf6a8634a1 Revert "PCI/IOV: Use VF0 cached config space size for other VFs" --|------+
v5.3-rc1                                                                        +--- Passthrough crashes
8af23fad6261 iommu/dma: Handle MSI mappings separately -------------------------+
v5.3-rc5
As you can see, the reboot hang problem started in upstream v4.20-rc1,
and was fixed in v5.3-rc1. So 4.15 was not impacted, and all 5.4
kernels already have the fix.
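
As a rough illustration of that conclusion, here is a minimal Python sketch that checks whether an upstream kernel version falls inside the reboot-hang window. The function names are hypothetical, and it compares only major.minor, so it treats the whole v4.20 cycle as affected and the whole v5.3 cycle as fixed. It says nothing about distro kernels that carry backports or reverts.

```python
import re

# Assumed boundaries, taken from the bisection result above:
FIRST_AFFECTED = (4, 20)  # regression first appears in v4.20-rc1
FIRST_FIXED = (5, 3)      # revert first appears in v5.3-rc1


def parse_kernel_version(version):
    """Return (major, minor) from strings like 'v5.3-rc1' or '4.15.0-112-generic'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized kernel version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def reboot_hang_affected(version):
    """True if an upstream kernel of this major.minor is in the affected window."""
    return FIRST_AFFECTED <= parse_kernel_version(version) < FIRST_FIXED


if __name__ == "__main__":
    for v in ("4.15", "v4.20-rc1", "5.0", "5.2", "v5.3-rc1", "5.4"):
        print(v, reboot_hang_affected(v))
```

Tuple comparison keeps the range check short, at the cost of ignoring -rc granularity inside the two boundary cycles.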