[linux-azure] Two Fixes For kdump Over Network
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-azure (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Committed | Undecided | Unassigned |
Focal | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
Microsoft would like to request two kdump-related fixes in all releases supported on Azure. The two commits are:
c81992e7f4aa1 ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
83cc3508ffaa6 ("PCI: hv: Fix the PCI HyperV probe failure path to release resource properly")
Both commits are in the virtual PCI driver for Hyper-V. The customer-visible symptom is that the network is not functional in the kdump kernel, so the dump file must be stored on the local disk and cannot be written over the network.
The problem only occurs when Accelerated Networking is enabled. It’s a relatively obscure scenario, which is why the problem has not surfaced before now. But we have an important customer who wants the “dump-file-
[Test Case]
- Apply requested patches and boot into updated kernel
- Verify Accelerated Networking is enabled
- Set up kdump
- Configure kdump to use SSH
- Test the crash dump mechanism and verify the kernel crash dump appears on the selected remote server
Further details for setting up kdump through testing can be found here:
https:/
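As a rough aid for the "Verify Accelerated Networking is enabled" and "Set up kdump" steps above, the following Python sketch inspects standard Linux interfaces: `/sys/kernel/kexec_crash_loaded`, `/proc/cmdline`, and the driver symlinks under `/sys/class/net`. The helper names are hypothetical and not part of any kdump tooling. The interpretation that a NIC bound to a driver other than `hv_netvsc` indicates an Accelerated Networking VF is an assumption, not something stated in this report.

```python
from pathlib import Path


def crash_kernel_loaded(flag_path="/sys/kernel/kexec_crash_loaded"):
    """Return True if a crash (kdump) kernel is currently loaded."""
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable: treat as "not loaded".
        return False


def crashkernel_reserved(cmdline_path="/proc/cmdline"):
    """Return True if the kernel command line has a crashkernel= parameter."""
    try:
        params = Path(cmdline_path).read_text().split()
    except OSError:
        return False
    return any(p.startswith("crashkernel=") for p in params)


def nic_drivers(net_root="/sys/class/net"):
    """Map each interface that has a backing device to its driver name."""
    drivers = {}
    root = Path(net_root)
    if not root.is_dir():
        return drivers
    for iface in sorted(root.iterdir()):
        link = iface / "device" / "driver"
        if link.exists():
            # The symlink target's last path component is the driver name.
            drivers[iface.name] = link.resolve().name
    return drivers


if __name__ == "__main__":
    print("crashkernel= on cmdline:", crashkernel_reserved())
    print("crash kernel loaded:   ", crash_kernel_loaded())
    for name, drv in nic_drivers().items():
        # Assumption: a driver other than hv_netvsc suggests a VF NIC.
        print(f"{name}: {drv}")
```

Run it on the VM before triggering the test crash. This only checks preconditions; it does not confirm that the dump actually reaches the remote server.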
[Regression Potential]
Patches are only targeted to Azure kernels.
Patches are designed to release allocated resources remaining after
error cases in hv_pci_probe() or PCI devices not being shut down
properly. If those resources are still not correctly released, then
entering D0 state in the kdump kernel could continue to fail.
Regressions could appear in the freeing of resources, or the kdump kernel could still fail to enter D0 state even after all resources have been
released.
Build and boot tested. Verified that kdump works as intended over SSH after the patches are applied.
Both the 5.4 and 4.15 test kernels were sent to Microsoft, who signed off on both and verified that they resolve the problem.
Changed in linux-azure (Ubuntu): | |
status: | New → Confirmed |
description: | updated |
description: | updated |
Changed in linux-azure (Ubuntu Bionic): | |
status: | New → Fix Committed |
Changed in linux-azure (Ubuntu Focal): | |
status: | New → Fix Committed |
The following link holds test kernels for 5.4, 5.3, and 4.15:
https://kernel.ubuntu.com/~kms/azure/lp1883261/
The patches applied cleanly to 5.4, though 5.3 and 4.15 required some changes. Please test to verify that the added patches resolve the issue.