[linux-azure] Fix hibernation in case interrupts are not re-created
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux-azure (Ubuntu) | New | Undecided | Unassigned | |
Bug Description
Microsoft would like to request the following commit in all supported releases that run on Azure:
915cff7f38c5 (“PCI: hv: Fix hibernation in case interrupts are not re-created”)
Commit details:
pci_restore_
via MMIO. On a physical machine, this works perfectly; for a Linux VM
running on a hypervisor, which typically enables IOMMU interrupt remapping,
the hypervisor usually should trap and emulate the MMIO accesses in order
to re-create the necessary interrupt remapping table entries in the IOMMU,
otherwise the interrupts can not work in the VM after hibernation.
Hyper-V is different from other hypervisors in that it does not trap and
emulate the MMIO accesses, and instead it uses a para-virtualized method,
which requires the VM to call hv_compose_
of the info that would be passed to the hypervisor in the case of the
trap-
drivers, which destroy and re-create the interrupts across hibernation, so
hv_
drivers (e.g. the in-tree GPU driver nouveau and the out-of-tree Nvidia
proprietary GPU driver) do not destroy and re-create MSI/MSI-X interrupts
across hibernation, so hv_pci_resume() has to call hv_compose_
otherwise the PCI device drivers can no longer receive interrupts after
the VM resumes from hibernation.
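
The difference between the two kinds of drivers can be illustrated with a small toy model. This is only a sketch under stated assumptions, not kernel code: `toy_irq`, `toy_notify_host()`, `toy_hibernate()`, `toy_driver_resume()`, `toy_bus_resume()` and `toy_count_working()` are hypothetical names invented for illustration, and the ordering of the resume steps is simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one MSI/MSI-X interrupt as seen by the guest. */
struct toy_irq {
    bool driver_recreates; /* driver destroys and re-creates it across hibernation */
    bool host_programmed;  /* host currently knows how to deliver this interrupt */
};

/* Hypothetical stand-in for the para-virtualized notification to the host. */
static void toy_notify_host(struct toy_irq *irq)
{
    assert(irq != NULL);
    irq->host_programmed = true;
}

/* Assumption of this model: hibernation discards all host-side interrupt state. */
static void toy_hibernate(struct toy_irq *irqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        irqs[i].host_programmed = false;
}

/* Drivers that re-create their interrupts trigger the notification themselves. */
static void toy_driver_resume(struct toy_irq *irqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (irqs[i].driver_recreates)
            toy_notify_host(&irqs[i]);
}

/* Bus-level resume: notify the host for every interrupt it still lacks. */
static void toy_bus_resume(struct toy_irq *irqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!irqs[i].host_programmed)
            toy_notify_host(&irqs[i]);
}

/* Count the interrupts that can currently be delivered. */
static size_t toy_count_working(const struct toy_irq *irqs, size_t n)
{
    size_t working = 0;
    for (size_t i = 0; i < n; i++)
        if (irqs[i].host_programmed)
            working++;
    return working;
}
```

In this model, an interrupt whose driver does not re-create it stays undeliverable after `toy_driver_resume()` alone, and only becomes deliverable once a bus-level step such as `toy_bus_resume()` performs the notification on the driver's behalf.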
Hyper-V is also different in that chip->irq_unmask() may fail in a
Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask()
can not fail because unmasking an MSI/MSI-X register just means an MMIO
write): during hibernation, when a CPU is offlined, the kernel tries
to move the interrupt to the remaining CPUs that haven't been offlined
yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails
because the vmbus channel has been closed: here the early "return" in
hv_irq_unmask() means the pci_msi_
desc->masked remains "true", so later after hibernation, the MSI interrupt
always remains masked, which is incorrect. Refer to cpu_disable_
-> fixup_irqs() -> irq_migrate_
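
The effect of the early "return" on the mask flag can be shown with a minimal stand-alone model. This is a sketch under stated assumptions rather than the kernel implementation: `toy_msi_desc`, `toy_hypercall()`, `toy_unmask_early_return()` and `toy_unmask_always_clear()` are hypothetical names, and the second variant is just one conceivable way to avoid the stale flag, not necessarily what the requested commit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MSI descriptor; only the mask flag is modeled. */
struct toy_msi_desc {
    bool masked;
};

/* Hypothetical hypercall stub: fails (non-zero) once the channel is closed. */
static int toy_hypercall(bool channel_open)
{
    return channel_open ? 0 : -1;
}

/* Problematic pattern: a failed hypercall returns before the flag is cleared. */
static void toy_unmask_early_return(struct toy_msi_desc *desc, bool channel_open)
{
    assert(desc != NULL);
    if (toy_hypercall(channel_open) != 0)
        return; /* desc->masked keeps its old value */
    desc->masked = false;
}

/* Alternative for comparison: the flag is cleared even if the hypercall fails. */
static void toy_unmask_always_clear(struct toy_msi_desc *desc, bool channel_open)
{
    assert(desc != NULL);
    (void)toy_hypercall(channel_open); /* result deliberately ignored in this model */
    desc->masked = false;
}
```

With the channel closed, `toy_unmask_early_return()` leaves `masked` set to true, which mirrors the stale state described above; `toy_unmask_always_clear()` leaves it false.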