aws: improve hibernation reliability in groovy
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-aws (Ubuntu) | Fix Released | Undecided | Unassigned |
Groovy | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
The 5.8 kernel (groovy) is still not 100% reliable with hibernation, especially on Xen instance types. However, re-aligning the config options as closely as possible with the 5.4 kernel (focal) allows instances to hibernate and resume with a success rate close to 100% (according to our tests on the AWS cloud).
[Test case]
- spin up an AWS instance (for example a c4.8xlarge instance type)
- run a memory stress test
- hibernate
- resume
- verify that the system has been resumed correctly and the memory stress test is still running
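The last step of the test case can be partly automated. The following is a minimal sketch, not the procedure used for the tests reported here. It assumes the PID of the stress process is known, and that an unchanged `/proc/sys/kernel/random/boot_id` after resume indicates the kernel was restored from the hibernation image rather than cold-booted. The names `Snapshot`, `take_snapshot` and `resumed_correctly` are hypothetical.

```python
import os
from dataclasses import dataclass
from pathlib import Path

# Assumption: the boot ID is generated once per boot and is restored
# together with the rest of kernel memory on resume from hibernation.
BOOT_ID_PATH = Path("/proc/sys/kernel/random/boot_id")


@dataclass(frozen=True)
class Snapshot:
    boot_id: str
    stress_alive: bool


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def take_snapshot(stress_pid: int, boot_id_path: Path = BOOT_ID_PATH) -> Snapshot:
    """Record the boot ID and whether the stress test process is running."""
    return Snapshot(boot_id_path.read_text().strip(), pid_alive(stress_pid))


def resumed_correctly(before: Snapshot, after: Snapshot) -> bool:
    """True if the stress test ran before and after, within the same boot."""
    return (
        before.stress_alive
        and after.stress_alive
        and before.boot_id == after.boot_id
    )
```

Take one snapshot before hibernating and one after resuming, then compare them with `resumed_correctly`. The boot ID comparison also guards against a reused PID after a cold boot being mistaken for the original stress test.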
[Fix]
The following fixes must be applied to improve hibernation/resume
reliability:
- disable CONFIG_SUSPEND (suspend to memory is not supported in AWS and
it can introduce a deadlock condition with the Xen hibernation layer)
- make sure CONFIG_DMA_CMA is disabled (this introduces another
deadlock condition with Xen)
- disable CONFIG_FB_HYPERV: this would enable CONFIG_DMA_CMA and we
don't want that (to prevent the Xen deadlock); moreover this driver
is not needed at all in the AWS environment
- disable CONFIG_
AWS environment and it has the potential of breaking hibernation
- disable CONFIG_
- compile xen-netfront as module
- disable CONFIG_XEN_BALLOON
- remove all sound-related modules (not really needed in the AWS cloud,
they were also disabled on 5.4)
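The settings above can be checked against a kernel config file. The following is a minimal sketch covering only the options named in full in this report. It assumes the usual `.config` syntax (`CONFIG_X=y`, `CONFIG_X=m`, `# CONFIG_X is not set`). The option name `CONFIG_XEN_NETDEV_FRONTEND` for xen-netfront is taken from the changelog entry for this bug, and the helper names `parse_config` and `check_config` are hypothetical.

```python
import re

# Expected values: "n" means the option must be absent or explicitly
# "not set"; "m" means it must be built as a module.
# Assumption: CONFIG_XEN_NETDEV_FRONTEND is the option behind xen-netfront.
EXPECTED = {
    "CONFIG_SUSPEND": "n",
    "CONFIG_DMA_CMA": "n",
    "CONFIG_FB_HYPERV": "n",
    "CONFIG_XEN_BALLOON": "n",
    "CONFIG_XEN_NETDEV_FRONTEND": "m",
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option found in a .config text to its value ("n" if unset)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            values[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            values[match.group(1)] = "n"
    return values


def check_config(text, expected=EXPECTED):
    """Return {option: (wanted, found)} for every option that differs."""
    values = parse_config(text)
    mismatches = {}
    for option, wanted in expected.items():
        found = values.get(option, "n")  # a missing option counts as disabled
        if found != wanted:
            mismatches[option] = (wanted, found)
    return mismatches
```

On an Ubuntu instance the config text would typically be read from `/boot/config-<kernel release>` and passed to `check_config`; an empty result means all listed options match. Note that the changelog disables CONFIG_XEN_BALLOON on amd64 only, so the expected values may need adjusting for other architectures.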
[Regression potential]
These are all .config changes. With these changes we are basically
re-aligning with the previous 5.4 settings, and our tests have shown
that they increase the success rate of hibernation. So the regression
potential is minimal.
description: updated
Changed in linux-aws (Ubuntu Groovy):
status: New → Fix Committed
This bug was fixed in the package linux-aws - 5.8.0-1014.15
---------------
linux-aws (5.8.0-1014.15) groovy; urgency=medium
* groovy/linux-aws: 5.8.0-1014.15 -proposed tracker (LP: #1903182)
* Groovy update: v5.8.15 upstream stable release (LP: #1902130)
- [Packaging] [aws] module ocelot_board rename
* AWS: add the nitro_enclaves driver (LP: #1903087)
- [Config][aws] update config for NITRO_ENCLAVES
- nitro_enclaves: Add ioctl interface definition
- nitro_enclaves: Define the PCI device interface
- nitro_enclaves: Define enclave info for internal bookkeeping
- nitro_enclaves: Init PCI device driver
- nitro_enclaves: Handle PCI device command requests
- nitro_enclaves: Handle out-of-band PCI device events
- nitro_enclaves: Init misc device providing the ioctl interface
- nitro_enclaves: Add logic for creating an enclave VM
- nitro_enclaves: Add logic for setting an enclave vCPU
- nitro_enclaves: Add logic for getting the enclave image load info
- nitro_enclaves: Add logic for setting an enclave memory region
- nitro_enclaves: Add logic for starting an enclave
- nitro_enclaves: Add logic for terminating an enclave
- nitro_enclaves: Add Kconfig for the Nitro Enclaves driver
- nitro_enclaves: Add Makefile for the Nitro Enclaves driver
- nitro_enclaves: Add sample for ioctl interface usage
- nitro_enclaves: Add overview documentation
- MAINTAINERS: Add entry for the Nitro Enclaves driver
* aws: improve hibernation reliability in groovy (LP: #1902926)
- [Config] [aws] disable CONFIG_INPUT_XEN_KBDDEV_FRONTEND
- [Config] [aws] disable CONFIG_XEN_BALLOON on amd64
- [Config] [aws] enforce CONFIG_XEN_NETDEV_FRONTEND
- [Config] [aws] remove all sound-related modules
* xen hibernation support for linux-aws (LP: #1732512)
- [Config] [aws] make sure CONFIG_SUSPEND is disabled
- [Config] [aws] disable CONFIG_XEN_FBDEV_FRONTEND
* aws: disable CONFIG_DMA_CMA (LP: #1879711)
- [Config] [aws] make sure CONFIG_FB_HYPERV is disabled
* aws: update patch to batch hibernate and resume IO requests (LP: #1902864)
- Revert "UBUNTU: SAUCE: [aws] PM / hibernate: Speed up hibernation by
batching requests"
- PM: hibernate: Batch hibernate and resume IO requests
* aws: disable strict IOMMU TLB invalidation by default (LP: #1902281)
- SAUCE: [aws] iommu: set the default iommu-dma mode as non-strict
[ Ubuntu: 5.8.0-30.32 ]
* groovy/linux: 5.8.0-30.32 -proposed tracker (LP: #1903194)
* Update kernel packaging to support forward porting kernels (LP: #1902957)
- [Debian] Update for leader included in BACKPORT_SUFFIX
* Avoid double newline when running insertchanges (LP: #1903293)
- [Packaging] insertchanges: avoid double newline
* EFI: Fails when BootCurrent entry does not exist (LP: #1899993)
- efivarfs: Replace invalid slashes with exclamation marks in dentries.
* raid10: Block discard is very slow, causing severe delays for mkfs and
fstrim operations (LP: #1896578)
- md: add md_submit_discard_bio() for submitting discard bio
- md/raid10: extend r10bio devs to raid disks
- md/raid10: pull...