aws: improve hibernation reliability in groovy

Bug #1902926 reported by Andrea Righi
Affects Status Importance Assigned to Milestone
linux-aws (Ubuntu)
Fix Released
Fix Released

Bug Description


The 5.8 kernel (groovy) is still not 100% reliable with hibernation, especially on Xen instance types. However, re-aligning the config options as closely as possible with the 5.4 kernel (focal) allows instances to hibernate and resume with a success rate close to 100% (according to our tests on the AWS cloud).

[Test case]

 - spin up an AWS instance (for example a c4.8xlarge instance type)
 - run a memory stress test
 - hibernate
 - resume
 - verify that the system has been resumed correctly and the memory stress test is still running
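
The last step of the test case can be partly automated. The following is a minimal sketch, assuming the PID of the memory stress test was recorded before hibernating; the helper name process_is_running is hypothetical and not part of any tool mentioned in this report.

```python
import os


def process_is_running(pid):
    """Return True if a process with the given PID currently exists."""
    try:
        # Signal 0 performs the existence and permission checks
        # without actually delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No process with this PID.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

After resume, calling process_is_running() with the recorded PID gives a quick indication of whether the stress test survived the hibernation cycle. It does not prove that the workload is still making progress.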


The following fixes must be applied to improve hibernation/resume:

 - disable CONFIG_SUSPEND (suspend to memory is not supported in AWS and
   it can introduce a deadlock condition with the Xen hibernation layer)
 - make sure CONFIG_DMA_CMA is disabled (this introduces another
   deadlock condition with Xen)
 - disable CONFIG_FB_HYPERV: this would enable CONFIG_DMA_CMA and we
   don't want that (to prevent the Xen deadlock); moreover, this driver
   is not needed at all in the AWS environment
 - disable CONFIG_XEN_FBDEV_FRONTEND: this is also not required in the
   AWS environment and it has the potential to break hibernation
 - compile xen-netfront as a module
 - remove all sound-related modules (not really needed in the AWS cloud,
   they were also disabled on 5.4)
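
The options listed above can be checked against a kernel config file. The following is a minimal sketch, assuming the standard .config syntax ("CONFIG_X=y", "CONFIG_X=m", "# CONFIG_X is not set") and treating an absent option as disabled. The function names and the EXPECTED table are illustrative, and the sound-related modules are left out because the report does not list their option names.

```python
import re

# Expected states taken from the list above; "n" means unset or absent.
EXPECTED = {
    "CONFIG_SUSPEND": "n",
    "CONFIG_DMA_CMA": "n",
    "CONFIG_FB_HYPERV": "n",
    "CONFIG_XEN_FBDEV_FRONTEND": "n",
    "CONFIG_XEN_NETDEV_FRONTEND": "m",  # xen-netfront built as a module
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option found in a kernel config text to its value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def check_config(text, expected=EXPECTED):
    """Return {option: (wanted, found)} for every option that differs."""
    options = parse_config(text)
    mismatches = {}
    for name, want in expected.items():
        got = options.get(name, "n")  # absent options count as disabled
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

An empty result from check_config() means that the config text matches the settings described in this report. On an installed Ubuntu system the text would typically be read from the config file shipped in /boot for the running kernel.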

[Regression potential]

These are all .config changes. With them we are basically
re-aligning with the previous 5.4 settings, and our tests have
verified that they increase the hibernation success rate, so the
regression potential is minimal.


Andrea Righi (arighi)
description: updated
Changed in linux-aws (Ubuntu Groovy):
status: New → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.8.0-1014.15

linux-aws (5.8.0-1014.15) groovy; urgency=medium

  * groovy/linux-aws: 5.8.0-1014.15 -proposed tracker (LP: #1903182)

  * Groovy update: v5.8.15 upstream stable release (LP: #1902130)
    - [Packaging] [aws] module ocelot_board rename

  * AWS: add the nitro_enclaves driver (LP: #1903087)
    - [Config][aws] update config for NITRO_ENCLAVES
    - nitro_enclaves: Add ioctl interface definition
    - nitro_enclaves: Define the PCI device interface
    - nitro_enclaves: Define enclave info for internal bookkeeping
    - nitro_enclaves: Init PCI device driver
    - nitro_enclaves: Handle PCI device command requests
    - nitro_enclaves: Handle out-of-band PCI device events
    - nitro_enclaves: Init misc device providing the ioctl interface
    - nitro_enclaves: Add logic for creating an enclave VM
    - nitro_enclaves: Add logic for setting an enclave vCPU
    - nitro_enclaves: Add logic for getting the enclave image load info
    - nitro_enclaves: Add logic for setting an enclave memory region
    - nitro_enclaves: Add logic for starting an enclave
    - nitro_enclaves: Add logic for terminating an enclave
    - nitro_enclaves: Add Kconfig for the Nitro Enclaves driver
    - nitro_enclaves: Add Makefile for the Nitro Enclaves driver
    - nitro_enclaves: Add sample for ioctl interface usage
    - nitro_enclaves: Add overview documentation
    - MAINTAINERS: Add entry for the Nitro Enclaves driver

  * aws: improve hibernation reliability in groovy (LP: #1902926)
    - [Config] [aws] disable CONFIG_INPUT_XEN_KBDDEV_FRONTEND
    - [Config] [aws] disable CONFIG_XEN_BALLOON on amd64
    - [Config] [aws] enforce CONFIG_XEN_NETDEV_FRONTEND
    - [Config] [aws] remove all sound-related modules

  * xen hibernation support for linux-aws (LP: #1732512)
    - [Config] [aws] make sure CONFIG_SUSPEND is disabled
    - [Config] [aws] disable CONFIG_XEN_FBDEV_FRONTEND

  * aws: disable CONFIG_DMA_CMA (LP: #1879711)
    - [Config] [aws] make sure CONFIG_FB_HYPERV is disabled

  * aws: update patch to batch hibernate and resume IO requests (LP: #1902864)
    - Revert "UBUNTU: SAUCE: [aws] PM / hibernate: Speed up hibernation by
      batching requests"
    - PM: hibernate: Batch hibernate and resume IO requests

  * aws: disable strict IOMMU TLB invalidation by default (LP: #1902281)
    - SAUCE: [aws] iommu: set the default iommu-dma mode as non-strict

  [ Ubuntu: 5.8.0-30.32 ]

  * groovy/linux: 5.8.0-30.32 -proposed tracker (LP: #1903194)
  * Update kernel packaging to support forward porting kernels (LP: #1902957)
    - [Debian] Update for leader included in BACKPORT_SUFFIX
  * Avoid double newline when running insertchanges (LP: #1903293)
    - [Packaging] insertchanges: avoid double newline
  * EFI: Fails when BootCurrent entry does not exist (LP: #1899993)
    - efivarfs: Replace invalid slashes with exclamation marks in dentries.
  * raid10: Block discard is very slow, causing severe delays for mkfs and
    fstrim operations (LP: #1896578)
    - md: add md_submit_discard_bio() for submitting discard bio
    - md/raid10: extend r10bio devs to raid disks
    - md/raid10: pull...

Changed in linux-aws (Ubuntu Groovy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.8.0-1018.20+21.04.1

linux-aws (5.8.0-1018.20+21.04.1) hirsute; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  [ Ubuntu: 5.8.0-1018.20 ]

  * debian/scripts/file-downloader does not handle positive failures correctly
    (LP: #1878897)
    - [Packaging] file-downloader not handling positive failures correctly
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * CVE-2021-1052 // CVE-2021-1053
    - [Packaging] NVIDIA -- Add the NVIDIA 460 driver

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 07 Jan 2021 10:47:22 -0300

Changed in linux-aws (Ubuntu):
status: New → Fix Released