aws: update patch to batch hibernate and resume IO requests

Bug #1902864 reported by Andrea Righi
Affects              Status         Importance   Assigned to    Milestone
linux-aws (Ubuntu)   Fix Released   Undecided    Unassigned
  Focal              Fix Released   Medium       Andrea Righi
  Groovy             Fix Released   Medium       Andrea Righi

Bug Description

[Impact]

During hibernation and resume, the kernel submits an individual IO request for each page of data. The aws kernel carries a custom SAUCE patch that batches IO requests together to achieve better performance.
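To illustrate why batching matters, here is a minimal toy model in Python. It is not the kernel code and not either patch. It only counts how many requests would reach a device when contiguous 4 KiB pages are submitted one at a time versus merged into larger requests, which is the general effect that block-layer plugging (`blk_start_plug()` / `blk_finish_plug()`) aims for. The function names and the 128-pages-per-request limit are assumptions made for the example.

```python
# Toy model (hypothetical, userspace only): compare the number of requests
# issued for per-page submission versus batched submission.

def submit_individually(pages):
    """Every page becomes its own (start_page, length) request."""
    return [(page, 1) for page in pages]


def submit_batched(pages, max_pages_per_request=128):
    """Merge runs of adjacent pages into (start_page, length) requests.

    max_pages_per_request is an assumed device limit, chosen for illustration.
    """
    requests = []
    for page in pages:
        if requests:
            start, length = requests[-1]
            # Extend the previous request if this page directly follows it.
            if page == start + length and length < max_pages_per_request:
                requests[-1] = (start, length + 1)
                continue
        requests.append((page, 1))
    return requests


pages = list(range(1024))  # 4 MiB of contiguous image data in 4 KiB pages
print("individual requests:", len(submit_individually(pages)))
print("batched requests:   ", len(submit_batched(pages)))
```

On a volume whose throughput is capped by IOPS rather than bandwidth, fewer and larger requests translate directly into less time spent waiting on the request quota.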

Recently a patch designed to achieve the same goal has been applied upstream. This patch has been acknowledged by Amazon and it has shown a performance improvement.

Moreover, the upstream patch is much cleaner than the custom patch we are carrying, so it makes sense to drop the custom patch and apply the upstream one.

[Test case]

Hibernate and resume an instance, and measure the time required to perform these operations.

Performance result reported by Amazon:

    One hibernate and resume cycle for 16GB RAM out of 32GB in use takes
    around 21 minutes before the change, and 1 minute after the change on
    a system with limited storage IOPS.
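As a rough sanity check on these numbers, the arithmetic below estimates how many page-sized requests such a cycle involves. It is a back-of-the-envelope sketch under stated assumptions, none of which come from the report: 4 KiB pages, "16GB" read as 16 GiB, every in-use page written once on hibernate and read once on resume, and no image compression.

```python
# Back-of-the-envelope estimate (assumptions: 4 KiB pages, 16 GiB of data,
# one write on hibernate plus one read on resume per page, no compression).
PAGE_SIZE = 4096
ram_in_use = 16 * 2**30

pages = ram_in_use // PAGE_SIZE          # page-sized requests per direction
requests_per_cycle = 2 * pages           # hibernate (write) + resume (read)

before_seconds = 21 * 60                 # "around 21 minutes" from the report
after_seconds = 1 * 60                   # "1 minute" from the report

speedup = before_seconds / after_seconds
avg_rate_before = requests_per_cycle / before_seconds

print(f"pages per direction: {pages}")
print(f"speedup: {speedup:.0f}x")
print(f"average unbatched request rate: {avg_rate_before:.0f} requests/s")
```

Under these assumptions the unbatched cycle issues millions of requests, so an IOPS-limited volume, rather than raw bandwidth, plausibly dominates the 21-minute figure.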

[Fix]

Apply the following upstream commit:

 55c4478a8f0ecedc0c1a0c9379380249985c372a ("PM: hibernate: Batch hibernate and resume IO requests")

Drop the custom aws SAUCE patch:

 11c3fa3b29722124f5c9122671983614383686db ("UBUNTU: SAUCE: [aws] PM / hibernate: Speed up hibernation by batching requests")

[Regression potential]

The upstream patch replaces a custom patch that does the same thing. The only potential regression is a performance drop, but neither Amazon's tests nor our own showed one. Any other kind of regression would be an upstream regression.

Stefan Bader (smb)
Changed in linux-aws (Ubuntu Groovy):
importance: Undecided → Medium
status: New → In Progress
Changed in linux-aws (Ubuntu):
status: New → Invalid
Ian May (ian-may)
Changed in linux-aws (Ubuntu Groovy):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.8.0-1014.15

---------------
linux-aws (5.8.0-1014.15) groovy; urgency=medium

  * groovy/linux-aws: 5.8.0-1014.15 -proposed tracker (LP: #1903182)

  * Groovy update: v5.8.15 upstream stable release (LP: #1902130)
    - [Packaging] [aws] module ocelot_board rename

  * AWS: add the nitro_enclaves driver (LP: #1903087)
    - [Config][aws] update config for NITRO_ENCLAVES
    - nitro_enclaves: Add ioctl interface definition
    - nitro_enclaves: Define the PCI device interface
    - nitro_enclaves: Define enclave info for internal bookkeeping
    - nitro_enclaves: Init PCI device driver
    - nitro_enclaves: Handle PCI device command requests
    - nitro_enclaves: Handle out-of-band PCI device events
    - nitro_enclaves: Init misc device providing the ioctl interface
    - nitro_enclaves: Add logic for creating an enclave VM
    - nitro_enclaves: Add logic for setting an enclave vCPU
    - nitro_enclaves: Add logic for getting the enclave image load info
    - nitro_enclaves: Add logic for setting an enclave memory region
    - nitro_enclaves: Add logic for starting an enclave
    - nitro_enclaves: Add logic for terminating an enclave
    - nitro_enclaves: Add Kconfig for the Nitro Enclaves driver
    - nitro_enclaves: Add Makefile for the Nitro Enclaves driver
    - nitro_enclaves: Add sample for ioctl interface usage
    - nitro_enclaves: Add overview documentation
    - MAINTAINERS: Add entry for the Nitro Enclaves driver

  * aws: improve hibernation reliability in groovy (LP: #1902926)
    - [Config] [aws] disable CONFIG_INPUT_XEN_KBDDEV_FRONTEND
    - [Config] [aws] disable CONFIG_XEN_BALLOON on amd64
    - [Config] [aws] enforce CONFIG_XEN_NETDEV_FRONTEND
    - [Config] [aws] remove all sound-related modules

  * xen hibernation support for linux-aws (LP: #1732512)
    - [Config] [aws] make sure CONFIG_SUSPEND is disabled
    - [Config] [aws] disable CONFIG_XEN_FBDEV_FRONTEND

  * aws: disable CONFIG_DMA_CMA (LP: #1879711)
    - [Config] [aws] make sure CONFIG_FB_HYPERV is disabled

  * aws: update patch to batch hibernate and resume IO requests (LP: #1902864)
    - Revert "UBUNTU: SAUCE: [aws] PM / hibernate: Speed up hibernation by
      batching requests"
    - PM: hibernate: Batch hibernate and resume IO requests

  * aws: disable strict IOMMU TLB invalidation by default (LP: #1902281)
    - SAUCE: [aws] iommu: set the default iommu-dma mode as non-strict

  [ Ubuntu: 5.8.0-30.32 ]

  * groovy/linux: 5.8.0-30.32 -proposed tracker (LP: #1903194)
  * Update kernel packaging to support forward porting kernels (LP: #1902957)
    - [Debian] Update for leader included in BACKPORT_SUFFIX
  * Avoid double newline when running insertchanges (LP: #1903293)
    - [Packaging] insertchanges: avoid double newline
  * EFI: Fails when BootCurrent entry does not exist (LP: #1899993)
    - efivarfs: Replace invalid slashes with exclamation marks in dentries.
  * raid10: Block discard is very slow, causing severe delays for mkfs and
    fstrim operations (LP: #1896578)
    - md: add md_submit_discard_bio() for submitting discard bio
    - md/raid10: extend r10bio devs to raid disks
    - md/raid10: pull...

Changed in linux-aws (Ubuntu Groovy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.8.0-1018.20+21.04.1

---------------
linux-aws (5.8.0-1018.20+21.04.1) hirsute; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  [ Ubuntu: 5.8.0-1018.20 ]

  * debian/scripts/file-downloader does not handle positive failures correctly
    (LP: #1878897)
    - [Packaging] file-downloader not handling positive failures correctly
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * CVE-2021-1052 // CVE-2021-1053
    - [Packaging] NVIDIA -- Add the NVIDIA 460 driver

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 07 Jan 2021 10:47:22 -0300

Changed in linux-aws (Ubuntu):
status: Invalid → Fix Released
Andrea Righi (arighi)
Changed in linux-aws (Ubuntu Focal):
importance: Undecided → Medium
assignee: nobody → Andrea Righi (arighi)
Andrea Righi (arighi)
Changed in linux-aws (Ubuntu Groovy):
assignee: nobody → Andrea Righi (arighi)
Stefan Bader (smb)
Changed in linux-aws (Ubuntu Focal):
status: New → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 5.4.0-1043.45

---------------
linux-aws (5.4.0-1043.45) focal; urgency=medium

  * focal/linux-aws: 5.4.0-1043.45 -proposed tracker (LP: #1923247)

  * linux-aws 5.4.0-1042.44 has incorrect DKMS versions (LP: #1923245)
    - [Packaging] Fix incorrect DKMS versions

linux-aws (5.4.0-1042.44) focal; urgency=medium

  * focal/linux-aws: 5.4.0-1042.44 -proposed tracker (LP: #1921016)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * Enforce CONFIG_DRM_BOCHS=m (LP: #1916290)
    - [Config] aws: Enforce CONFIG_DRM_BOCHS=m

  * aws: fix hibernation issues on c5.18xlarge (LP: #1918694)
    - SAUCE: aws: kvm: double the size of hv_clock_boot

  * aws: update Xen hibernation patch set (LP: #1913410)
    - Revert "UBUNTU: SAUCE: xen-netfront: prevent unnecessary close on hibernate"
    - Revert "UBUNTU: SAUCE: xen: Update sched clock offset to avoid system
      instability in hibernation"
    - Revert "UBUNTU: SAUCE: xen: Introduce wrapper for save/restore sched clock
      offset"
    - Revert "UBUNTU: SAUCE: x86/xen: save and restore steal clock"
    - Revert "UBUNTU: SAUCE: xen/time: introduce xen_{save,restore}_steal_clock"
    - Revert "UBUNTU: SAUCE: xen-netfront: add callbacks for PM suspend and
      hibernation"
    - Revert "UBUNTU: SAUCE: xen-blkfront: add callbacks for PM suspend and
      hibernation"
    - Revert "UBUNTU: SAUCE: genirq: Shutdown irq chips in suspend/resume during
      hibernation"
    - Revert "UBUNTU: SAUCE: x86/xen: add system core suspend and resume
      callbacks"
    - Revert "UBUNTU: SAUCE: x86/xen: Introduce new function to map
      HYPERVISOR_shared_info on Resume"
    - Revert "UBUNTU: SAUCE: xenbus: add freeze/thaw/restore callbacks support"
    - Revert "UBUNTU: SAUCE: xen/manage: keep track of the on-going suspend mode"
    - SAUCE: xen/manage: keep track of the on-going suspend mode
    - SAUCE: xen/manage: introduce helper function to know the on-going suspend
      mode
    - SAUCE: xenbus: add freeze/thaw/restore callbacks support
    - SAUCE: x86/xen: Introduce new function to map HYPERVISOR_shared_info on
      Resume
    - SAUCE: x86/xen: add system core suspend and resume callbacks
    - SAUCE: xen-blkfront: add callbacks for PM suspend and hibernation
    - SAUCE: xen-netfront: add callbacks for PM suspend and hibernation support
    - SAUCE: xen/time: introduce xen_{save,restore}_steal_clock
    - SAUCE: x86/xen: save and restore steal clock
    - SAUCE: xen/events: add xen_shutdown_pirqs helper function
    - SAUCE: x86/xen: close event channels for PIRQs in system core suspend
      callback
    - SAUCE: xen-blkfront: add 'persistent_grants' parameter
    - SAUCE: Revert "xen: dont fiddle with event channel masking in
      suspend/resume"
    - SAUCE: xen-blkfront: Fixed blkfront_restore to remove a call to negotiate_mq
    - SAUCE: block: xen-blkfront: consider new dom0 features on restore
    - SAUCE: xen: restore pirqs on resume from hibernation.
    - SAUCE: xen: Only restore the ACPI SCI interrupt in xen_restore_pirqs.
    - SAUCE: xen-netfront: call netif_device_attach on resume
    - SAUCE: xen: Restore xen-pirqs o...

Changed in linux-aws (Ubuntu Focal):
status: Fix Committed → Fix Released