Hibernation fails when an additional swapfile is added due to priority mismatch

Bug #1968805 reported by Matthew Ruffell
This bug affects 2 people
Affects                      Status        Importance  Assigned to      Milestone
ec2-hibinit-agent (Ubuntu)   Status tracked in Kinetic
  Bionic                     Fix Released  Medium      Matthew Ruffell
  Focal                      Fix Released  Medium      Matthew Ruffell
  Impish                     Fix Released  Medium      Matthew Ruffell
  Jammy                      Fix Released  Medium      Matthew Ruffell
  Kinetic                    Fix Released  Medium      Matthew Ruffell

Bug Description

[Impact]

It is not uncommon for users to add a swapfile to their AWS instance, in case they run short of memory. For users that optionally enable Hibernation support, the swapfile generated by ec2-hibinit-agent, /swap-hibinit, must always have the highest priority when suspending the system, since ec2-hibinit-agent sets up /swap-hibinit as the hibernation target via the resume=UUID=<uuid> and resume_offset=<offset> kernel command line parameters.

ec2-hibinit-agent keeps /swap-hibinit swapped off during normal instance use. Right before Hibernation occurs, /etc/acpi/actions/sleep.sh runs swapon on /swap-hibinit and then calls systemctl hibernate:

do_hibernate() {
    if [ -d /run/systemd/system ]; then
        systemctl hibernate

case "$2" in
    SBTN)
        swapon /swap-hibinit && do_hibernate

Something changed between 18.04 and 20.04, such that each newly enabled swapfile is given a lower priority than the swapfiles that are already active:

On Focal and later, if we simply swapon the /swap-hibinit generated by ec2-hibinit-agent, we see its priority is -2:

$ sudo swapon /swap-hibinit
$ swapon --show
NAME TYPE SIZE USED PRIO
/swap-hibinit file 3.9G 0B -2

Turning it off:
$ sudo swapoff /swap-hibinit
$ swapon --show
NAME TYPE SIZE USED PRIO

Let's add /swapfile:

$ sudo swapon /swapfile
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

Now we enable /swap-hibinit again, and see it is -3:

$ sudo swapon /swap-hibinit
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2
/swap-hibinit file 3.9G 0B -3

Let's add another swapfile, /swapfile-second, and we see -2, -3, -4:

$ sudo swapon /swapfile-second
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2
/swap-hibinit file 3.9G 0B -3
/swapfile-second file 4G 0B -4
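The ordering in the outputs above can also be checked mechanically. The following is a minimal sketch, not part of ec2-hibinit-agent: highest_priority_swap is a hypothetical helper name, and it assumes input in the five-column layout that swapon --show prints (NAME TYPE SIZE USED PRIO) with no spaces in the file names.

```shell
# Hypothetical helper (not from the package): read `swapon --show` style
# output on stdin and print the NAME of the swap area with the highest PRIO.
highest_priority_swap() {
    awk 'NR > 1 && NF >= 5 {
             # Skip the header line; keep the row with the largest priority.
             if (best == "" || $5 + 0 > max) { max = $5 + 0; best = $1 }
         }
         END { if (best != "") print best }'
}

# On a live system:
#   swapon --show | highest_priority_swap
```

With the three swapfiles listed above, this prints /swapfile rather than /swap-hibinit, which is the mismatch this bug describes.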

What happens is that if we have a swapfile, say /swapfile, at the default priority of -2, then when we go to hibernate, the swapon in /etc/acpi/actions/sleep.sh gives /swap-hibinit a priority of -3. systemd / the kernel then selects the highest-priority swapfile to hibernate to, in this case /swapfile, which is NOT set up for resume= or resume_offset= on the kernel command line, and hibernation fails.

Apr 11 21:08:15 ip-172-31-84-225 kernel: [ 240.990073] Adding 4095996k swap on /swap-hibinit. Priority:-3 extents:6 across:4644860k SSFS

This leaves the instance in the "Stopping" state on the EC2 console until it hits the 20 minute timeout, at which point it is force stopped.

The fix is to set the priority when we swapon /swap-hibinit to something higher than any other swapfile, to ensure we hibernate to /swap-hibinit.
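As a rough illustration of that idea, the sketch below enables /swap-hibinit with an explicit priority using swapon's --priority option. It is not the packaged patch: the function name, the HIBINIT_SWAP and SWAPON variables, and the dry-run mechanism are assumptions made for this example, and 32767 is the upper bound documented in swapon(8).

```shell
# Hypothetical sketch of the approach, not the shipped sleep.sh.
HIBINIT_SWAP="${HIBINIT_SWAP:-/swap-hibinit}"
MAX_SWAP_PRIORITY=32767   # largest priority documented in swapon(8)

enable_hibernation_swap() {
    # SWAPON may be overridden (for example SWAPON=echo) for a dry run.
    ${SWAPON:-swapon} --priority "$MAX_SWAP_PRIORITY" "$HIBINIT_SWAP"
}

# In an ACPI handler this would take the place of the plain swapon call:
#   enable_hibernation_swap && do_hibernate
```

Running it with SWAPON=echo prints the arguments that would be passed to swapon without touching the system.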

[Testcase]

From the EC2 console, select "Launch Instance".

Create a:

- t2.medium
- Ubuntu 20.04, 21.10 or 22.04
- 20gb storage space, advanced > enable encryption > yes.
- Advanced settings > Stop State (Hibernation) Support > Enabled

On boot, wait for ec2-hibinit-agent to complete hibinit-agent.service, and check that /swap-hibinit has been created and is swapped off.

$ ll /swap-hibinit

Add a swapfile, and switch it on:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

Go back to EC2 console, "Instance State" > "Hibernate".

You will see this in journalctl:

Mar 15 11:41:54 ip-172-31-27-108 kernel: [ 520.121761] Adding 16095656k swap on /swap-hibinit. Priority:-3 extents:13 across:17611176k SSFS
Mar 15 11:41:54 ip-172-31-27-108 root: ACPI action undefined: LNXSLPBN:00

and the instance will not hibernate. EC2 console will report "Stopping" for 20 minutes until it times out and is force stopped.
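To pull the assigned priority out of such a log line without reading it by eye, something like the following could be used. This is only a sketch: swap_priority_from_log is a hypothetical name, and it assumes the kernel message keeps the "swap on /swap-hibinit. Priority:<n>" wording shown above.

```shell
# Hypothetical helper: read journal/kernel log lines on stdin and print the
# priority the kernel reported when /swap-hibinit was enabled.
swap_priority_from_log() {
    sed -n 's|.*swap on /swap-hibinit\. Priority:\(-\{0,1\}[0-9][0-9]*\).*|\1|p'
}

# On a live system:
#   journalctl -k | swap_priority_from_log | tail -n 1
```

A negative value here alongside another active swapfile indicates that /swap-hibinit was not the highest-priority swap area at hibernation time.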

If you enable the following ppa and install the test ec2-hibinit-agent package:

https://launchpad.net/~mruffell/+archive/ubuntu/sf331069-test

Hibernation should succeed within a minute or two.

[Where problems could occur]

This change will only affect users of instances where Hibernation has been explicitly enabled, either from the EC2 instance launch advanced settings, or via the "--hibernation-options Configured=true" parameter to the "aws ec2" command. For all other users, including those with swapfiles enabled, this change will have no effect.

We are changing the /swap-hibinit file to be maximum priority right before we hibernate, to ensure it is the swapfile selected to hibernate to. Since we swapoff /swap-hibinit as soon as we resume, /swap-hibinit is used solely for hibernation, and not for regular swap space, so it is unlikely to cause any regressions to users with their own swapfiles configured with various priorities.

A potential risk is users that do not use /swap-hibinit, and use their own swapfile for hibernation, and overwrite the changes ec2-hibinit-agent makes to grub files to set the resume=UUID=<uuid> and resume_offset=<offset> values. I believe such users would likely remove or purge the ec2-hibinit-agent package, since hibinit-agent.service runs at startup and re-adds the grub configuration for /swap-hibinit whether you like it or not, and having /swap-hibinit around would waste disk space that you would be paying for. Because of this, I believe that this change will not break users who hibernate to their own swapfiles, because they would have removed ec2-hibinit-agent on instance creation.

[Other info]

Chris Newcomer came across the above upstream bug, which seems to be the same issue:

https://github.com/aws/amazon-ec2-hibinit-agent/issues/20

The reporter, Ben Mares, suggests a patch to /etc/acpi/actions/sleep.sh to either read the value of a bash environment variable swap_priority, or default to 10.

https://github.com/aws/amazon-ec2-hibinit-agent/pull/21

I'm not exactly on board with the environment variable, or the default magic number of 10, as we don't know how our users are setting up swapfiles, and what priorities they set them to. I think we should instead just set the priority to the maximum, 32767.

I opened a pull request, which has been reviewed by the AWS EC2 team:

https://github.com/aws/amazon-ec2-hibinit-agent/pull/22

The fix was merged upstream with:

commit a2303d269610a6e7415c5045766da605eaa7e30f
From: Matthew Ruffell <email address hidden>
Date: Wed, 20 Apr 2022 15:59:25 +1200
Subject: Swapon with maximum priority before hibernation
Link: https://github.com/aws/amazon-ec2-hibinit-agent/commit/a2303d269610a6e7415c5045766da605eaa7e30f

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: New → In Progress
Changed in ec2-hibinit-agent (Ubuntu Impish):
status: New → In Progress
Changed in ec2-hibinit-agent (Ubuntu Jammy):
status: New → In Progress
Changed in ec2-hibinit-agent (Ubuntu Focal):
importance: Undecided → Medium
Changed in ec2-hibinit-agent (Ubuntu Impish):
importance: Undecided → Medium
Changed in ec2-hibinit-agent (Ubuntu Jammy):
importance: Undecided → Medium
Changed in ec2-hibinit-agent (Ubuntu Focal):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in ec2-hibinit-agent (Ubuntu Impish):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in ec2-hibinit-agent (Ubuntu Jammy):
assignee: nobody → Matthew Ruffell (mruffell)
description: updated
tags: added: sts
Matthew Ruffell (mruffell) wrote :

Attached is a debdiff for ec2-hibinit-agent on Focal which fixes this issue.

Matthew Ruffell (mruffell) wrote :

Attached is a debdiff for ec2-hibinit-agent for impish which fixes this issue.

Matthew Ruffell (mruffell) wrote :

Attached is a debdiff for ec2-hibinit-agent on Jammy which fixes this issue.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "debdiff for ec2-hibinit-agent for focal" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Dan Streetman (ddstreet) wrote :

I have no comment on working around this by changing ec2-hibinit-agent, but is this a dup of bug 1910252 (i.e. maybe fixing that bug would solve this as well?)

I never found time to fix that bug with upstream systemd (and sru the fix) but it's probably worth doing at some point.

Matthew Ruffell (mruffell) wrote :

I opened a pull request upstream with the same patch:

https://github.com/aws/amazon-ec2-hibinit-agent/pull/22

Matthew Ruffell (mruffell) wrote :

Attached is a debdiff of ec2-hibinit-agent for Bionic, since it needs it too.

tags: added: sts-sponsor
Matthew Ruffell (mruffell) wrote :

The fix has now been merged upstream with:

commit a2303d269610a6e7415c5045766da605eaa7e30f
From: Matthew Ruffell <email address hidden>
Date: Wed, 20 Apr 2022 15:59:25 +1200
Subject: Swapon with maximum priority before hibernation
Link: https://github.com/aws/amazon-ec2-hibinit-agent/commit/a2303d269610a6e7415c5045766da605eaa7e30f

description: updated
Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Matthew Ruffell (mruffell)
Matthew Ruffell (mruffell) wrote :

Attached is a V2 patch for Kinetic.

Matthew Ruffell (mruffell) wrote :

Attached is a V2 patch for Jammy

Matthew Ruffell (mruffell) wrote :

Attached is a V2 patch for Impish.

Matthew Ruffell (mruffell) wrote :

Attached is a V2 patch for Focal.

Matthew Ruffell (mruffell) wrote :
tags: added: sts-sponsor-halves
removed: sts-sponsor
Dan Streetman (ddstreet) wrote :

uploaded to k, thanks!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu12

---------------
ec2-hibinit-agent (1.0.0-0ubuntu12) kinetic; urgency=medium

  * Swapon with maximum priority right before hibernation. This resolves
    swapfile priority issues with additional or multiple swapfiles enabled.
    (LP: #1968805)
    - d/p/lp1968805-Swapon-with-maximum-priority-before-hibernation.patch

 -- Matthew Ruffell <email address hidden> Wed, 11 May 2022 16:02:52 +1200

Changed in ec2-hibinit-agent (Ubuntu Kinetic):
status: In Progress → Fix Released
Heitor Alves de Siqueira (halves) wrote :

Uploaded to the stable series. Thanks!

Robie Basak (racb) wrote : Please test proposed package

Hello Matthew, or anyone else affected,

Accepted ec2-hibinit-agent into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu11.22.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ec2-hibinit-agent (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Changed in ec2-hibinit-agent (Ubuntu Impish):
status: In Progress → Fix Committed
tags: added: verification-needed-impish
Robie Basak (racb) wrote :

Hello Matthew, or anyone else affected,

Accepted ec2-hibinit-agent into impish-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu11.21.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-impish to verification-done-impish. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-impish. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed-focal
Robie Basak (racb) wrote :

Hello Matthew, or anyone else affected,

Accepted ec2-hibinit-agent into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu9.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed-bionic
Robie Basak (racb) wrote :

Hello Matthew, or anyone else affected,

Accepted ec2-hibinit-agent into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4~18.04.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Robie Basak (racb) wrote :

Unsubscribing sponsors.

Matthew Ruffell (mruffell) wrote :

Performing verification for Focal.

For some reason hibernation just refused to work on any Xen based instance type, but it works fine on KVM based instances, such as c5, t3 etc. This is being looked into on bug 1968062, but I think it affects Focal as well.

Diverging from the testcase, and using c5.large (kvm) instances instead of t2.medium (xen).

I started a c5.large instance with 20gb of storage, with advanced > enable encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

This is using the current version of ec2-hibinit-agent from -updates:

$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu9.1

I went to the EC2 console and pressed Instance State > Hibernate.

The instance stopped within 30 seconds, and hibernation was successful. I started the instance again.

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

I went back to the console and pressed Instance State > Hibernate.

The following was written to the journal:

May 26 03:25:27 ip-172-31-26-217 systemd-logind[523]: Suspend key pressed.
May 26 03:25:27 ip-172-31-26-217 systemd-logind[523]: Requested suspend operation not supported, ignoring.
May 26 03:25:27 ip-172-31-26-217 udisksd[525]: udisks_mount_get_mount_path: assertion 'mount->type == UDISKS_MOUNT_TYPE_FILESYSTEM' failed
May 26 03:25:27 ip-172-31-26-217 kernel: Adding 4095996k swap on /swap-hibinit. Priority:-3 extents:9 across:5087228k SSFS
May 26 03:25:27 ip-172-31-26-217 root[2720]: ACPI action undefined: LNXSLPBN:00

The instance did not hibernate, and stayed running for 20 minutes, until it timed out and was force stopped.

We can see from the logs that /swap-hibinit was added at priority -3, and /swapfile is -2. /swapfile was chosen for hibernation, but since the kernel command line and grub are not set up for it, hibernation fails.

I terminated the instance.

I then created a new instance, again a c5.large, with 20gb of storage, with Advanced > Enable Encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

I enabled -proposed, and installed ec2-hibinit-agent 1.0.0-0ubuntu9.2

Setting up ec2-hibinit-agent (1.0.0-0ubuntu9.2) ...
Installing new version of config file /etc/acpi/actions/sleep.sh ...
$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu9.2

I went to the EC2 console and pressed Instance State > Hibernate.

Again, the instance stopped within 30 seconds, and hibernation was successful.

I started the instance again, and configured a second swapfile:

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/ze...


tags: added: verification-done-focal
removed: verification-needed-focal
Matthew Ruffell (mruffell) wrote :

Performing verification for Impish.

For some reason hibernation just refused to work on any Xen based instance type, but it works fine on KVM based instances, such as c5, t3 etc. This is being looked into on bug 1968062, but I think it affects Impish as well.

Diverging from the testcase, and using c5.large (kvm) instances instead of t2.medium (xen).

I started a c5.large instance with 20gb of storage, with advanced > enable encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

This is using the current version of ec2-hibinit-agent from -updates:

$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu11

I went to the EC2 console and pressed Instance State > Hibernate.

The instance stopped within 30 seconds, and hibernation was successful. I started the instance again.

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

I went back to the console and pressed Instance State > Hibernate.

The following was written to the journal:

May 26 05:06:22 ip-172-31-45-200 systemd-logind[519]: Suspend key pressed.
May 26 05:06:22 ip-172-31-45-200 systemd-logind[519]: Requested suspend operation not supported, ignoring.
May 26 05:06:22 ip-172-31-45-200 kernel: Adding 4095996k swap on /swap-hibinit. Priority:-3 extents:5 across:4505596k SSFS
May 26 05:06:22 ip-172-31-45-200 root[2917]: ACPI action undefined: LNXSLPBN:00

The instance did not hibernate, and stayed running for 20 minutes, until it timed out and was force stopped.

We can see from the logs that /swap-hibinit was added at priority -3, and /swapfile is -2. /swapfile was chosen for hibernation, but since the kernel command line and grub are not set up for it, hibernation fails.

I terminated the instance.

I then created a new instance, again a c5.large, with 20gb of storage, with Advanced > Enable Encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

I enabled -proposed, and installed ec2-hibinit-agent 1.0.0-0ubuntu11.21.10.1

Setting up ec2-hibinit-agent (1.0.0-0ubuntu11.21.10.1) ...
Installing new version of config file /etc/acpi/actions/sleep.sh ...
$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu11.21.10.1

I went to the EC2 console and pressed Instance State > Hibernate.

Again, the instance stopped within 30 seconds, and hibernation was successful.

I started the instance again, and configured a second swapfile:

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
4194304+0 records in
4194304+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) c...


tags: added: verification-done-impish
removed: verification-needed-impish
Matthew Ruffell (mruffell) wrote :

Performing verification for Jammy.

For some reason hibernation just refused to work on any Xen based instance type, but it works fine on KVM based instances, such as c5, t3 etc. This is being looked into on bug 1968062, but I think it affects Jammy as well.

Diverging from the testcase, and using c5.large (kvm) instances instead of t2.medium (xen).

I started a c5.large instance with 20gb of storage, with advanced > enable encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

This is using the current version of ec2-hibinit-agent from -updates:

$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu11

I went to the EC2 console and pressed Instance State > Hibernate.

The instance stopped within 30 seconds, and hibernation was successful. I started the instance again.

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

I went back to the console and pressed Instance State > Hibernate.

The following was written to the journal:

May 26 05:06:22 ip-172-31-45-200 systemd-logind[519]: Suspend key pressed.
May 26 05:06:22 ip-172-31-45-200 systemd-logind[519]: Requested suspend operation not supported, ignoring.
May 26 05:06:22 ip-172-31-45-200 kernel: Adding 4095996k swap on /swap-hibinit. Priority:-3 extents:5 across:4505596k SSFS
May 26 05:06:22 ip-172-31-45-200 root[2917]: ACPI action undefined: LNXSLPBN:00

The instance did not hibernate, and stayed running for 20 minutes, until it timed out and was force stopped.

We can see from the logs that /swap-hibinit was added at priority -3, and /swapfile is -2. /swapfile was chosen for hibernation, but since the kernel command line and grub are not set up for it, hibernation fails.

I terminated the instance.

I then created a new instance, again a c5.large, with 20gb of storage, with Advanced > Enable Encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I waited for hibinit-agent.service to complete by watching
$ sudo systemctl status hibinit-agent.service

I enabled -proposed, and installed ec2-hibinit-agent 1.0.0-0ubuntu11.22.04.1

Setting up ec2-hibinit-agent (1.0.0-0ubuntu11.22.04.1) ...
Installing new version of config file /etc/acpi/actions/sleep.sh ...
$ apt-cache policy ec2-hibinit-agent | grep Installed
  Installed: 1.0.0-0ubuntu11.22.04.1

I went to the EC2 console and pressed Instance State > Hibernate.

Again, the instance stopped within 30 seconds, and hibernation was successful.

I started the instance again, and configured a second swapfile:

From there, I made a swapfile, and enabled it:

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
4194304+0 records in
4194304+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) cop...


tags: added: verification-done-jammy
removed: verification-needed-jammy
Matthew Ruffell (mruffell) wrote :

Performing verification for Bionic.

Bionic seems to hibernate okay on both Xen and KVM based instances, so I tested both t2.medium and c5.large instance types. Each had 20gb of storage, with advanced > enable encryption > yes.
I also made sure to enable Advanced settings > Stop State (Hibernation) Support > Enabled.

I started two sets of instances, one with 1.0.0-0ubuntu4~18.04.5 from -updates, and the other with 1.0.0-0ubuntu4~18.04.6 from -proposed.

After leaving each instance for a few minutes to finish setting up hibinit-agent.service, I pressed Instance State > Hibernate.

Both instances hibernated successfully, and within 30 seconds of pressing the hibernate button.

I then started both instances, and ssh'd in. My screen sessions were both active, so hibernation was successful.

The base case of no additional swapfile configured results in correct hibernation for both -updates and -proposed packages.

I then followed the below steps, and added an additional swapfile to each instance.

$ sudo fallocate -l 4G /swapfile
$ sudo dd if=/dev/zero of=/swapfile bs=1024 count=4194304
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
$ swapon --show
NAME TYPE SIZE USED PRIO
/swapfile file 4G 0B -2

I went back to the console and pressed Instance State > Hibernate.

This time, both instances hibernated successfully, and completed within 30 seconds of pressing the hibernate button. Bionic has a different behaviour to that of Focal and onward, where the -updates package would stay running, and be force stopped 20 minutes later. On Bionic, both hibernate successfully.

I then started both instances. Both instances came up correctly, and I could ssh in. But my screen sessions were missing, and journalctl showed this was a fresh boot for both instances. It seems Bionic has issues resuming from hibernation when there is an additional swapfile set, but most users do not notice it, because the instance comes up and starts correctly, as if hibernation had been successful.

Journalctl in both suggests that it wasn't aware that it was hibernated in the first place as no attempt to resume was made, so perhaps we are setting the resume= variable on the kernel command line wrong. I checked, and Bionic sets it as:

resume_offset=401408 resume=PARTUUID=80f6dacd-01

I checked the offsets manually with

$ findmnt -no PARTUUID -T /swap-hibinit
80f6dacd-01

$ sudo filefrag -v /swap-hibinit
Filesystem type is: ef53
File size of /swap-hibinit is 4194304000 (1024000 blocks of 4096 bytes)
 ext: logical_offset: physical_offset: length: expected: flags:
   0: 0.. 0: 401408.. 401408: 1:

everything matched. Very strange. It should have resumed...
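The manual comparison above can be scripted roughly as follows. This is a sketch under assumptions: both helper names are hypothetical, the first expects filefrag -v output in the layout shown above (extent 0 on a line starting with "0:"), and the second expects a kernel command line containing a resume_offset= parameter.

```shell
# Hypothetical helper: print the physical offset of the first extent from
# `filefrag -v` output read on stdin.
first_physical_offset() {
    awk '$1 == "0:" { sub(/\.\.$/, "", $4); print $4; exit }'
}

# Hypothetical helper: print the resume_offset= value from a kernel command
# line read on stdin.
cmdline_resume_offset() {
    tr ' ' '\n' | sed -n 's/^resume_offset=//p'
}

# On a live system the two values should be equal:
#   sudo filefrag -v /swap-hibinit | first_physical_offset
#   cmdline_resume_offset < /proc/cmdline
```

For the Bionic instance above, both helpers would yield 401408, which agrees with the observation that the offsets matched.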

Regardless of the outcome, I checked journalctl of the previous boot, and for the instance with -updates enabled, we see:

Jun 03 05:04:02 ip-172-31-26-1 systemd-logind[1108]: Suspend key pressed.
Jun 03 05:04:02 ip-172-31-26-1 systemd-logind[1108]: Requested operation not supported, ignoring.
Jun 03 05:04:02 ip-172-31-26-1 kernel: Adding 4095996k swap on /swap-hibinit. Priority:-3 extents:...


tags: added: verification-done-bionic
removed: verification-needed verification-needed-bionic
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for ec2-hibinit-agent has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu11.22.04.1

---------------
ec2-hibinit-agent (1.0.0-0ubuntu11.22.04.1) jammy; urgency=medium

  * Swapon with maximum priority right before hibernation. This resolves
    swapfile priority issues with additional or multiple swapfiles enabled.
    (LP: #1968805)
    - d/p/lp1968805-Swapon-with-maximum-priority-before-hibernation.patch

 -- Matthew Ruffell <email address hidden> Wed, 13 Apr 2022 16:19:30 +1200
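
As an illustration of the changelog entry above (the patch itself is not reproduced in this report, so this is an assumption about its effect rather than its contents): util-linux swapon accepts a --priority option with a maximum value of 32767, so enabling /swap-hibinit that way places it above any swapfile that received an automatic negative priority. The helper below, with the hypothetical name highest_priority_swap, is a sketch for checking which swap area would be preferred, based on the "swapon --show" output format shown in the bug description.

```shell
# Assumed form of the fixed call right before hibernation (not copied from the patch):
#   swapon --priority 32767 /swap-hibinit

# Hypothetical helper: read `swapon --show` output on stdin and print the
# NAME of the swap area with the highest PRIO (the last column).
highest_priority_swap() {
    awk 'NR > 1 {
             prio = $NF + 0
             if (best == "" || prio > max) { max = prio; best = $1 }
         }
         END { print best }'
}
```

Piping the output of swapon --show into highest_priority_swap right after the swapon call should print /swap-hibinit if the priority change took effect.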

Changed in ec2-hibinit-agent (Ubuntu Jammy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu11.21.10.1

---------------
ec2-hibinit-agent (1.0.0-0ubuntu11.21.10.1) impish; urgency=medium

  * Swapon with maximum priority right before hibernation. This resolves
    swapfile priority issues with additional or multiple swapfiles enabled.
    (LP: #1968805)
    - d/p/lp1968805-Swapon-with-maximum-priority-before-hibernation.patch

 -- Matthew Ruffell <email address hidden> Wed, 13 Apr 2022 16:09:38 +1200

Changed in ec2-hibinit-agent (Ubuntu Impish):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu9.2

---------------
ec2-hibinit-agent (1.0.0-0ubuntu9.2) focal; urgency=medium

  * Swapon with maximum priority right before hibernation. This resolves
    swapfile priority issues with additional or multiple swapfiles enabled.
    (LP: #1968805)
    - d/p/lp1968805-Swapon-with-maximum-priority-before-hibernation.patch

 -- Matthew Ruffell <email address hidden> Wed, 13 Apr 2022 16:00:11 +1200

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu4~18.04.6

---------------
ec2-hibinit-agent (1.0.0-0ubuntu4~18.04.6) bionic; urgency=medium

  * Swapon with maximum priority right before hibernation. This resolves
    swapfile priority issues with additional or multiple swapfiles enabled.
    (LP: #1968805)
    - d/p/lp1968805-Swapon-with-maximum-priority-before-hibernation.patch

 -- Matthew Ruffell <email address hidden> Thu, 21 Apr 2022 17:03:54 +1200

Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: Fix Committed → Fix Released
OpenStack (andy1723) wrote :

Hi all,

Appreciate everybody working on this issue.

Unfortunately, I am still experiencing an issue on AWS EC2 Ubuntu 20.04:

099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220419

Linux 5.13.0-1025-aws #27~20.04.1-Ubuntu SMP Thu May 19 15:17:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

dpkg -l | grep hiber
ii ec2-hibinit-agent 1.0.0-0ubuntu9.2 all Amazon EC2 hibernation agent
ii hibagent 1.0.1-0ubuntu1 all Agent that triggers hibernation on EC2 instances

The system creates 2 swap files consuming almost all space:
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 12G 10G 1.7G 87% /

ls -lah /
-rw------- 1 root root 4.0G Jun 6 22:29 swap
-rw------- 1 root root 4.0G Jun 6 22:29 swap-hibinit

On a hibernation attempt, the system goes to Stop after a wait of around 20 minutes.

Thank you

Matthew Ruffell (mruffell) wrote :

Hi andy1723,

What instance type are you using? A Xen-based type like t2, or similar? Hibernate seems to work fine on KVM-based instances, e.g. c5, m5, and in my testing I also experienced issues with Xen-based instance types. I am looking into those issues in a separate bug.

It is very interesting that you have both /swap and /swap-hibinit.

/swap-hibinit is from the ec2-hibinit-agent package, while /swap is from the older hibagent package. I am a little confused as to why both packages are seeded to AWS AMIs since they are both doing effectively the same thing, but I will ask the CPC team about it and let you know the result.

Is there anything special you did to trigger hibagent to make /swap? It seems to ship an old sysvinit script, /etc/init.d/hibagent, which sets up the swapfile and configures grub. This shouldn't be running by default on a modern 20.04 system anymore.

I will keep looking into the hibernation issue on Xen instance types and will let you know.

Thanks,
Matthew
