Unexpected memory allocation failure in hibernation snapshot (swsusp)

Bug #1724265 reported by Dirk F
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

In kernel 4.4.0-97, the system snapshot can fail during suspend-to-disk or -to-both even though the routine swsusp_save() believes that there is enough memory for the snapshot.

What I expected to happen
=========================

After 'systemctl hibernate' or 'systemctl hybrid-sleep', the requested suspend state is reached, or a diagnostic is provided.

What happened instead
=====================

After 'systemctl hibernate' or 'systemctl hybrid-sleep', the system returns to the normal desktop with no visible diagnostic.

System log/journal shows diagnostics and backtrace, as attached:

...
Oct 17 10:51:45 Spiridion kernel: PM: Creating hibernation image:
Oct 17 10:51:45 Spiridion kernel: PM: Need to copy 174616 pages
Oct 17 10:51:45 Spiridion kernel: PM: Normal pages needed: 112567 + 1024, available pages: 115651
Oct 17 10:51:45 Spiridion kernel: s2both: page allocation failure: order:0, mode:0x2080120
...
Oct 17 10:51:45 Spiridion kernel: [<c13ae82f>] dump_stack+0x58/0x79
Oct 17 10:51:45 Spiridion kernel: [<c11783e6>] warn_alloc_failed+0xd6/0x110
Oct 17 10:51:45 Spiridion kernel: [<c117a994>] __alloc_pages_slowpath.constprop.104+0x6c4/0x970
Oct 17 10:51:45 Spiridion kernel: [<c117ae66>] ? __alloc_pages_nodemask+0x226/0x280
Oct 17 10:51:45 Spiridion kernel: [<c117ae66>] __alloc_pages_nodemask+0x226/0x280
Oct 17 10:51:45 Spiridion kernel: [<c10c000f>] alloc_image_page+0x1f/0x40
Oct 17 10:51:45 Spiridion kernel: [<c10c1678>] swsusp_save+0x148/0x4b0
...

And the concluding entries after the memory debugging output:

Oct 17 10:51:45 Spiridion kernel: PM: Memory allocation failed
Oct 17 10:51:45 Spiridion kernel: PM: Error -12 creating hibernation image
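For readers unfamiliar with the numeric code: the kernel conventionally reports failures as negated errno values, so "Error -12" can be looked up in the errno table. A minimal sketch of that lookup, assuming it is run on a Linux system (where errno 12 is ENOMEM); the variable names are only illustrative:

```python
import errno
import os

# Value taken from "PM: Error -12 creating hibernation image",
# with the sign dropped to obtain the errno number.
code = 12

# errno.errorcode maps errno numbers to their symbolic names.
name = errno.errorcode[code]

# os.strerror gives the platform's human-readable text for the code.
print(name, "-", os.strerror(code))
```

This agrees with the preceding "PM: Memory allocation failed" line: the snapshot was abandoned for lack of memory, not because of an I/O or device error.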

At this point the swsusp_save() function has successfully checked for free memory

 enough_free_mem(nr_pages, nr_highmem)

and then called

 swsusp_alloc(&orig_bm, &copy_bm, nr_pages, nr_highmem)

So either enough_free_mem() is not sufficiently accurate or conservative, or swsusp_alloc() is not sufficiently aggressive in using the available memory.
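To show how narrow the margin was when the pre-check passed, here is a small sketch that pulls the three figures out of the "Normal pages needed" log line above and computes the spare pages. It is not the kernel code: it assumes, from the shape of the log message, that the check is a strict comparison of the available pages against the sum of the two "needed" figures. The function name and regex group names are made up for illustration.

```python
import re

# Log line copied from the journal excerpt in this report.
LOG_LINE = "PM: Normal pages needed: 112567 + 1024, available pages: 115651"

# Regex for that message; the group names are hypothetical labels.
PATTERN = re.compile(
    r"Normal pages needed: (?P<needed>\d+) \+ (?P<reserve>\d+), "
    r"available pages: (?P<available>\d+)"
)


def snapshot_margin(line):
    """Return (check_passes, spare_pages) for one 'Normal pages needed' line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a 'Normal pages needed' line")
    needed = int(match.group("needed"))
    reserve = int(match.group("reserve"))
    available = int(match.group("available"))
    required = needed + reserve
    # Assumption: the pre-check is 'available > required' (strict).
    return available > required, available - required


passes, spare = snapshot_margin(LOG_LINE)
print("check passes:", passes, "spare pages:", spare)
```

With the figures from this report the check passes with only about two thousand pages to spare, so any allocation made between the check and the copy, or any page the estimate did not account for, could plausibly be enough to make alloc_image_page() fail.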

In later versions (e.g. 4.13.7) this code appears to be unchanged, at least down to the call to alloc_image_page().

Both 'systemctl hibernate' and 'systemctl hybrid-sleep' have been seen to work as expected in the identical configuration when the system memory is more lightly loaded (e.g. "PM: Need to copy 103159 pages").
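For a sense of scale, the two "Need to copy" figures can be converted from pages to MiB. This is only a rough sketch and assumes the standard 4 KiB page size for this i386 kernel; the helper name is hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumption: standard 4 KiB pages on i386


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to mebibytes."""
    return pages * page_size / (1024 * 1024)


# Page counts taken from the "PM: Need to copy ... pages" log lines.
failing = pages_to_mib(174616)  # attempt that failed with -12
working = pages_to_mib(103159)  # attempt that succeeded

print(f"failing attempt: {failing:.0f} MiB, working attempt: {working:.0f} MiB")
```

Under that assumption the failing snapshot had to copy roughly 280 MiB more than the one that worked.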

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-generic 4.4.0.97.102
ProcVersionSignature: Ubuntu 4.4.0-97.120-generic 4.4.87
Uname: Linux 4.4.0-97-generic i686
ApportVersion: 2.20.1-0ubuntu2.10
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: df 2143 F.... pulseaudio
CurrentDesktop: LXDE
Date: Tue Oct 17 14:52:34 2017
HibernationDevice: #RESUME=UUID=0107bf90-3e69-4f56-9a92-3477dd28b31c
InstallationDate: Installed on 2017-02-12 (247 days ago)
InstallationMedia: LXLE 16.04 - Release i386
MachineType: Dell Inc. Inspiron 1520
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-97-generic root=UUID=b71d2286-3b2f-4856-a258-2f6bcf088c70 ro zswap.enabled=1 resume=UUID=b71d2286-3b2f-4856-a258-2f6bcf088c70 resume_offset=2363392 quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-97-generic N/A
 linux-backports-modules-4.4.0-97-generic N/A
 linux-firmware 1.157.12
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/11/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A09
dmi.board.name: 0KY767
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA09:bd07/11/2008:svnDellInc.:pnInspiron1520:pvr:rvnDellInc.:rn0KY767:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Inspiron 1520
dmi.sys.vendor: Dell Inc.

Dirk F (fieldhouse) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc5

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Dirk F (fieldhouse) wrote :

>Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Not as far as I know. I think this code has been the same for a few years, and as we know hibernate was explicitly disabled by default in Ubuntu because it couldn't be made to work reliably across a range of machines.

>Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.14 kernel[0].

I'll have a go with the linked PPA.

Dirk F (fieldhouse) wrote :

... except it's not a real PPA - wouldn't that have been a more natural approach? However I'll try nonetheless.

Dirk F (fieldhouse) wrote :

Testing with 4.14-rc5 I can enter hybrid-sleep with low memory usage but the system hangs after blanking the screen and displaying "Snapshotting system". In this case there are no log entries for the failure, ie the journal is truncated after initiating hybrid-sleep and then disconnecting the wifi (on NetworkManager receiving a suspend signal).

It seems plausible that this is the same failure mode but with different log entries because of timing/memory differences, so I am marking the issue kernel-bug-exists-upstream. At any rate *some* kernel-bug-exists-upstream.

Not relevant to this issue, I think:

- the default 16.04.3 apparmor settings disabled /sbin/dhclient and /usr/sbin/cups-browsed with kernel 4.14-rc5; I renamed /etc/apparmor.* and rebooted

- the system log showed repeated timeout crashes in inteldrm like that attached below.

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Dirk F (fieldhouse) wrote :

Per above comment, attached inteldrm crash log.

Yann Salmon (yannsalmon) wrote :

I am experiencing what I think is the same problem with kernel 4.17.0-996-generic.

Brad Figg (brad-figg)
tags: added: cscc