software_resume()->read_suspend_image() can fail at get_zeroed_page()

Bug #36944 reported by Aaron Whitehouse
This bug affects 2 people
Affects: linux-source-2.6.15 (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

I have just upgraded to Dapper Flight 5 (fully updated) and have had absolutely no success with hibernation. It either:
- immediately re-awakes; or
- hangs with the screen blank, leaving powering off as the only option.

Please let me know any information that you require to help solve this. Suspend works okay. I use a Dell Inspiron 510m.

This problem was first mentioned in Bug #3490

Changed in acpi-support:
assignee: nobody → mjg59
Paul Sladen (sladen) wrote :

How much RAM and swap space do you have?

  $ free -m

Does leaving the laptop for a *long* time (e.g. 15 minutes) allow it to eventually complete the hibernate?

Changed in acpi-support:
status: Unconfirmed → Needs Info
Aaron Whitehouse (aaron-whitehouse) wrote :

Hardware database ID: 835a2d0b2d075ce4051a553f1211537a
Link: http://hwdb.ubuntu.com/?xml=835a2d0b2d075ce4051a553f1211537a

I'll attach the output from free -m (to keep the formatting) and test leaving it for a long time. Obviously, leaving it for some time would not help on the occasions when it immediately re-awakes.

Aaron Whitehouse (aaron-whitehouse) wrote : Memory used/available

Here is the requested output.

Aaron Whitehouse (aaron-whitehouse) wrote : Re: Hibernate broken on Dell Inspiron 510m

Hi Paul,
I just gave it a go and, after 7 minutes, it powered off. On turning it on again it showed no signs of returning from hibernate; it was just a normal boot as if it had been shut down (except that the filesystem was 'unclean' and it had to fix errors).

Is there any other information from me that may be useful at this stage?

Paul Sladen (sladen) wrote :

Can you try waiting 7 minutes again, and check that both the kernel versions before and after are the same with:

  $ uname -r

If you've been following updates and downloaded a new kernel beforehand, then it will fail on restart because the version has changed.
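
One way to make the before/after comparison less error-prone is to record the release string before hibernating and compare it once the machine is back. This is only a minimal sketch: the helper name `kernel_unchanged` and the file path in the usage comment are hypothetical and not part of any Ubuntu tool.

```shell
# Hypothetical helper: succeed only if the recorded kernel release matches
# the current one. The second argument defaults to the running kernel.
kernel_unchanged() {
    recorded=$1
    current=${2:-$(uname -r)}
    [ "$recorded" = "$current" ]
}

# Usage (the path is illustrative):
#   uname -r > "$HOME/kernel-before-hibernate"        # before hibernating
#   kernel_unchanged "$(cat "$HOME/kernel-before-hibernate")" \
#       && echo "same kernel" || echo "kernel changed"
```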

BTW, the fact that it does power off is good news (in a sort of way! :-). It means that it does at least think that it's finishing.

The 'free -m' output also shows you have plenty of swap space (a shortage of swap could otherwise be the reason that it comes back immediately and fails to start hibernating). If you see that happen again, do:

  $ cat /proc/swaps

which will show you which swap partitions are active and in use.
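
To turn that listing into a single number, the Size and Used columns (both in KiB) can be summed. A rough sketch, assuming the usual five-column layout of /proc/swaps and swap file names without spaces; the function name `swap_free_kib` is made up for illustration.

```shell
# Sum the unused space (Size - Used, in KiB) over all active swap areas.
# Takes the path of a /proc/swaps-style file; defaults to /proc/swaps.
# ASSUMPTION: one header line, then "Filename Type Size Used Priority" rows.
swap_free_kib() {
    awk 'NR > 1 { free += $3 - $4 } END { print free + 0 }' "${1:-/proc/swaps}"
}

# Usage:
#   swap_free_kib              # reads /proc/swaps on the running system
```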

Aaron Whitehouse (aaron-whitehouse) wrote :

Okay.
1) $ uname -r [before] = 2.6.15-19-386
2) Tried hibernating
3) It immediately popped up again, so ran cat /proc/swaps. I will attach.
4) Tried hibernating again. It powered off in under a minute. At first it looked like a resume from hibernate (leaving GUI mode for a text screen that mentioned something about blocks) but then it went to a normal boot again with an unclean filesystem. Needless to say, none of the programs I had open when I hibernated were open when Ubuntu started up again.
5) $ uname -r [after] = 2.6.15-19-386

Let me know what more you want.

Aaron Whitehouse (aaron-whitehouse) wrote : cat_procs readout

This is the result of the command which you asked me to type.

Aaron Whitehouse (aaron-whitehouse) wrote : Re: Hibernate broken on Dell Inspiron 510m

More testing.
1) Another immediate re-awakening.
2) A 6 minute wait then power-off; when powered back on it resumed from hibernate as desired (this time I right-clicked gnome-power-manager and clicked hibernate instead of selecting it in the log-off menu).
3) uname -r = 2.6.15-19-386

Aaron Whitehouse (aaron-whitehouse) wrote :

More testing (again):
1) select hibernate in GPM, immediately re-awakes
2) select hibernate in GPM, power off in < 1 min and successful resume.
3) select hibernate in GPM, power off in < 1 min and successful resume.

I'm fairly certain that using GPM usually makes no difference. Shall I keep trying for a complete fail hibernating with GPM?

Paul Sladen (sladen) wrote :

When you get a failed hibernate, what's in:

  /var/log/syslog

(cut it from the time just before you request hibernate). The problem of why it sometimes spends 5 minutes on disk activity while apparently doing nothing is a bigger one.
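
Trimming the log to the relevant part can be done with a small awk filter. A minimal sketch: `log_from` is a hypothetical helper, and the timestamp in the usage comment is only a placeholder for whatever line marks the moment just before the hibernate request.

```shell
# Print a log file from the first line matching a pattern through to the end.
# $1 is the log file, $2 is an awk regular expression chosen by the caller.
log_from() {
    awk -v pat="$2" 'found || $0 ~ pat { found = 1; print }' "$1"
}

# Usage (timestamp and output name are illustrative):
#   log_from /var/log/syslog '^Mar 28 21:0' > hibernate-attempt.log
```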

Aaron Whitehouse (aaron-whitehouse) wrote : GPM_long-poweroff_successful

This is the syslog from a hibernate that took between 6 and 15 minutes (sorry, I had to leave the room). When I resumed (powered on) it recovered as if all was well.

Aaron Whitehouse (aaron-whitehouse) wrote : Purely successful hibernate with GPM

This hibernate worked fine. It took less than 1 min to power off and resumed properly when I powered it back up. For comparison.

Aaron Whitehouse (aaron-whitehouse) wrote : Immediate re-awaken with GPM

This is a syslog from when the machine immediately re-awakens after hibernate is selected. It never powers off.

Aaron Whitehouse (aaron-whitehouse) wrote : This is a purely successful hibernate from the log-out menu

This hibernate worked perfectly (same machine) by selecting hibernate from the log-out menu.

Aaron Whitehouse (aaron-whitehouse) wrote : Turns off instead of hibernates

I selected hibernate from the log-out menu and it seemed to be hibernating correctly, and powered off in less than a minute. Upon powering it back on, it was as if it were being turned on after a shutdown: it gave the errors of an improper shutdown and brought up the initial logon screen instead of resuming my session.

Aaron Whitehouse (aaron-whitehouse) wrote : An example of the machine immediately re-awakening when hibernate is selected from the log-out menu

For completeness, this is when the machine fails to hibernate at all (immediately re-awakes) when hibernate is selected from the log-out screen.

Paul Sladen (sladen) wrote : Re: Hibernate broken on Dell Inspiron 510m

This last one that comes back immediately is easy:

  swsusp: Need to copy 47008 pages
  swsusp: Not enough free memory

Matthew, I've forgotten, is that a shortage of swap, or a shortage of RAM to bounce pages around in? But we can ignore this one.
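
For scale, the page count in that message can be converted to a size. A rough sketch, assuming 4 KiB pages (the usual size on 32-bit x86, which is not stated in the log itself); the helper names are made up for illustration.

```shell
# Convert a page count to KiB and to whole MiB (rounded down).
# ASSUMPTION: 4 KiB pages; pass another page size in KiB as the second
# argument if that does not hold for the machine in question.
pages_to_kib() {
    echo $(( $1 * ${2:-4} ))
}
pages_to_mib() {
    echo $(( $1 * ${2:-4} / 1024 ))
}

# pages_to_kib 47008   -> 188032
# pages_to_mib 47008   -> 183
```

Under that assumption the 47008 pages amount to roughly 183 MiB that had to be copied.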

However, the bigger problem is the one that *fails* on unhibernate and leads to a kernel backtrace:

swsusp: Reading pagedir (125 pages)
busybox: page allocation failure. order:0, mode:0x8020
 [__alloc_pages+494/736] __alloc_pages+0x1ee/0x2e0
 [get_zeroed_page+34/80] get_zeroed_page+0x22/0x50
 [alloc_data_pages+55/208] alloc_data_pages+0x37/0xd0
 [read_suspend_image+155/192] read_suspend_image+0x9b/0xc0
 [swsusp_read+21/64] swsusp_read+0x15/0x40
 [software_resume+119/224] software_resume+0x77/0xe0
 [resume_store+161/164] resume_store+0xa1/0xa4
 [flush_write_buffer+39/48] flush_write_buffer+0x27/0x30
 [sysfs_write_file+63/96] sysfs_write_file+0x3f/0x60
 [vfs_write+150/336] vfs_write+0x96/0x150
 [sys_write+56/128] sys_write+0x38/0x80
 [syscall_call+7/11] syscall_call+0x7/0xb

Is this race to do with processes (eg. the busybox that initiated the unhibernate) continuing to run and no longer being able to access resources because they have been, or are in the process of being, overwritten?
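
The "order:0, mode:0x8020" part of the failure line can be unpicked a little: order 0 is a request for a single page, and the mode is a bitmask of allocation flags. The sketch below splits the mask into its set bits. The flag names are an assumption, recalled from the 2.6.15-era include/linux/gfp.h, and should be checked against the actual kernel tree; `decode_gfp_mode` is a hypothetical helper.

```shell
# Print the individual bits set in an allocation mode value, one per line.
# ASSUMPTION: flag names as recalled from 2.6.15-era include/linux/gfp.h;
# any bit not listed here is reported as "unknown".
decode_gfp_mode() {
    mode=$(( $1 ))
    bit=1
    while [ "$bit" -le "$mode" ]; do
        if [ $(( mode & bit )) -ne 0 ]; then
            case $bit in
                16)    name=__GFP_WAIT ;;
                32)    name=__GFP_HIGH ;;
                64)    name=__GFP_IO ;;
                128)   name=__GFP_FS ;;
                32768) name=__GFP_ZERO ;;
                *)     name=unknown ;;
            esac
            printf '0x%x %s\n' "$bit" "$name"
        fi
        bit=$(( bit * 2 ))
    done
}

# decode_gfp_mode 0x8020 prints:
#   0x20 __GFP_HIGH
#   0x8000 __GFP_ZERO
```

If those names are right, the request did not carry __GFP_WAIT, which would suggest an allocation that could not wait for memory to be reclaimed. That is a reading of the bits under the stated assumption, not something the log confirms.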

Changed in linux-source-2.6.15:
status: Needs Info → Confirmed
Aaron Whitehouse (aaron-whitehouse) wrote :

Hello Paul, thanks for your work on all my bugs!

> Matthew, I've forgotten, is that a shortage of swap, or a shortage of RAM
> to bounce pages around in? But we can ignore this one.

By ignore, do you mean fix once the other problem is dealt with ;)? I had nothing running and have a moderate amount of memory on here...

Let me know if there is anything further that you need from me :).

Paul Sladen (sladen) wrote :

Hi Pavel,

I have the following backtrace recorded in an Ubuntu bug-report:

  https://launchpad.net/bugs/36944

  swsusp: Reading pagedir (125 pages)
  busybox: page allocation failure. order:0, mode:0x8020
   [__alloc_pages+494/736] __alloc_pages+0x1ee/0x2e0
   [get_zeroed_page+34/80] get_zeroed_page+0x22/0x50
   [alloc_data_pages+55/208] alloc_data_pages+0x37/0xd0
   [read_suspend_image+155/192] read_suspend_image+0x9b/0xc0
   [swsusp_read+21/64] swsusp_read+0x15/0x40
   [software_resume+119/224] software_resume+0x77/0xe0
   [resume_store+161/164] resume_store+0xa1/0xa4
   [flush_write_buffer+39/48] flush_write_buffer+0x27/0x30
   [sysfs_write_file+63/96] sysfs_write_file+0x3f/0x60
   [vfs_write+150/336] vfs_write+0x96/0x150
   [sys_write+56/128] sys_write+0x38/0x80
   [syscall_call+7/11] syscall_call+0x7/0xb

This is using a resume initiated from an initramfs (that's what the busybox
process is) with:

  if [ -e /sys/power/resume ]; then
      major=$((0x$(stat -c%t ${resume})))
      minor=$((0x$(stat -c%T ${resume})))
      echo ${major}:${minor} >/sys/power/resume
  fi

Any idea how swsusp can end up in this state? The effect of this is that
the syscall fails and bootup proceeds normally.
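
The conversion step in the snippet above relies on `stat -c%t` and `stat -c%T` printing the major and minor device numbers in hex, which shell arithmetic then turns into decimal. A small sketch of just that step, with a hypothetical helper name and made-up example values:

```shell
# Convert hex major/minor strings (as printed by `stat -c %t` and
# `stat -c %T`) into the decimal "major:minor" form.
hex_to_devnum() {
    echo "$(( 0x$1 )):$(( 0x$2 ))"
}

# hex_to_devnum 3 5    -> 3:5
# hex_to_devnum fe 10  -> 254:16
```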

Many Thanks,

 -Paul
--
Britain is just cold, in a pesky way. Southampton, GB

Paul Sladen (sladen) wrote :

On Thu, 30 Mar 2006, Pavel Machek wrote:
> Not enough memory to do the resume. Try calling swsusp_shrink_memory()
> before doing resume.

Ah, I've gone back through my LKML archive and dug up the threads over the
last month. Was there any further progress beyond the 2006-03-18 patch by
Con Kolivas against -mm?

  http://www.uwsg.iu.edu/hypermail/linux/kernel/0603.2/0817.html

 -Paul
--
Britain is just cold, in a pesky way. Southampton, GB

none (ubuntu-bugs-nullinfinity-deactivatedaccount) wrote :

This is probably too late for Dapper, but please, is there any way the Ubuntu kernels can switch to using suspend2 in the future?

Recently I haven't had the time to build my own kernels, but back when I did, suspend2 was rock solid. Never a single failure to either suspend or resume. Also, the system comes back in an immediately usable state, unlike swsusp which brings on a big swap storm when it resumes. Or maybe I should say that it did so last time it worked.

Tormod Volden (tormodvolden) wrote :

I have the busybox problem*, but not the other issues mentioned in the original bug report, so I think there are several independent issues. One problem is that the busybox issue makes it difficult to debug the other hibernation issues**...

I see the busybox issue on two very different laptops that I use, so I would extrapolate that this is a major problem for a lot of people.

*duplicate bug #41375
**bug #38500, bug #41340, bug #23015

Tormod Volden (tormodvolden) wrote :

I haven't seen this for a while, in either Dapper or Edgy, so I would say we can close this bug now.

Aaron Whitehouse (aaron-whitehouse) wrote :

Hibernation is much, *much* better in Edgy. I haven't had this error and I have done a fair few hibernates. If it is still happening then I have been lucky enough not to have it happen to me :).

I agree that it can probably be closed.

Timothy Smith (tas50) wrote :

Appears to be fixed in Edgy

Changed in linux-source-2.6.15:
status: Confirmed → Fix Committed
Leann Ogasawara (leannogasawara) wrote :

Moving status from "Fix Committed" to "Fix Released" as this bug appears to have been resolved per the last sets of comments. Thanks.

Changed in linux-source-2.6.15:
status: Fix Committed → Fix Released