Resume from disk (swapfile) fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
initramfs-tools (Ubuntu) | Fix Released | Medium | Unassigned |
Bug Description
Binary package hint: initramfs-tools
Ubuntu Lucid 10.04 fully updated as of today (2 apr 2010) AMD64
initramfs-tools 0.92bubuntu71
I can hibernate my notebook but when I try to wake it up again it starts with a normal boot.
I would say it doesn't even *try* to resume.
I'm using a swapfile and I configured the kernel command line accordingly:
BOOT_IMAGE=
also my /etc/initramfs-
resume=
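For reference, a minimal sketch of how the two relevant parameters could be read back from a kernel command line. The helper name cmdline_value and the sample values in the usage note are hypothetical and not taken from this report; resume_offset= is the kernel parameter that carries the swap file's offset.

```sh
#!/bin/sh
# cmdline_value KEY [CMDLINE]
# Print the value of KEY=... from a kernel command line string.
# When CMDLINE is omitted, /proc/cmdline is read instead.
# Returns non-zero when KEY is not present.
cmdline_value() {
    key=$1
    line=${2:-$(cat /proc/cmdline)}
    for word in $line; do
        case "$word" in
            "$key"=*)
                # Strip the "KEY=" prefix and keep everything after it,
                # so "resume=UUID=abcd" yields "UUID=abcd".
                echo "${word#"$key"=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

With a made-up command line such as "root=UUID=1111 resume=UUID=1111 resume_offset=34816", calling cmdline_value resume_offset on it prints 34816.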
Nevertheless, when the PC boots I always get this in my dmesg, regardless of whether it was a real clean boot or a boot after a hibernation:
[ 0.778970] PM: Checking image partition UUID=f0e784ed-
[ 0.785267] PM: Resume from disk failed.
My notebook is a dell precision m6400 and hibernation used to work fine (with the same setup) in karmic.
I tried to narrow down this problem myself and came to this conclusion:
In /usr/share/
SWAPTYPE=
It is my understanding (but I might be wrong) that the function "wait-for-root" can only check the partition whose UUID is passed as ${resume}, but it has no idea about the offset.
So I think that it works only for swap partitions but not for swap files.
I would like to make hibernation/resume work on my laptop, so if I can be of any help testing things or if I can give any other info I'll be glad to do that.
Changed in initramfs-tools (Ubuntu): | |
importance: | Undecided → Medium |
status: | New → Triaged |
tags: | added: regression-potential |
Scott James Remnant (Canonical) (canonical-scott) wrote : | #1 |
Changed in initramfs-tools (Ubuntu): | |
status: | Triaged → Incomplete |
tags: | removed: regression-potential |
Scott James Remnant (Canonical) (canonical-scott) wrote : | #2 |
After booting, please attach your /etc/fstab and output of running "sudo blkid"
This should allow us to figure out what your resume= line *should* be
Scott James Remnant (Canonical) (canonical-scott) wrote : | #3 |
Oh, sorry, you say you're using a swap file. In that case, how do you know that the offset of that file doesn't change between reboots, or that the file is contiguous? We don't really support resume from swap files at the moment.
Assuming that it's correct to pass the UUID of the filesystem in - it's not wait-for-root that's failing here, but whatever actually loads the hibernation image
Moreno Bartalucci (moreno-bartalucci) wrote : | #4 |
Hi Scott,
my swapfile was created once and for all, so unless files are moving by themselves on the hard drive its offset should never change.
In any case, I checked it across reboots and it's always the same.
If I read a few sectors from the hard drive with 'dd' starting from that offset and then I run a 'file' command on the data I collected I get a swap signature, so I'm sure the offset is correct.
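A rough sketch of that kind of check, for anyone who wants to repeat it. It assumes that the signature occupies the last 10 bytes of a page, that the offset is counted in pages, and that the page size is 4096 unless stated otherwise. The function name read_swap_sig is made up for this example.

```sh
#!/bin/sh
# read_swap_sig DEVICE [OFFSET_PAGES] [PAGE_SIZE]
# Print the 10-byte signature stored at the end of the page that starts
# OFFSET_PAGES pages into DEVICE (a block device or a regular file).
# Assumption: the signature sits in the last 10 bytes of that page.
read_swap_sig() {
    dev=$1
    off_pages=${2:-0}
    page_size=${3:-4096}
    # Byte position of the signature: start of the page, plus one page,
    # minus the 10 signature bytes.
    byte_off=$(( off_pages * page_size + page_size - 10 ))
    # Signatures shorter than 10 bytes are NUL padded, so drop the NULs.
    dd if="$dev" bs=1 skip="$byte_off" count=10 2>/dev/null | tr -d '\0'
}
```

Run against the device given in resume= with the offset given in resume_offset=, this should print a swap signature on a clean boot and a suspend signature after hibernation, if the assumptions above hold.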
As I stated in my original bug report, resuming from swapfile used to work correctly on karmic: why do you say you don't support it? Do you mean: you don't support it anymore?
Somewhere, in one of the ubuntu documentation pages (sorry I don't have the link at hand), I read using swapfiles should be preferred instead of swap partitions: so, in my opinion, resuming from swapfiles *should* be supported.
At the end of the script '/usr/share/
SWAPTYPE=
case "${SWAPTYPE}" in
swsuspend|
if [ -x /bin/plymouth ] && plymouth --ping; then
fi
# hardcode path, uswsusp ships an resume binary too
if [ -n "${resume_offset}" ]; then
else
fi
;;
esac
So: 'wait-for-root' does *NOT* only wait for the partition to become active; it also detects *IF* any hibernation occurred and determines the kind of hibernation data.
Question: if 'wait-for-root' doesn't know about the offset of my swapfile, how can it know whether my PC was hibernated or not?
The 'resume' program, in my case and in my opinion, is *NOT* executed at all, so it can't be its fault.
Thank you for your help on this matter, I really appreciate it!
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 554009] Re: Resume from disk (swapfile) fails | #5 |
On Thu, 2010-04-22 at 06:28 +0000, Moreno Bartalucci wrote:
> As I stated in my original bug report, resuming from swapfile used to work correctly on karmic: why do you say you don't support it? Do you mean: you don't support it anymore?
>
No, I mean we've never supported this. We've never put any effort into
making it work, we don't test to make sure it works, and as far as we
know, it doesn't work.
We discussed doing the work in both Karmic and Lucid, but developers
haven't had the time.
If it worked for you before, that was more out of luck than intent ;-)
> so, in my opinion, resuming from swapfiles *should* be supported.
>
I don't disagree at all! We want to support swap files, and resuming
from them, we just haven't done the work yet.
> At the end of the script '/usr/share/
> premount/resume', you can find this:
>
> SWAPTYPE=
>
> case "${SWAPTYPE}" in
> swsuspend|
> if [ -x /bin/plymouth ] && plymouth --ping; then
> plymouth message --text="Resuming from $resume"
> fi
>
> # hardcode path, uswsusp ships an resume binary too
> if [ -n "${resume_offset}" ]; then
> /bin/resume ${resume} ${resume_offset} >/dev/null 2>&1
> else
> /bin/resume ${resume} >/dev/null 2>&1
> fi
> ;;
> esac
>
> So: 'wait-for-root' does *NOT* only wait for the partition to become active, it also detects *IF* any hybernation occurred and it decides the kind of hybernation data.
>
Actually it doesn't
wait-for-root simply returns the block device type as probed by udev, it
doesn't do any probing of its own. udev obviously doesn't know about
the resume offset.
> Question: if 'wait-for-root' doen't know about the offset of my swapfile, how can it know if my pc was hybernated or not?
>
As mentioned above, wait-for-root doesn't probe; it's simply a
replacement for a while loop done in shell to wait for device nodes to
appear.
In fact, in your case, it's almost certainly better than the previous
shell since it will return immediately and make resume faster (if the
other bit worked).
> The 'resume' program, in my case and in my opinion, is *NOT* executed at all, so it can't be its fault.
>
This bit confuses me; your bug reports includes the following messages
from dmesg:
[ 0.778970] PM: Checking image partition UUID=f0e784ed-
[ 0.785267] PM: Resume from disk failed.
Doesn't this come from the "resume" program?
> Thank you for your help on this matter, I really appreciate it!
>
So the problem with this script is that the script is only written to
assume resume-from-block device. It uses the udev probed block device
type to determine whether to run "resume" or not.
It should have a fallback for the case of swap-file (which I guess is
resume_offset is set?) that calls resume directly and ignores the
returned filesystem type.
Of course, this is still a hack - in order to support resume from swap
file properly, we need to not require resume_offset but auto-detect it;
there are runes for that somewhere on ...
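The fallback described above might be sketched like this. It is only an illustration of the decision, not the script's actual code: should_resume is a made-up name, and only the "swsuspend" type is listed because the rest of the pattern is cut off in the script excerpt quoted in this report.

```sh
#!/bin/sh
# should_resume SWAPTYPE RESUME_OFFSET
# Succeed (exit status 0) when a resume attempt looks worthwhile.
should_resume() {
    swaptype=$1
    resume_offset=$2
    case "$swaptype" in
        # The block device itself carries a suspend signature.
        # (Other suspend types from the real script are not listed here.)
        swsuspend)
            return 0
            ;;
    esac
    # Fallback: a resume_offset means a swap file is in use, so the
    # type probed for the whole block device says nothing about it.
    [ -n "$resume_offset" ]
}
```

The caller would then run /bin/resume with or without the offset, as the existing script already does, whenever should_resume succeeds.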
Moreno Bartalucci (moreno-bartalucci) wrote : | #6 |
- Patch to the original /usr/share/initramfs-tools/scripts/local-premount/resume file Edit (345 bytes, text/plain)
Hello,
so I changed my '/usr/share/
I noticed, however, something strange in my 'new' dmesg (please find it here attached).
In a normal boot (no hibernation before), beside the 'usual' lines:
[ 0.708670] PM: Checking image partition UUID=f0e784ed-
[ 0.714705] PM: Resume from disk failed.
now I have also:
[ 4.948849] PM: Starting manual resume from disk
[ 4.948852] PM: Resume from partition 8:3
[ 4.948853] PM: Checking hibernation image.
[ 4.967541] PM: Error -22 checking image file
[ 4.967544] PM: Resume from disk failed.
Which is normal: with my patch, the resume is always attempted.
But here comes my question: why is it attempted twice?
And the timing is also very different (0.7 vs. almost 5).
Who else is attempting to resume?
Am I looking in the wrong place?
Moreno Bartalucci (moreno-bartalucci) wrote : | #7 |
Scott James Remnant (Canonical) (canonical-scott) wrote : | #8 |
On Fri, 2010-04-23 at 07:24 +0000, Moreno Bartalucci wrote:
> so I changed my '/usr/share/
>
Right, this patch would appear to be trivially correct.
The other missing piece is automatically setting resume_offset based on
a given filename on hibernate, so it's used on the next reboot.
> I noticed, however, something strange in my 'new' dmesg (please find it here attached).
> In a normal boot (no hibernation before), beside the 'usual' lines:
> [ 0.708670] PM: Checking image partition UUID=f0e784ed-
> [ 0.714705] PM: Resume from disk failed.
> now I have also:
> [ 4.948849] PM: Starting manual resume from disk
> [ 4.948852] PM: Resume from partition 8:3
> [ 4.948853] PM: Checking hibernation image.
> [ 4.967541] PM: Error -22 checking image file
> [ 4.967544] PM: Resume from disk failed.
>
Kooky.
> But here comes my question: why is it attempted twice?
>
I've no idea, I don't really know much about how this stuff works.
Scott
--
Scott James Remnant
<email address hidden>
Moreno Bartalucci (moreno-bartalucci) wrote : | #9 |
Hi Scott,
I think the first couple of messages come from the kernel.
kernel/
It also seems to check for a hibernation image: it parses the resume partition as well as the resume offset from the kernel command line, but it seems to check only the partition.
It doesn't seem to use the resume offset for any purpose.
I could try to change hibernate.c myself to do some trials but here I miss the basics: what is the situation at the time hibernate.c runs? Is the root partition already mounted?
Anybody can give me some hints on where to look for this kind of information?
Thanks!
Chow Loong Jin (hyperair) wrote : | #10 |
Okay, so I've been taking a look into this as well in order to fix hibernation on my system which uses TuxOnIce. My issue seems to be related to the wait-for-root check returning "swap" rather than "tuxonice" despite there being a tuxonice signature on it. Digging further, this would appear to be blkid (from util-linux)'s fault, which I am now downloading debugging symbols to check.
Further investigation has shown me that initramfs-tools has worked previously because the resume was attempted unconditionally, but now it is bound by a userspace swap signature check (which fails miserably).
Now, it seems that wait-for-root does three things:
1. Wait for device
2. Check for signature
3. Return the swap type (swap|s1suspend
However, the signature checked in #2 is actually checked in fixed locations, i.e. the superblock of the swap. Since you use resume_offset, the signature is no longer in the said superblock of the file passed in resume=. For my case, the check against the TuxOnIce signature is failing for some mysterious reason or other, which I am yet to find out.
That basically defines the whole problem, as I understand it right now. From here on, I see two possible solutions to fix this:
1. Improve wait-for-root, udev, and blkid to be able to check for signatures even with resume_offset. This will not be easy, and will result in lots of code changes. TuxOnIce detection will also be needed, of course.
2. Bring back the previous behaviour, i.e. tell the splash that it's attempting to resume, attempt to resume, and if resume failed, update the splash and go along its merry way. This is the easiest path, and will definitely not result in false negatives.
For method #1, I think the *real* method to fix this would be to somehow hook up to the signature-detecting logic within the kernel, to avoid duplication of this code. The current solution duplicates the kernel checks, and badly, hence resulting in this bug.
For method #2, we'll have to worry about the UI, since the resume text will now appear on plymouth. Alternatively we could just remove it altogether, or like the patch Moreno attached, just try resuming anyway in the event that blkid reports a negative, so that we can catch false negative cases.
If possible, I'd like this as an SRU.
Moreno Bartalucci (moreno-bartalucci) wrote : | #12 |
I checked hibernate.c but I think it leads nowhere.
It doesn't even find my root partition, that means (as far as I can understand) that it is executed far too early to succeed.
Coming back to the resume script in initramfstools, the piece of code which used to work for me in karmic is this:
magic=SWAPSPACE
if [ -n "${resume_offset}" ]; then
else
offset=4076
fi
str=$(dd if=$resume bs=1 skip=$offset count=19 2>/dev/null)
magic_found=
if [ "$str" = "$magic" ]; then
fi
Of course it takes care of only one type of suspend signature (S1SUSPEND), but it could be adapted to check the others too.
I believe we still need the wait-for-root's job to wait for the partitions to appear.
In case wait-for-root gives a negative result, or, alternatively, in any case a resume_offset is defined, this piece of code could be triggered to further check the swap signature.
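As an illustration of how such a further check could tell the cases apart, here is a small sketch that maps a signature string to a type name. The function name classify_sig is hypothetical. The sketch knows only the plain swap signatures (SWAPSPACE2, SWAP-SPACE) and S1SUSPEND; other suspend signatures would have to be added.

```sh
#!/bin/sh
# classify_sig SIGNATURE
# Map a signature read from the swap area to a short type name.
classify_sig() {
    case "$1" in
        S1SUSPEND)
            # Hibernation image written by the in-kernel suspend code.
            echo swsuspend
            ;;
        SWAPSPACE2|SWAP-SPACE)
            # Ordinary swap space, nothing to resume from.
            echo swap
            ;;
        *)
            echo unknown
            ;;
    esac
}
```

A resume attempt would then be made only when the result is swsuspend.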
When I have 5 minutes I'll try it myself and I'll post the resulting patch.
Changed in initramfs-tools (Ubuntu): | |
status: | Incomplete → Triaged |
Scott James Remnant (Canonical) (canonical-scott) wrote : | #14 |
On Fri, 2010-04-30 at 08:41 +0000, Chow Loong Jin wrote:
> That basically defines the whole problem, as I understand it right now. From here on, I see two possible solutions to fix this:
> 1. Improve wait-for-root, udev, and blkid to be able to check for signatures even with resume_offset. This will not be easy, and will result in lots of code changes. TuxOnIce detection will also be needed, of course.
>
You've gone barking up the wrong tree here, this is only supposed to be
the signature of the block device *itself* returned from the udevdb.
When resume_offset is set, a further probe would be warranted. There's
no reason to modify any of the blkid stack.
> 2. Bring back the previous behaviour, i.e. tell the splash that it's
> attempting to resume, attempt to resume, and if resume failed, update
> the splash and go along its merry way. This is the easiest path, and
> will definitely not result in false negatives.
>
This doesn't work for people who have swap on a different device that we
have to wait for. That's a great big false negative.
> If possible, I'd like like this as a SRU.
>
No.
We don't add features in SRU. If this worked for you in the past, it
wasn't by intent, it was by luck.
Scott
--
Scott James Remnant
<email address hidden>
Chow Loong Jin (hyperair) wrote : | #15 |
About that false negative you pointed out, I think you are getting the wrong idea. I never intended to remove the wait-for-root call. In my proposed solution, the wait-for-root call stays, and *AFTER* that, it unconditionally attempts a resume, without checking the output of wait-for-root. Hence, the resume script will wait for the swap device, and then attempt a resume no matter what wait-for-root says.
And about the SRU...
From the looks of it, resume_offset was a feature of past initramfs-tools releases. Dropping support for it, however accidental, is considered a regression. Adding back the resume_offset check from the old code, and perhaps a TOI signature check, can at worst result in a false positive, which is completely harmless: the kernel would just step over it, as it had been doing in Karmic, Jaunty, and every earlier Ubuntu that had this resume script, where every normal boot was a false positive attempt at resuming.
Scott James Remnant (Canonical) (canonical-scott) wrote : | #16 |
On Fri, 2010-04-30 at 19:45 +0000, Chow Loong Jin wrote:
> About that false negative you pointed out, I think you are getting the
> wrong idea. I never intended to remove the wait-for-root call. In my
> proposed solution, the wait-for-root call stays, and *AFTER* that, it
> unconditionally attempts a resume, without checking the output of wait-
> for-root. Hence, the resume script will wait for the swap device, and
> then attempt a resume no matter what wait-for-root says.
>
We don't want to unconditionally attempt a resume; we should only
attempt a resume if there was a signature.
Unconditionally attempting resumes can cause problems.
> And about the SRU...
> From the looks of it, resume_offset was a feature of the past
> initramfs-tools. Dropping support for this, however accidental, is
> considered a regression.
>
It was never a feature we supported - therefore it is not a regression.
Scott
--
Scott James Remnant
<email address hidden>
Chow Loong Jin (hyperair) wrote : | #17 |
On Saturday 01,May,2010 06:29 AM, Scott James Remnant wrote:
> On Fri, 2010-04-30 at 19:45 +0000, Chow Loong Jin wrote:
>
>> About that false negative you pointed out, I think you are getting the
>> wrong idea. I never intended to remove the wait-for-root call. In my
>> proposed solution, the wait-for-root call stays, and *AFTER* that, it
>> unconditionally attempts a resume, without checking the output of wait-
>> for-root. Hence, the resume script will wait for the swap device, and
>> then attempt a resume no matter what wait-for-root says.
>>
> We don't want to unconditionally attempt a resume; we should only
> attempt a resume if there was a signature.
>
> Unconditionally attempting resumes can cause problems.
Like what? I had already mentioned -- a false positive will be ignored by the
kernel, and on non-Ubuntu systems without initrds, which are pretty common, the
resume= kernel option is checked unconditionally.
Plus, we've never had problems arising from false positives in the past, so they
shouldn't be appearing now, when reverting to old behaviour.
>
>> And about the SRU...
>> From the looks of it, resume_offset was a feature of the past
>> initramfs-tools. Dropping support for this, however accidental, is
>> considered a regression.
>>
> It was never a feature we supported - therefore it is not a regression.
resume_offset was supported, and was *explicitly* checked for in the resume
script. I don't see how that becomes a feature that was never supported.
--
Kind regards,
Chow Loong Jin
Moreno Bartalucci (moreno-bartalucci) wrote : | #18 |
- Added a further check with blkid. Edit (626 bytes, text/plain)
I tried another way to check if there has been a hibernation or not.
I found the blkid utility useful (it is already included in the initramfs image).
I may pass a device name as well as an offset to blkid to check the contents of the device.
In turn, it prints various info, but for now I'm interested only in the 'type' info, which appears to be the same as the udev one, except that it can be checked at an arbitrary offset.
So, I made a new patch to be applied to /usr/share/
Please find it here attached.
I think I still need to call wait-for-root to wait for the device to appear, although I'm not completely sure about it.
I'd like to hear comments about this patch.
Chow Loong Jin (hyperair) wrote : | #19 |
On Sunday 02,May,2010 05:23 PM, Moreno Bartalucci wrote:
> I tried another way to check if there has been a hibernation or not.
> I found useful the blkid utility (which is included already in the initramfs image).
> I may pass a device name as well as an offset to blkid to check the contents of the device.
> In turn, it prints various info but now I'm interested only in the 'type' info, which appears to be the same of the udev one with the exception that it can check at an arbitrary offset.
>
> So, I made a new patch to be applied to /usr/share/
> tools/scripts/
>
> Please find it here attached.
>
> I think I still need to call wait-for-root to wait for the device to
> appear, although I'm not completely sure about it.
>
> I'd like to hear comments about this patch.
>
>
> ** Patch added: "Added a further check with blkid."
> http://
>
It looks like a good idea, but.. is there a /conf/page_
seen that file around before. Moreover, blkid appears to assume the page size to
be 4096, so if it's anything other than 4096, it'd fail to detect the swsusp
signature anyway (swsusp puts its signatures at the end of the first page).
Another thing: you don't need to keep track of the output of wait-for-root, so
it should be sufficient to just replace that line with something like:
wait-for-root "${resume}" ${RESUMEDELAY:-5} instead.
As for the offset_option calculation, you could just use something like...
offset_
value of "0" if resume_offset is not set.
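One way to get a "0" default is shell parameter expansion. The sketch below is not the line suggested above (that one is cut off in this report); it only shows the idea, under the assumption that resume_offset counts pages and that the probing tool wants a byte offset. The name byte_offset is made up.

```sh
#!/bin/sh
# byte_offset [RESUME_OFFSET] [PAGE_SIZE]
# Convert an offset given in pages to bytes.
# An unset or empty RESUME_OFFSET counts as 0; PAGE_SIZE defaults to 4096.
byte_offset() {
    echo $(( ${1:-0} * ${2:-4096} ))
}
```

For example, byte_offset "$resume_offset" yields 0 when resume_offset is not set, so the same code path works for swap partitions and swap files.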
--
Kind regards,
Chow Loong Jin
Moreno Bartalucci (moreno-bartalucci) wrote : | #20 |
- Small prog to check and print the type of a block device. Edit (2.5 KiB, text/plain)
Hello,
yes, PAGE_SIZE was one of my concerns too.
I just borrowed those few lines from karmic's resume script, assuming they were correct.
In the meantime I got blkid source code and I made a small program which does the same (check the device and prints the 'type' field) getting the page size from the system through the getpagesize() function.
For some reason, anyway, it works in my system but it doesn't work in the initramfs environment yet.
I need to investigate further on this.
I attach this small program so you can have a view on it.
I'll try to further reduce it and to check why it is not working in the initramfs.
Its usage is: probe devname offset
Offset can be provided or not (if not it assumes offset=0).
It does the calculation getting the page size using getpagesize().
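For a script that cannot call getpagesize() directly, a rough shell equivalent is sketched below. It assumes getconf is available, which may not be true inside an initramfs, and it falls back to 4096 otherwise. The function name page_size is hypothetical.

```sh
#!/bin/sh
# page_size
# Print the system page size in bytes, falling back to 4096 when
# getconf is missing or returns something that is not a number.
page_size() {
    ps=$(getconf PAGESIZE 2>/dev/null) || ps=
    case "$ps" in
        ''|*[!0-9]*)
            ps=4096
            ;;
    esac
    echo "$ps"
}
```

The result can replace a hardcoded 4096 wherever the offset of the signature is computed.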
Moreno Bartalucci (moreno-bartalucci) wrote : | #21 |
P.S. Thanks for the hint on the offset_option calculation!
Chow Loong Jin (hyperair) wrote : | #22 |
On Monday 03,May,2010 02:13 PM, Moreno Bartalucci wrote:
> Hello,
>
> yes, PAGE_SIZE was one of my concerns too.
> I just borrowed those few lines from karmic's resume script, assuming they were correct.
> In the meantime I got blkid source code and I made a small program which does the same (check the device and prints the 'type' field) getting the page size from the system through the getpagesize() function.
> For some reason, anyway, it works in my system but it doesn't work in the initramfs environment yet.
> I need to investigate further on this.
> I attach this small program so you can have a view on it.
> I'll try to further reduce it and to check why it is not working in the initramfs.
>
> It's usage is: probe devname offset
>
> Offset can be provided or not (if not it assumes offset=0).
> It does the calculation getting the page size using getpagesize().
>
> ** Attachment added: "Small prog to check and print the type of a block device."
> http://
>
It looks like a C file. Perhaps it's possible that you need something in
/usr/share/
all binaries are copied in, after all.
--
Kind regards,
Chow Loong Jin
Moreno Bartalucci (moreno-bartalucci) wrote : | #23 |
Since I don't know very well, yet, how things work, I just modified my mkinitramfs script in order to add that file too.
Then, I unpacked the new file created by mkinitramfs to make sure all the relevant things were there (my binary and its required libs).
Apparently they are all there (after all, it cannot require more/other libs than the original blkid, which works correctly).
It must be something else but, at the moment, I don't have a clue.
The applet starts because I changed it temporarily in order to print out the value returned by getpagesize(): it works (it prints 4096, in my case).
Any hint?
maximilian attems (maks-debian) wrote : | #24 |
Newer initramfs-tools in Maverick supports resume_offset
Changed in initramfs-tools (Ubuntu): | |
status: | Triaged → Fix Released |
Ryo Onodera (ryoqun) wrote : | #25 |
Well, my setup needed this patch (https:/
And it isn't applied to the package.
I'd like to help get this really fixed.
In short, my setup is swapfile inside the system filesystem (LUKS + LVM2 + ext4).
I roughly followed this tutorial to achieve that: https:/
Ryo Onodera (ryoqun) wrote : | #26 |
Oops, I pasted wrong URL for the patch...
This is the URL I intended to quote: https:/
Ryo Onodera (ryoqun) wrote : | #27 |
As this bug was closed with "fix-released", I'm re-opening as another bug report: https:/
tim474 (tim474) wrote : | #28 |
What is the result? How can I set up hibernation if I use a swap file on a LUKS partition?
tim474 (tim474) wrote : | #29 |
P.S. I use ubuntu 16.04.
wait-for-root simply waits for the block device to be available; that must be succeeding because you're seeing the resume fail:
[ 0.778970] PM: Checking image partition UUID=f0e784ed-e140-4d73-87a3-8ef047012b9f
[ 0.785267] PM: Resume from disk failed.
I noticed that your resume= and root= lines are identical. That's clearly wrong! The UUID for resume= should be that of your swap partition, not your root partition!