Comment 0 for bug 670096

Purko Balkanski (purko) wrote :

Binary package hint: initrd-tools

Greetings!

The Bug:
--------
When booting Ubuntu from a frugal install, or when booting Ubuntu through Grub2 directly from the Ubuntu LiveCD ISO-file, the booting process fails if there's a partition with Windows hibernated on it.

Reportedly, such a boot also fails if there's an encrypted partition that comes logically before the partition holding the Ubuntu files.

In short: If such a partition exists, which...
  ... is in the device order BEFORE the partition on which Ubuntu's "filesystem.squashfs" resides, ...
  ... AND that partition contains a supported file system, ...
  ... AND that partition is for some reason not able to be mounted, ...
that causes the booting process to panic and fail.

Distros affected:
-----------------
I encountered the problem with: ubuntu-10.10-desktop-i386.iso
The problem likely exists in all distros that use the same Casper startup scripts.
I confirmed the same problem with the downstream linuxmint-10-gnome-rc-i386.iso

The Problem:
------------
When the OS takes over from the boot loader, it needs to find its "filesystem.squashfs". It knows the path, because we've supplied it through the "iso-scan/filename=" boot option, but it doesn't know which device that path is on.

So what the init scripts currently do is start a big search loop through all partitions on all devices, mounting each one in turn and looking for the needed path in its file system.

So far so good. But the problem is, if that search encounters a partition which for some reason can't be mounted, instead of silently ignoring that partition and moving on, the init script starts trying to mount that same partition over and over again, for a long time, until it finally throws a tantrum, raises panic, and sets my computer on fire.

On my computer, Windows is usually hibernated on the first partition of the first disk. Unfortunately, that is the first partition Ubuntu would encounter in its attempted search through all the partitions. If I remove that hibernation file (which I don't like!), it allows Ubuntu to boot normally.

On closer look, it turns out the culprit is in the file called "lupin-helpers". No matter how I think about it, I find no good reason why it would raise panic from within the loop if a certain partition refuses to mount. What it really needs to do is just silently ignore that partition and move on through the rest of the loop.

Possible workaround...
----------------------
Since Windows installations are most often on the first partition, my first impulse was to simply reverse the device order in which the search loops.

i.e., in "lupin-helpers", change this line:
         for dev in $(subdevices "${sysblock}"); do
to:
         for dev in $(subdevices "${sysblock}" | tac -s' '); do

That would "fix" the problem for most setups. But that's not really fixing the bug, it's just making the bug less likely to ever manifest itself. A proper solution to the problem is the fix below.

The Fix:
--------
In file "lupin-helpers", replace this line:
         try_mount "$devname" "$mountpoint" "$mountoptions" || return 1
...with the following:
         mount -o $mountoptions $devname $mountpoint || true

I tested that by repackaging the initrd.lz, remastering the iso file, and booting directly from it.
It works flawlessly, as far as I can see.
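
To illustrate the intended behaviour in isolation, here is a minimal sketch of a search loop that skips any partition that refuses to mount. It is not the code from "lupin-helpers": the function name find_dev_with_path and the MOUNT_CMD variable are hypothetical, introduced only so the mount step can be swapped for a stub when experimenting.

```shell
#!/bin/sh
# Hypothetical sketch; names are illustrative, not taken from lupin-helpers.

# MOUNT_CMD can be pointed at a stub when experimenting without root.
MOUNT_CMD="${MOUNT_CMD:-mount}"

# find_dev_with_path WANTED MNT DEV...
# Prints the first device that mounts and contains WANTED.
# Returns 1 only after every device has been tried.
find_dev_with_path() {
    wanted="$1"; mnt="$2"; shift 2
    for devname in "$@"; do
        # A partition that refuses to mount is skipped, not treated as fatal.
        if ! $MOUNT_CMD -o ro "$devname" "$mnt" 2>/dev/null; then
            continue
        fi
        if [ -e "${mnt}${wanted}" ]; then
            echo "$devname"
            return 0
        fi
        # Mounted fine but the file is not there: release it and keep looking.
        umount "$mnt" 2>/dev/null || true
    done
    return 1
}
```

The only failure this function reports is "not found anywhere", which is decided after the loop has finished rather than inside it.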

Even Better:
------------
It was nice when Ubuntu implemented the "iso-scan/filename=" boot option. But that's only half the problem with finding a file. The knowledge about which device the file is on is also needed.

So, it would be really nice if the bootloader would tell the OS which partition was active at the time. Let's say, for example, that my Grub script would pass a "grub2-root=" parameter, like this:

  linux ${someplace}/casper/vmlinuz grub2-root=${root} iso-scan/filename=${isofile} ${other_boot_options}
  initrd ${whatever}/casper/initrd.lz

When the OS takes over, it would find in "/proc/cmdline" BOTH the device AND the path.
(In my example, the OS would see "... grub2-root=hd2,4 iso-scan/filename=/path/to/my/distro.iso ...")
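
As a rough sketch of how an init script could pick both values out of the command line, assuming the proposed parameter is spelled "grub2-root=" as in the menu entry above: the helper name get_bootparam is hypothetical, and the optional second argument exists only so the function can be tried on a sample string instead of the real /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper; not part of the existing casper scripts.

# get_bootparam NAME [CMDLINE]
# Prints the value of the first NAME=value word on the kernel command line.
# Reads /proc/cmdline unless a command line string is passed explicitly.
get_bootparam() {
    name="$1"
    cmdline="${2:-$(cat /proc/cmdline)}"
    # Unquoted on purpose so the line is split on whitespace
    # (values containing glob characters would need "set -f" first).
    for arg in $cmdline; do
        case "$arg" in
            "$name"=*)
                echo "${arg#*=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

With the example line above, get_bootparam grub2-root would yield the device part and get_bootparam iso-scan/filename the path part.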

That way it won't be necessary to start that whole messy search loop through all the partitions on all the block devices. The OS can go straight to mounting that partition (in this example, /dev/sdc4) and using the path on it. Cleaner and faster boot!
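
The mapping from "hd2,4" to /dev/sdc4 could be sketched roughly as follows. This rests on two loud assumptions: that the kernel enumerates disks as sda, sdb, ... in the same order in which Grub numbers them as hd0, hd1, ... (which is not guaranteed on every machine), and that at most 26 disks are involved. The function name grub2_to_linux_dev is hypothetical. Some Grub 2 versions report the partition with a label prefix such as "msdos4"; the sketch strips any such prefix.

```shell
#!/bin/sh
# Hypothetical sketch; assumes Grub's disk order matches the kernel's sdX order.

# grub2_to_linux_dev hdN,P
# Prints /dev/sdXP, where X is the (N+1)-th letter of the alphabet.
grub2_to_linux_dev() {
    spec="$1"
    case "$spec" in
        hd*,*) ;;
        *) return 1 ;;
    esac
    disk="${spec%%,*}"      # e.g. "hd2"
    disk="${disk#hd}"       # e.g. "2"
    part="${spec#*,}"       # e.g. "4" or "msdos4"
    # Drop any non-digit prefix in front of the partition number.
    part=$(printf '%s' "$part" | sed 's/^[^0-9]*//')
    # Both pieces must now be plain non-empty numbers.
    case "$disk" in ''|*[!0-9]*) return 1 ;; esac
    case "$part" in ''|*[!0-9]*) return 1 ;; esac
    [ "$disk" -lt 26 ] || return 1
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz' | cut -c "$((disk + 1))")
    echo "/dev/sd${letter}${part}"
}
```

If the resulting device does not exist or does not hold the expected path, that is simply a cue to fall back to the full search.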

Of course, if for some reason that path is not found on that device, THEN the init script could go through that same loop as before. But definitely without raising any panic from within the loop! And if it goes through the whole loop without finding the needed path, only then would it be the proper time to panic.
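
The order of operations described here (hinted device first, full search second, panic last) might look something like this. It is only a sketch under stated assumptions: try_dev, locate_iso_dev and MOUNT_CMD are hypothetical names, and the "PANIC" message merely stands in for whatever the real init script does when it gives up.

```shell
#!/bin/sh
# Hypothetical sketch; names are illustrative, not taken from lupin-helpers.

# MOUNT_CMD can be pointed at a stub when experimenting without root.
MOUNT_CMD="${MOUNT_CMD:-mount}"

# try_dev DEV WANTED MNT
# Succeeds only if DEV mounts on MNT and contains WANTED.
try_dev() {
    $MOUNT_CMD -o ro "$1" "$3" 2>/dev/null || return 1
    [ -e "$3$2" ] && return 0
    umount "$3" 2>/dev/null || true
    return 1
}

# locate_iso_dev HINT WANTED MNT DEV...
# Tries HINT first (if non-empty), then every DEV in order.
# Prints the device that holds WANTED; complains only after all have failed.
locate_iso_dev() {
    hint="$1"; wanted="$2"; mnt="$3"; shift 3
    if [ -n "$hint" ] && try_dev "$hint" "$wanted" "$mnt"; then
        echo "$hint"
        return 0
    fi
    for dev in "$@"; do
        # The hinted device has already been tried once.
        [ "$dev" = "$hint" ] && continue
        if try_dev "$dev" "$wanted" "$mnt"; then
            echo "$dev"
            return 0
        fi
    done
    echo "PANIC: $wanted not found on any device" >&2
    return 1
}
```

An empty hint simply degrades to the plain search, so a boot entry that passes no device parameter would behave as it does today, minus the premature panic.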

Yours,
Purko Balkanski
(ivan at y3klimousine dot com)

Note: I suggested the name "grub2-root=" to avoid possible confusion coming from the way legacy-grub names its partitions. (Legacy-grub would have named that same partition in the example "hd2,3".) For completeness, we could also have the OS recognize a "legacy-grub-root=" parameter.
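
Supporting the legacy form would mostly be a matter of shifting the partition number, since legacy-grub counts partitions from 0 and Grub2 counts them from 1. A small sketch, with the hypothetical function name legacy_to_grub2:

```shell
#!/bin/sh
# Hypothetical sketch; converts legacy-grub notation to Grub2 notation.

# legacy_to_grub2 hdN,P
# Prints hdN,(P+1); the disk number is left unchanged.
legacy_to_grub2() {
    case "$1" in
        hd[0-9]*,[0-9]*) ;;
        *) return 1 ;;
    esac
    disk="${1%%,*}"     # e.g. "hd2"
    part="${1#*,}"      # e.g. "3"
    # Reject anything after the comma that is not a plain number.
    case "$part" in *[!0-9]*) return 1 ;; esac
    echo "${disk},$((part + 1))"
}
```

After this conversion, a "legacy-grub-root=" value could be handled by the same code path as a "grub2-root=" value.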