linux-boot-prober yields wrong uuid for kernel root parameter

Bug #554307 reported by Matthias Müller-Reineke
This bug affects 21 people
Affects: os-prober (Ubuntu)
Status: Triaged
Importance: High
Assigned to: Unassigned

Bug Description

When os-prober detects another install on a system with a shared /boot, it sets up a menu to boot that install, but passes the kernel a root= argument directing it to mount the root partition of the install running os-prober instead of the partition on which the other system was found. Example:

menuentry '<name of first linux OS> (on /dev/sda2)' [...] {
        [...]
        set root='(hd0,msdos1)' # <- this is the boot partition, /dev/sda1
        search --no-floppy --fs-uuid --set=root <UUID of /dev/sda1>
        linux /vmlinuz-x.x.x root=UUID=<UUID of /dev/sda2> [...]
        initrd [....]
}
menuentry '<name of second linux OS> (on /dev/sda3)' [...] {
        [...]
        set root='(hd0,msdos1)' # <- this is the boot partition, /dev/sda1
        search --no-floppy --fs-uuid --set=root <UUID-of-/dev/sda1>
        linux /vmlinuz-y.y.y root=UUID=<UUID of /dev/sda2 (!!!)> [...]
        initrd [....]
}

Matthias Müller-Reineke (matthias-mueller-reineke) wrote :
Peng Deng (d6g) wrote :

I think this bug is still there with maverick.

I have fresh lucid and maverick installations sharing a boot partition. When I run update-grub in maverick, the script that runs linux-boot-prober apparently gets the UUID wrong for the lucid partition.

Changed in os-prober (Ubuntu):
status: New → Confirmed
Nicolas Krzywinski (nsk7even) wrote :

Does anyone have a clue what causes this bug?
I just copied my two systems to a new SSD and now I have this problem as well... but I think it worked well before!

It is easy to see even without comparing the UUIDs, as my old Mint installation on /dev/sda7 still has kernel 3.0.0-13, yet the prober lists 3.2.0 kernels for it:

nsk@sesta09:~$ sudo linux-boot-prober /dev/sda7
/dev/sda7:/dev/sda2:Ubuntu, mit Linux 3.2.0-38-generic:/boot/vmlinuz-3.2.0-38-generic:/boot/initrd.img-3.2.0-38-generic:root=UUID=8d314396-b909-4979-b8cb-848819698a14 ro quiet splash $vt_handoff
/dev/sda7:/dev/sda2:Ubuntu, mit Linux 3.2.0-38-generic (Wiederherstellungsmodus):/boot/vmlinuz-3.2.0-38-generic:/boot/initrd.img-3.2.0-38-generic:root=UUID=8d314396-b909-4979-b8cb-848819698a14 ro recovery nomodeset
/dev/sda7:/dev/sda2:Ubuntu, mit Linux 3.2.0-37-generic:/boot/vmlinuz-3.2.0-37-generic:/boot/initrd.img-3.2.0-37-generic:root=UUID=8d314396-b909-4979-b8cb-848819698a14 ro quiet splash $vt_handoff
/dev/sda7:/dev/sda2:Ubuntu, mit Linux 3.2.0-37-generic (Wiederherstellungsmodus):/boot/vmlinuz-3.2.0-37-generic:/boot/initrd.img-3.2.0-37-generic:root=UUID=8d314396-b909-4979-b8cb-848819698a14 ro recovery nomodeset
nsk@sesta09:~$ sudo blkid /dev/sda7
/dev/sda7: LABEL="s8_Mint" UUID="923674ed-1d30-4343-8d1c-4d66a4274790" TYPE="ext4"

Any hints?
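
The mismatch shown above can also be checked mechanically. Here is a minimal sketch in Python, assuming the six colon-separated fields visible in the output (probed partition, boot partition, label, kernel, initrd, kernel parameters). The function name `root_uuid_mismatches` is made up for illustration and is not part of os-prober.

```python
import re

def root_uuid_mismatches(prober_output, actual_uuid):
    """Return labels of entries whose root=UUID differs from actual_uuid.

    Assumes six colon-separated fields per line, as in the output above:
    probed partition, boot partition, label, kernel, initrd, parameters.
    Labels that themselves contain a colon are not handled by this sketch.
    """
    bad = []
    for line in prober_output.splitlines():
        fields = line.split(":", 5)  # keep any colons inside the parameters
        if len(fields) != 6:
            continue  # not a prober result line
        label, params = fields[2], fields[5]
        match = re.search(r"\broot=UUID=([0-9A-Fa-f-]+)", params)
        if match and match.group(1).lower() != actual_uuid.lower():
            bad.append(label)
    return bad

# First line of the linux-boot-prober output quoted above
sample = (
    "/dev/sda7:/dev/sda2:Ubuntu, mit Linux 3.2.0-38-generic:"
    "/boot/vmlinuz-3.2.0-38-generic:/boot/initrd.img-3.2.0-38-generic:"
    "root=UUID=8d314396-b909-4979-b8cb-848819698a14 ro quiet splash $vt_handoff"
)
# UUID that blkid reports for /dev/sda7
sda7_uuid = "923674ed-1d30-4343-8d1c-4d66a4274790"
print(root_uuid_mismatches(sample, sda7_uuid))
```

Any label it prints is an entry that would boot with a root filesystem other than the probed partition.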

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Phillip Susi (psusi) wrote :

Are you able to reproduce this on a currently supported release? What does the /etc/fstab on the partition in question say?

no longer affects: grub2 (Ubuntu)
Changed in os-prober (Ubuntu):
status: Confirmed → Incomplete
V[i]ctor (sadkov993) wrote :

Yes, the bug appears on 14.04 LTS release. The partition in question is not mounted via /etc/fstab.

To show what's going on, here is my blkid output with some comments of my own:

$ blkid /dev/sda*
/dev/sda1: UUID="66ECBF91ECBF5A4F" TYPE="ntfs"
/dev/sda10: UUID="04db7581-6256-408c-969b-27847b90f5e3" TYPE="ext4" # Ubuntu 14.04, my main OS, all the scripts I run, I run here
/dev/sda2: UUID="425ec35d-cc19-438a-aa30-ea3b0beee6ae" TYPE="ext4" # /boot partition for both Ubuntu systems
/dev/sda5: UUID="f728b87c-d0fa-4419-a55a-aee66dead3c4" TYPE="ext4" # old Ubuntu 10.04, I can't boot it because of the current bug (but I fixed grub.cfg later on)
/dev/sda6: UUID="b862754b-5a40-4412-87ac-eb7e846be4bb" TYPE="ext4" # my /home for both Ubuntus
/dev/sda7: UUID="e1eb1b01-8c1a-4237-85ec-e69255f2a73e" TYPE="ext4"
/dev/sda8: UUID="b30c59b6-fc31-4a06-b04b-a13d9f898c1c" TYPE="swap"
/dev/sda9: UUID="70EFAAAD68BA7261" TYPE="ntfs"

I also found that linux-boot-prober uses grub.cfg to get the data.

I ran it 3 times: without grub.cfg in /boot/grub/, with the grub.cfg generated by the update-grub script (which uses linux-boot-prober, and that is how this bug affects the grub2 package), and with a grub.cfg fixed by hand. You can see all 3 outputs in the attached file. (I have a partial Russian locale, so the Cyrillic с there stands for the word "with".)

You may also want to see my grub.cfg file before and after my fix. The old one wasn't saved, so I regenerated it by running update-grub several times (it's a huge mess there right now). Both the fixed one and the regenerated one are attached.

As far as I could find out, linux-boot-prober parses grub.cfg on the assumption that no other operating system uses the same /boot partition. It simply finds all Linux menuentries and prints them all out, with the root partitions from the corresponding menuentries.

So the actual bug is that linux-boot-prober prints out the wrong kernels from grub.cfg, and the update-grub script uses this incorrect data to generate a messed-up grub.cfg. The update-grub script itself has no real bug; it just doesn't double-check everything it is given, so I'll leave the grub2 bug as a duplicate.
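
To illustrate the behaviour described above, here is a rough Python sketch. It is not os-prober's actual code (os-prober is a set of shell scripts). The helper names `parse_entries` and `entries_for_root` and the kernel file names are hypothetical; the UUIDs are the ones for /dev/sda10 and /dev/sda5 from the blkid output. Collecting every menuentry from a shared grub.cfg returns the kernels of both installs, while filtering on the probed partition's UUID keeps only the entries that belong to it.

```python
import re

def parse_entries(grub_cfg):
    """Collect (title, kernel, root_uuid) for every menuentry with a linux line."""
    entries = []
    title = None
    for raw in grub_cfg.splitlines():
        line = raw.strip()
        m = re.match(r"menuentry\s+'([^']*)'", line)
        if m:
            title = m.group(1)
            continue
        if title is not None and re.match(r"linux(16|efi)?\s", line):
            parts = line.split()
            prefix = "root=UUID="
            root = next((p[len(prefix):] for p in parts if p.startswith(prefix)), None)
            entries.append((title, parts[1], root))
            title = None
    return entries

def entries_for_root(grub_cfg, root_uuid):
    """Keep only the entries whose kernel command line mounts root_uuid."""
    return [e for e in parse_entries(grub_cfg) if e[2] == root_uuid]

# Hypothetical shared grub.cfg; kernel names are placeholders
sample_cfg = """
menuentry 'Ubuntu 14.04' {
    linux /vmlinuz-new root=UUID=04db7581-6256-408c-969b-27847b90f5e3 ro
}
menuentry 'Ubuntu 10.04' {
    linux /vmlinuz-old root=UUID=f728b87c-d0fa-4419-a55a-aee66dead3c4 ro
}
"""
sda5_uuid = "f728b87c-d0fa-4419-a55a-aee66dead3c4"
print(len(parse_entries(sample_cfg)))           # every entry, regardless of install
print(entries_for_root(sample_cfg, sda5_uuid))  # only the entries for /dev/sda5
```

The filter only helps when the existing grub.cfg is already correct; once update-grub has written wrong UUIDs into it, the same filter would drop the entries it should keep.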

Rune Philosof (olberd)
Changed in os-prober (Ubuntu):
status: Incomplete → Confirmed
EvilSupahFly (seann-giffin) wrote :

Yesterday, I installed KALI beside my existing Ubuntu 15.04, and I'm still seeing this issue. When I ran 'update-grub' from Ubuntu, all menu options boot Ubuntu but with problems. When I ran 'update-grub' in KALI, everything changes to KALI but also with problems.

I have one common dedicated boot partition which loads the respective Kali or Ubuntu partitions with Kali being on a completely different HD:

$ blkid
/dev/sda1: LABEL="BOOT" UUID="883242e7-e8da-40d2-aab7-40a2f771aa6b" TYPE="ext4" PARTUUID="fa43222a-01"
/dev/sda3: UUID="5bb6dc47-4b57-4610-9e4d-6d4e604a171f" TYPE="swap" PARTUUID="fa43222a-03"
/dev/sda4: LABEL="UBUNTU" UUID="dcd42ae2-281e-4101-9d64-fb0301c6eb37" TYPE="ext4" PARTUUID="fa43222a-04"
/dev/sdb1: LABEL="KALI" UUID="c5f8b7c2-82b6-4c22-a69e-6b4954ee5d5f" TYPE="ext4" PARTUUID="d60d1859-01"
/dev/sdc1: LABEL="HOME" UUID="8d6dae46-274c-4401-89dd-26c536549ebd" TYPE="ext4" PARTUUID="e7e0802f-01"

To fix this, I had to manually change the UUIDs in /boot/grub to reflect Ubuntu and Kali userland respectively.

EvilSupahFly (seann-giffin) wrote :

I've updated to Ubuntu 15.10 and the bug still exists.

tags: added: trusty vivid wily
Changed in os-prober (Ubuntu):
importance: Undecided → High
My name (plmalternate) wrote :

Same thing with me in 14.04. This is a pretty basic thing to be broken for so long. Suggestions for workarounds would be nice, since it doesn't look like a fix is imminent.

Personally, I've tried 2 GRUB rescue utilities, one from within one of the partitions that will boot, and another from CD. Neither worked. I'm downloading 2 more to burn to optical discs, but I'm not hopeful.

So maybe:

-Edit grub.cfg by hand, or is there a better file to edit?
-Roll back to legacy, which was pretty darn stable and foolproof?
-Switch to LILO?
-Switch to some exotic loader (isn't there a "Burg"?)?

My name (plmalternate) wrote :

BTW, all my Linux systems are 1-partition-each, no separate ~s, or swaps or boot. Keeping it simple. They are on 2 drives, one external, one internal. Grub is on /dev/sda with config files on sda5, which is the one that boots. Some are on primary and some on logical partitions. Sda1 and sda2 are Win-7. I have maybe 7 'nix systems, most but not all of which are 14.04s. And this happened after I defragged the W7 and shrunk its main partition (sda2)(what a colossal PITA - MS makes that hard deliberately, the farstards), put in a new sda6, restored an fsarchiver backup of sda5 to it, reset it to a new UUID, and did update-grub. When it didn't work, I used all the native tools to purge grub* and reinstall. dpkg kept returning errors which I couldn't fix with any dpkg or apt-get commands, despite hours of trying. To my surprise Synaptic had no trouble fixing that. But no matter how many times I purge and reinstall, it is the same issue now.

One more work-around I just thought of, in light of some of the comments above, that may be worth trying:

-After purging grub* and manually rm'ing any grub-related files I can find on the system that installed grub to sda, which I already tried, go on and rm any grub-related files I can find on ALL partitions before reinstalling. Maybe?

EvilSupahFly (seann-giffin) wrote :

I had to edit by hand, remembering to re-edit after a kernel update as my
changes get wiped out by update-grub.


My name (plmalternate) wrote :

Thanks, EvilSupahFly. I can now add this:

Rm'ing every grub-related file on the other systems, reinstalling grub, and running update-grub was a REALLY BAD IDEA. Now when I try to boot /dev/sda6 (the system that is an fsarchiver-made clone of my main, /dev/sda5-based 14.04 system, where grub is supposed to live) all I get is BusyBox with a prompt something like "[initramfs]". So now my sda5 system is broken. But I can restore the fsa backup again easily enough.

It is possible, for all I know, that a modified approach like that MIGHT work if you knew exactly WHICH grub-related files to rm. If anyone thinks that's the case, I'd appreciate their expanding on the topic. It sounds to me as if, should something like that work at all, it would either be something I'd only have to do once or, failing that, something simple enough to script and run every time I run dist-upgrade or restore an fsarchiver backup to a partition it wasn't on originally.

When you say manual editing works for you, I assume you mean editing grub.cfg on the system that should be the one where grub was last installed via a regular installation procedure (as opposed to restoring a clone with fsarchiver or Clonezilla). As for update-grub rewriting it incorrectly, I could just mark it read-only or take the x perm off update-grub, or alias update-grub to something innocuous. As for grub.cfg getting outdated and booting old kernels (or crashing if they've been removed), I seem to remember there is some way to make an entry that in effect says "boot from the latest one there, whatever it is". If I had a grub.cfg like that, could I just disable os-prober and update-grub, and not have to edit until I changed my partition setup?

Would having a separate boot partition get me around all this and let grub work automatically?

If I have to manually edit grub.cfg every time a kernel is updated, or every time I install an OS or restore an fsa or clonezilla backup, would I be better off switching to a boot loader that was written with the INTENT that a boot configuration file would have to be manually maintained?

I've also tried the LILO approach since I posted. It does boot my main /dev/sda5 14.04 fine, but it didn't pick up on any of the other systems automatically. So I purged it with apt-get and reinstalled grub ("grub '2'") hoping I could find the magic spell to make grub-2 work automatically, but so far I haven't. So, if I DO have to hand edit constantly, maybe I should revert that.

At any rate, anyone reading this looking for workarounds will now know at least one thing that does NOT work. Anyone with thoughts on how to set up a low-maintenance boot loader, either with grub-2 or something else, that can cope with new OSs coming and going through normal installation and clone restoration, please share your ideas.

Meanwhile, I think I'm going to read up on boot partitions. Thanks for reading.

EvilSupahFly (seann-giffin) wrote :

Marking grub.cfg read only doesn't work - I tried that at first.

I make a backup copy of my grub.cfg before doing kernel updates, run the updates and update-grub as normal, then compare the changes to the backup, and manually re-edit where needed to ensure smooth booting - before I ever boot a new kernel, and so far, I haven't had any problems.
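
The compare step can be narrowed to the kernel lines, which is where the wrong root= shows up. Here is a small sketch using Python's difflib. The function name `changed_linux_lines`, the file labels and the kernel name in the sample are assumptions for illustration; the two UUIDs are the KALI and UBUNTU ones from my blkid output above.

```python
import difflib

KERNEL_COMMANDS = ("linux", "linux16", "linuxefi")

def changed_linux_lines(backup_text, new_text):
    """Return the kernel lines that were removed (-) or added (+) between two grub.cfg texts."""
    diff = difflib.unified_diff(
        backup_text.splitlines(), new_text.splitlines(),
        fromfile="grub.cfg.bak", tofile="grub.cfg", lineterm="", n=0,
    )
    changes = []
    for line in diff:
        if line.startswith(("---", "+++", "@@")):
            continue  # file headers and hunk markers
        words = line[1:].split()
        if words and words[0] in KERNEL_COMMANDS:
            changes.append(line[0] + " " + " ".join(words))
    return changes

# Hand-edited backup: the Kali entry mounts the KALI partition
backup = """menuentry 'Kali' {
    linux /vmlinuz-kali root=UUID=c5f8b7c2-82b6-4c22-a69e-6b4954ee5d5f ro
}"""
# After update-grub: the same entry now points at the UBUNTU partition
regenerated = """menuentry 'Kali' {
    linux /vmlinuz-kali root=UUID=dcd42ae2-281e-4101-9d64-fb0301c6eb37 ro
}"""
for change in changed_linux_lines(backup, regenerated):
    print(change)
```

Each "-" line is what the hand-edited backup had and each "+" line is what update-grub wrote, so the pairs show which root= values need to be put back.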

Part of my final solution was to create a dedicated boot partition, install grub2 to it, let update-grub write its changes, run blkid to get the correct UUIDs, then manually edit grub.cfg to replace the incorrectly referenced UUIDs with the ones they should have been.

Here's my grub.cfg (in part), starting at about line 149, which I modified using the above approach, and which allows me to boot between Ubuntu and Kali, complete with the "safe mode" option for each:

set linux_gfx_mode=keep
export linux_gfx_mode
menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-883242e7-e8da-40d2-aab7-40a2f771aa6b' {
    recordfail
    savedefault
    load_video
    gfxmode $linux_gfx_mode
    insmod gzio
    if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
    insmod part_msdos
    insmod ext2
    set root='hd0,msdos4'
    if [ x$feature_platform_search_hint = xy ]; then
      search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos4 --hint-efi=hd0,msdos4 --hint-baremetal=ahci0,msdos4 dcd42ae2-281e-4101-9d64-fb0301c6eb37
    else
      search --no-floppy --fs-uuid --set=root dcd42ae2-281e-4101-9d64-fb0301c6eb37
    fi
    linux /vmlinuz-4.4.1-040401-generic root=UUID=883242e7-e8da-40d2-aab7-40a2f771aa6b ro crashkernel=384M-:128M
    initrd /initrd.img-4.4.1-040401-generic
}
submenu 'Advanced options for Ubuntu' $menuentry_id_option 'gnulinux-advanced-883242e7-e8da-40d2-aab7-40a2f771aa6b' {
    menuentry 'Ubuntu, with Linux 4.4.1-040401-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.1-040401-generic-advanced-883242e7-e8da-40d2-aab7-40a2f771aa6b' {
        recordfail
        savedefault
        load_video
        gfxmode $linux_gfx_mode
        insmod gzio
        if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
        insmod part_msdos
        insmod ext2
        set root='hd0,msdos4'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos4 --hint-efi=hd0,msdos4 --hint-baremetal=ahci0,msdos4 dcd42ae2-281e-4101-9d64-fb0301c6eb37
        else
          search --no-floppy --fs-uuid --set=root dcd42ae2-281e-4101-9d64-fb0301c6eb37
        fi
        echo 'Loading Linux 4.4.1-040401-generic ...'
        linux /vmlinuz-4.4.1-040401-generic root=UUID=883242e7-e8da-40d2-aab7-40a2f771aa6b ro crashkernel=384M-:128M
        echo 'Loading initial ramdisk ...'
        initrd /initrd.img-4.4.1-040401-generic
    }
    menuentry 'Ubuntu, with Linux 4.4.1-040401-generic (upstart)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.1-040401-generic-init-upstart-883242e7-e8da-40d2-aab7-40a2f771aa6b' {
        recordfai...


TedM (btmcpher) wrote :

I installed 17.10 on my SSD and have in total 4 Ubuntu partitions (2 on hard drive and 2 on SSD). With the install, as best I can recall, everything was fine and I could boot both partitions on the SSD. I then updated Grub using Grub Customizer. The menu still shows 4 options for boot, but the two on the SSD have the same UUID so they start the same partition.

Phillip Susi (psusi) wrote :

os-prober cannot support multiple installs with a shared /boot. How is it supposed to know which files belong to which install? To make this work you just have to manually configure grub.cfg.

Changed in os-prober (Ubuntu):
status: Confirmed → Won't Fix
Rüdiger Kupper (ruediger.kupper) wrote :

I agree with you that os-prober cannot automatically tell which kernel files belong to which installation if several Linux installs share the same /boot.

(1) Still, its behaviour could easily be improved: if os-prober detects two Linux installs with a shared /boot, it could simply ask the user which install should boot which kernel. Instead it fails silently with no warning, and the user ends up with a grub.cfg that fails to boot one of the installs.

(2) The resulting grub.cfg is definitely inconsistent, even if os-prober cannot decide which kernel belongs to which install. To illustrate this, kindly have a look at my original bug report (which I have later marked a duplicate of this): Bug #554307
In the description of Bug #554307 you see the grub.cfg the user gets when running update-grub on two installs sharing /boot. Yes, os-prober cannot determine which install should boot which kernel. But this is not the problem I report there. The resulting grub.cfg looks like this:

[...]
menuentry '<name of first linux OS> (on /dev/sda2)' [...] {
        [...]
        set root='(hd0,msdos1)' # <- this is the boot partition, /dev/sda1
        search --no-floppy --fs-uuid --set=root <UUID of /dev/sda1>
        linux /vmlinuz-x.x.x root=UUID=<UUID of /dev/sda2> [...]
        initrd [....]
}
menuentry '<name of second linux OS> (on /dev/sda3)' [...] {
        [...]
        set root='(hd0,msdos1)' # <- this is the boot partition, /dev/sda1
        search --no-floppy --fs-uuid --set=root <UUID-of-/dev/sda1>
        linux /vmlinuz-y.y.y root=UUID=<UUID of /dev/sda2 (!!!)> [...]
        initrd [....]
}
[...]

Look at the *second* menuentry: its *name* correctly refers to the *second* install (on /dev/sda3). Still, its *linux call* refers to the *first* install on /dev/sda2. This is clearly inconsistent and a bug.
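
This kind of inconsistency can be detected without knowing which kernel belongs to which install. Here is a minimal Python sketch, assuming entry titles carry the "(on /dev/...)" suffix as in the example and that a device-to-UUID mapping is available (for instance from blkid). The function name `inconsistent_entries` and the short UUID strings are placeholders, not real values.

```python
import re

TITLE = re.compile(r"menuentry\s+'[^']*\(on (/dev/[^)\s]+)\)'")
ROOT = re.compile(r"\broot=UUID=(\S+)")

def inconsistent_entries(grub_cfg, uuid_by_device):
    """Return (device, found_uuid, expected_uuid) for entries whose
    root=UUID does not match the device named in the entry title."""
    problems = []
    device = None
    for raw in grub_cfg.splitlines():
        line = raw.strip()
        title = TITLE.match(line)
        if title:
            device = title.group(1)
        elif device and line.startswith("linux"):
            root = ROOT.search(line)
            expected = uuid_by_device.get(device)
            if root and expected and root.group(1) != expected:
                problems.append((device, root.group(1), expected))
            device = None
    return problems

# Placeholder UUIDs standing in for the real ones
uuids = {"/dev/sda2": "uuid-sda2", "/dev/sda3": "uuid-sda3"}
cfg = """
menuentry 'First OS (on /dev/sda2)' --class gnu-linux {
        linux /vmlinuz-x.x.x root=UUID=uuid-sda2 ro
}
menuentry 'Second OS (on /dev/sda3)' --class gnu-linux {
        linux /vmlinuz-y.y.y root=UUID=uuid-sda2 ro
}
"""
print(inconsistent_entries(cfg, uuids))
```

Only the second entry is reported, because its title names /dev/sda3 while its kernel line mounts the UUID of /dev/sda2.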

My original report in #554307 was about this bug. If you feel that this differs from the issues reported here, I will remove the duplicate marking from my report and reopen it.

Phillip Susi (psusi) wrote :

I see now. So os-prober detects an install on another partition, decides to try and boot it with some kernel, but passes the wrong root argument. That does seem wrong.

Changed in os-prober (Ubuntu):
status: Won't Fix → Triaged
description: updated
Rüdiger Kupper (ruediger.kupper) wrote :

Hi Phillip,
glad to hear that I'm not fully mistaken ;-). It may not be a very frequent use case having different Linux installs in parallel, but solving this will really improve things for those users.
Thank you!
