LVM setup fails to install grub on virtio storage

Bug #1838525 reported by Paride Legovini on 2019-07-31
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
debian-installer (Debian) | New | Unknown | |
debian-installer (Ubuntu) | | Undecided | Unassigned |
grub2 (Ubuntu) | | Critical | Rafael David Tinoco |
lvm2 (Ubuntu) | | Undecided | Unassigned |

Bug Description

[Impact]

 * Any Eoan installation that uses the latest installer will hit this issue when the user chooses the full-disk LVM partitioning option.

 * GRUB cannot be installed because the boot device variable in the installer is wrong: it tries to install grub to "/dev/mapper" and fails. The default boot device offered is also "/dev/mapper".

[Test Case]

 * Download netboot files from current installer (vmlinuz and initrd files).

 * Create a KVM guest running from these files, with a NIC connected to the internet.

 * Initiate a network installation inside the KVM guest, choosing the Entire Disk - LVM partitioning option.

 * Wait for the installation to reach the grub-install phase. It will ask where to install grub, offering "/dev/mapper" as the default. Alternatively, it might simply try to grub-install /dev/mapper by default, which will also fail.

 * This happens because the lvm2 package adds a symlink to /dev/disk/by-id/ that grub-installer (used by debian-installer) does not expect when it runs the grub-mkdevicemap command.

[Simplified Test Case]

 * Add a PV + VG + LV to an empty virtio disk in a KVM guest, for example, and check whether the command:

grub-mkdevicemap --no-floppy -m -

lists /dev/vdX1 in front of /dev/vdX. If it does, this is a sign that a

/dev/disk/by-id/*lvm* file exists, which is enough to confuse the installer.
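
A quick check for that symlink could look like the sketch below. It is only an illustration: the helper name has_lvm_pv_links and its optional directory argument are assumptions made here and are not part of the installer.

```shell
#!/bin/sh
# Hypothetical helper (not part of grub-installer): succeed when the given
# directory (default: /dev/disk/by-id) holds at least one lvm-pv-uuid-* entry.
has_lvm_pv_links() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/lvm-pv-uuid-*; do
        # An unmatched glob stays literal, so confirm the entry really exists
        # (-L also catches a symlink whose target is missing).
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `has_lvm_pv_links && echo "lvm-pv-uuid symlink present"` run inside the guest would flag the condition described above.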

[Regression Potential]

There are 3 alternatives to fix this and I have chosen the one I believe has the smallest potential for any type of regression. Comment #30 describes what caused the regression and these 3 alternatives:

(1) To revert this change for current release, since this rule was added to "make navigation a bit easier using PV UUIDs", as the commit says. We would worry about installer changes in the next release.

(2) Another possibility would be to change the logic inside "grub-mkdevicemap.c: make_device_map()->grub_util_iterate_devices()" to ignore all symlinks from /dev/disk/by-id/ containing lvm-pv-uuid-*. We would not have to worry about this in the next release if using debian-installer.

(3) Another option would be to change grub-installer package/logic. Unfortunately, a few days before the full freeze, I don't think messing with the installer would be a good option to avoid regressions (potential regression item would grow in significance).

=> I'm choosing (2) because Ubuntu Foundations already faced a similar situation: when the grub-mkdevicemap.c file was removed from the grub2 code, they re-added it with a quilt patch, judging that to be the easiest and most maintainable approach. I'm doing something similar: patching the patch that re-creates the grub-mkdevicemap.c file so that it ignores /dev/disk/by-id/lvm-pv-uuid-* files (as it already does for other symlinks).

[Other Info]

Comment #26 has the TL;DR version of the problem.
https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/1838525/comments/26

[Original Description]

The Eoan debian-installer ISO fails to install GRUB on LVM installs with virtio storage, as it runs grub-install with /dev/mapper as a target (a directory), even if instructed to target a device.

The following steps to reproduce have been prepared running the 20190730 build, but this has been broken since about June 18, 2019. Steps to reproduce:

$ md5sum eoan-server-amd64.iso
f591e30485e5f0b5117f6c116e538c42 eoan-server-amd64.iso
$ qemu-img create -f raw disk1.img 8G
Formatting 'disk1.img', fmt=raw size=8589934592
$ kvm -m 1024 -boot d -cdrom eoan-server-amd64.iso -drive file=disk1.img,if=virtio

Proceed with all the defaults. In the "Partition disk" step select "Guided - use entire disk and set up LVM". Go ahead accepting the defaults. At the "Install the GRUB boot loader" step select "/dev/vda" as the target device. The installer will actually run `grub-install --force /dev/mapper` and fail after a while. The wrong command is visible both in the d-i screen and by running `ps` on a different console.

Full installer syslog: http://paste.ubuntu.com/p/qtZy86dTp6/

It's interesting how this doesn't happen when not using virtio. If from the commands above the "if=virtio" option is dropped then everything works as expected. In this case the target block device is called /dev/sda instead of /dev/vda.


Paride Legovini (paride) on 2019-09-24
tags: added: rls-ee-incoming
Dimitri John Ledkov (xnox) wrote :

1) doing LVM all in one install manually, with BIOS boot, pops up the question of where to install the bootloader to.... when it shouldn't ask it
2) when selecting it tries to do 'grub-install /dev/mapper' which is bogus
3) manually running 'grub-install /dev/vda' works and forcing the install to continue without installer works and reboots fine
4) maybe this is related to change of default lvm name (now vgubuntu) or change in lvm
5) i think lvm and udev now want bind-mounted /dev and /run, not sure if that is happening or not

This needs escalation.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in debian-installer (Ubuntu):
status: New → Confirmed
Changed in grub-installer (Ubuntu):
status: New → Confirmed
Dimitri John Ledkov (xnox) wrote :

New grub-installer needs to migrate, then new isos should be spun up, then things should be ok. Hopefully.

Changed in grub-installer (Ubuntu):
status: Confirmed → Fix Committed
Changed in grub-installer (Ubuntu):
importance: Undecided → Critical
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub-installer - 1.128ubuntu13

---------------
grub-installer (1.128ubuntu13) eoan; urgency=medium

  [ Cherrypick 1.165 Colin Watson ]
  * On Linux, mount/unmount /run to work around #918590. LP: #1838525

 -- Dimitri John Ledkov <email address hidden> Wed, 25 Sep 2019 16:51:02 +0100

Changed in grub-installer (Ubuntu):
status: Fix Committed → Fix Released
Changed in debian-installer (Ubuntu):
status: Confirmed → Invalid
Changed in debian-installer (Debian):
status: Unknown → New
Dimitri John Ledkov (xnox) wrote :

Above was not enough, install still fails.

Changed in grub-installer (Ubuntu):
status: Fix Released → Triaged
Changed in debian-installer (Ubuntu):
status: Invalid → New
Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/1838525

tags: added: iso-testing

Was able to reproduce, cdebconf shows:

----

Name: grub-installer/bootdev
Template: grub-installer/bootdev
Value: /dev/mapper
Owners: grub-installer

Name: grub-installer/grub-install-failed
Template: grub-installer/grub-install-failed
Owners: grub-installer
Variables:
 BOOTDEV = /dev/mapper

Name: grub-installer/progress/step_install_loader
Template: grub-installer/progress/step_install_loader
Owners: grub-installer
Variables:
 BOOTDEV = /dev/mapper

Name: nobootloader/confirmation_common
Template: nobootloader/confirmation_common
Owners: nobootloader
Variables:
 ROOT = root=/dev/mapper/vgubuntu-root
 BOOT = /dev/mapper/vgubuntu-root
 KERNEL = /boot/vmlinuz

----

this debconf template comes from grub-installer package, using BOOTDEV variable (which is wrongly set to /dev/mapper) from:

grub-installer file:

db_progress STEP 1
db_subst grub-installer/progress/step_install_loader BOOTDEV "$bootdev"
db_progress INFO grub-installer/progress/step_install_loader

I'll step through the same logic as this shell script in the installation console to check why "default_bootdev" and "bootdev" are being wrongly set.

When executing:

/usr/bin/grub-installer /target

From install environment, the following part:

if [ "$bootdev" != "dummy" ] && [ ! "$frdev" ]; then
        # check for a preseeded value
        db_get grub-installer/bootdev || true
        if [ -n "$RET" ] ; then
                bootdev="$RET"
        fi
fi

is the one responsible for getting the bad value.

Which means that something else populated grub-installer/bootdev wrongly.

Checking...

When zeroing debconf value, I was able to see that grub-installer populates it again (grub-installer/bootdev with the wrong value "/dev/mapper"). The code that db_sets it again is:

    elif [ "$(device_to_disk "$cdsrc")" = "$default_bootdev" ] || \
       ([ -n "$hdsrc" ] && [ "$(device_to_disk "$hdsrc")" = "$default_bootdev" ]) || \
       ([ "$default_bootdev" = '(hd0)' ] && \
        (([ -n "$cdfs" ] && [ "$cdfs" != "iso9660" ]) || \
         [ "$hybrid" = true ])) || \
       ([ "$default_bootdev" != '(hd0)' ] && \
        ! partmap "$default_bootdev" >/dev/null && \
        ! grub_probe -t fs -d "$default_bootdev" >/dev/null); then
        db_fget grub-installer/bootdev seen
        if [ "$RET" != true ]; then
            bootfs=$(findfs /boot)
            [ "$bootfs" ] || bootfs="$(findfs /)"
            disk=$(device_to_disk "$bootfs")
            db_set grub-installer/bootdev "$disk"    # <---------- HERE
            state=2
        fi
    fi
    ;;

At that point in the code the values are:

default_bootdev=/dev/vda1
bootfs=/dev/mapper/vgubuntu-root
disk=/dev/mapper

Which means disk is wrong:

disk=$(device_to_disk "$bootfs")

coming from:

# This should probably be rewritten using udevadm or similar.
device_to_disk () {
    echo "$1" | \
        sed 's:\(/dev/\(cciss\|ida\|rs\)/c[0-9]d[0-9][0-9]*\|/dev/mmcblk[0-9]\|/dev/nvme[0-9][0-9]*n[0-9][0-9]*\|/dev/\(ad\|ada\|da\|vtbd\|xbd\)[0-9]\+\|/dev/[hms]d[0-9]\+\|/dev/[a-z]\+\).*:\1:'
}

I'll brb to fix this.

Reproducer:

#!/bin/bash

device_to_disk () {
    echo "$1" | \
        sed 's:\(/dev/[a-z]\+\).*:\1:'
}

device_to_disk /dev/mapper/vgubuntu-root

So device_to_disk() is doing what it is supposed to do and there was NO CHANGE to it (as it appears).

I guess using grub-install at /dev/vda was allowed before (instead of the LVM PV ?), so there wasn't a need to use /dev/mapper/vgubuntu-root as the argument for grub-installer.

It is either that OR the "/dev/mapper" section of this same script, related to:

case $prefix in
...
    /dev/mapper)
    disc_offered_devfs=...
...

couldn't deal with virtio disks on top of LVM (my next task here) as it should.

Simply using:

----

#!/bin/bash

device_to_disk () {
    echo "$1" | \
        sed 's:\(/dev/mapper/.*\+\|/dev/[a-z]\+\).*:\1:'
}

device_to_disk /dev/mapper/vgubuntu-root
device_to_disk /dev/something/else

$ ./temp.sh
/dev/mapper/vgubuntu-root
/dev/something
----

instead keeps the same behavior for everything else AND fixes the case where the device is "/dev/mapper/XXXXX". This is ONE WAY of fixing this, BUT I guess the /dev/mapper section should also work for LVM w/ virtio disks, as stated in the script, so I'm investigating this.
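
To sanity-check that claim, the two variants can be compared over a handful of device paths. The sketch below assumes GNU sed, uses the shortened pattern from the reproducer rather than the full one from grub-installer, and writes the /dev/mapper branch as `/dev/mapper/.\+` for simplicity. The function names are made up for this comparison.

```shell
#!/bin/sh
# Shortened original pattern, as in the reproducer above.
device_to_disk_orig() {
    echo "$1" | sed 's:\(/dev/[a-z]\+\).*:\1:'
}

# Variant with an extra /dev/mapper/ branch (simplified spelling, assumption).
device_to_disk_patched() {
    echo "$1" | sed 's:\(/dev/mapper/.\+\|/dev/[a-z]\+\).*:\1:'
}

# Print "input: original-result -> patched-result" for a few sample paths.
for dev in /dev/mapper/vgubuntu-root /dev/vda1 /dev/sda1 /dev/something/else; do
    printf '%s: %s -> %s\n' "$dev" \
        "$(device_to_disk_orig "$dev")" "$(device_to_disk_patched "$dev")"
done
```

Only the /dev/mapper path should come out differently between the two functions; the other inputs are there to show that the existing behavior is preserved.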

## SUMMARY OF THE ISSUE ##

After reading the grub-installer script, and following execution flow:

...

if [ "$found" = "0" ] && type lvdisplay >/dev/null 2>&1 && \
     (lvdisplay "$disc_offered" | grep -q 'LV Name' 2>/dev/null || \
      [ -e "$(dirname "$disc_offered")/control" ]); then
 # Don't set frdev/frdisk here, otherwise you'll end up in different
 # code paths below ...
 frtype=lvm
fi

...

default_bootdev_os="$($chroot $ROOT grub-mkdevicemap --no-floppy -m - | head -n1 | cut -f2)"
if [ "$default_bootdev_os" ]; then
 default_bootdev="$($chroot $ROOT readlink -f "$default_bootdev_os")"
 if db_get grub-installer/bootdev && [ "$RET" = '(hd0)' ]; then
  db_set grub-installer/bootdev "$default_bootdev"
 fi
else
 default_bootdev="(hd0)"
fi

DEBUG:

default_bootdev_os=/dev/disk/by-id/lvm-pv-uuid-IKvzNq-IRE5-arLC-hSS1-jDU2-ZSKA-90lATY
and bootdev wasn't set to anything because it isn't '(hd0)'.

...

elif [ "$(device_to_disk "$cdsrc")" = "$default_bootdev" ] || \
   ([ -n "$hdsrc" ] && [ "$(device_to_disk "$hdsrc")" = "$default_bootdev" ]) || \
   ([ "$default_bootdev" = '(hd0)' ] && \
    (([ -n "$cdfs" ] && [ "$cdfs" != "iso9660" ]) || \
     [ "$hybrid" = true ])) || \
   ([ "$default_bootdev" != '(hd0)' ] && \
    ! partmap "$default_bootdev" >/dev/null && \
    ! grub_probe -t fs -d "$default_bootdev" >/dev/null); then
 db_fget grub-installer/bootdev seen
 if [ "$RET" != true ]; then
  bootfs=$(findfs /boot)
  [ "$bootfs" ] || bootfs="$(findfs /)"
  disk=$(device_to_disk "$bootfs")
  db_set grub-installer/bootdev "$disk"
  state=2
 fi
fi

DEBUG:

bootdev=/dev/mapper after disk=$(device_to_disk "$bootfs")
since bootfs=/dev/mapper/vgubuntu-root
and device_to_disk removes the last part from /dev/mapper/XXX.

bootfs is /dev/mapper/vgubuntu-root BECAUSE:

rootfs=$(findfs /)
bootfs=$(findfs /boot)
[ -n "$bootfs" ] || bootfs="$rootfs"

so, device_to_disk() has cut "vgubuntu-root" from the string and that is how bootdev has become "/dev/mapper" only.

----

Testing with a SCSI disk, I got the exact same DEBUG output, BUT ...

I was able to see that the INITIAL value for:

grub-installer/bootdev WAS "/dev/sda"

but, after purging it with "debconf-set grub-installer/bootdev ''", I could NOT get the db variable repopulated just by executing "grub-installer /target". This DIFFERS from the /dev/vda based LVM case, where executing "grub-installer /target" ALWAYS repopulates it as described above (with /dev/mapper).

TODO:

discover why grub-installer/bootdev isn't repopulated when re-executing "grub-installer /target" after purging the original (right after installation) value. Whenever bootdev defaults to /dev/mapper/XXXX, bootdev should be replaced with /dev/sdXXX or /dev/vdXXX, one of the LVM PV disks of the underlying volume group (to confirm).

I've got 3 installation questions files:

1) LVM on top of /dev/vda[1] device (with failed grub-install run).
2) LVM on top of /dev/sda[1] device (without grub-install run).
3) LVM on top of /dev/sda[1] device (with successful grub-install run).

Differences between (1) and (2):

https://pastebin.ubuntu.com/p/scJXH3kTqM/

Differences between (1) and (3):

https://pastebin.ubuntu.com/p/6dB2yzsqMd/

The most important ones are:

Name: debconf/priority -> medium in (1) and nothing in (2) and (3).
Name: grub-installer/bootdev -> /dev/mapper in (1) due to previous comments.
Name: grub-installer/only_debian -> true in (2) and (3).

Hrm, not much to show after a long debugging session :-/
Failing to see where/why/how the debconf choices are populated differently for sata-vs-virtio, I started to compare things with disco ...

As expected it works there and grub-installer is called with /dev/vda as argument.

Next I compared the d-i components between disco and eoan, but other than the name change in bug 1782507 nothing seemed closely related. And that change came after the breakage reported here began (per paride, since June 18, 2019).

Next I analyzed grub-probe behavior in the three environments eoan-sata, eoan-virtio and disco-virtio.
But this part always behaves the same reporting the volume group.

So it really seems to come down to finding why d-i populated the values differently, as Rafael already assumed.

P.S. While debugging I wondered, for further work, about d-i being built from components. Is there a good way (or scripts) to swap those, e.g. an older/newer partman-lvm, and then rebuild the ISO?

Andreas Hasenack (ahasenack) wrote :

Do we have a sh -x run of grub-installer with lvm + virtio and lvm + sata?

Andreas Hasenack (ahasenack) wrote :

set -x of grub-installer in the virtio (failing) case

Andreas Hasenack (ahasenack) wrote :

set -x of grub-installer in the non-virtio (working) case

tags: added: id-5d8cf66b49519e737f6e6856

Thanks a lot for those 2 files. I compared them side by side and the following caught my attention:

https://pastebin.ubuntu.com/p/VJXrJHsqRB/

this difference:

+ partmap /dev/vda1 | + partmap /dev/sda
Error: /dev/vda1: unrecognised disk label <

I'm currently investigating this (and why it picked /dev/vda1 instead of /dev/vda like it should). That likely triggered the logic that keeps select_bootdev as "/dev/mapper":

+ log 'debug: select_bootdev: arg='"'"'/dev/mapper'"'" | + log 'debug: select_bootdev: arg='"''"

After Andreas and I compared side by side the 2 "set -x" outputs, we discovered what is happening.

https://pastebin.ubuntu.com/p/WyvnChrqBQ/

So "/dev/vda1" is being picked instead of "/dev/vda", whereas in the SCSI case "/dev/sda" is picked (not "/dev/sda1"). The logic responsible comes a bit above that execution output:

https://pastebin.ubuntu.com/p/ZRNHfnmJ4P/

This is why Andreas brings up "grub-mkdevicemap" here; the "head -n1" usage is what caught his eye:

17:10 <rafaeldtinoco> instead of /dev/vda
17:10 <andreas> cyphermox: I wonder about this head -n1
17:10 <andreas> default_bootdev_os="$($chroot $ROOT grub-mkdevicemap --no-floppy -m - | head -n1 | cut -f2)"
17:10 <andreas> if it's making an assumption
17:10 <cyphermox> well, usually hd0 is really what you want
17:10 <cyphermox> and it works for sda
17:10 <cyphermox> I mean, what is so different about virtio?
17:10 <andreas> yeah, I don't have the raw output in the sda case

We then tested both, virtio and non-virtio installations and saw that, for the SCSI case:

bash-5.0# grub-mkdevicemap --no-floppy -m -
(hd0) /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0
(hd1) /dev/disk/by-id/lvm-pv-uuid-oKqHJy-nRnC-nLm3-AuxT-FmpR-CCox-ltoWeB

the disk is listed first (as hd0) by grub-mkdevicemap, while in the VIRTIO case it is listed SECOND (explaining why /dev/mapper was generated by the subsequent, wrong logic below).

https://pastebin.ubuntu.com/p/hCNPH8GjgS/
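
The effect of that ordering on the installer's `head -n1 | cut -f2` pipeline can be illustrated offline by feeding it device maps as text. In this sketch the helper name first_map_device is made up, the SCSI map is the one shown above, and the virtio map is an assumed reconstruction (the PV symlink from the earlier DEBUG output first, the whole disk second). The tab separator is inferred from the installer's use of `cut -f2`.

```shell
#!/bin/sh
# Hypothetical helper: print the device of the first device-map entry read
# from stdin, the same way grub-installer derives default_bootdev_os.
first_map_device() {
    head -n1 | cut -f2
}

# SCSI case, as reported in this bug: the whole disk comes first.
scsi_map="$(printf '(hd0)\t%s\n(hd1)\t%s\n' \
    /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0 \
    /dev/disk/by-id/lvm-pv-uuid-oKqHJy-nRnC-nLm3-AuxT-FmpR-CCox-ltoWeB)"

# Virtio case without a serial (assumed layout): the PV symlink comes first.
virtio_map="$(printf '(hd0)\t%s\n(hd1)\t%s\n' \
    /dev/disk/by-id/lvm-pv-uuid-IKvzNq-IRE5-arLC-hSS1-jDU2-ZSKA-90lATY \
    /dev/vda)"

printf 'scsi:   %s\n' "$(printf '%s\n' "$scsi_map" | first_map_device)"
printf 'virtio: %s\n' "$(printf '%s\n' "$virtio_map" | first_map_device)"
```

In the virtio case the pipeline selects the lvm-pv-uuid symlink, which resolves to the partition (/dev/vda1) rather than the disk, matching the default_bootdev value seen in the debug output.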


17:45 <andreas> cyphermox: I tried with a serial number for vda, it failed in the same way
17:46 <cyphermox> andreas: not necessarily a serial
17:46 <andreas> isn't the serial what allows the by-id symlink to be created?
17:46 <rafaeldtinoco> andreas: that was my previous summary
17:46 <rafaeldtinoco> in the case
17:46 <rafaeldtinoco> (about the dbconfig variable)
17:46 <andreas> k
17:47 <rafaeldtinoco> SO
17:47 <rafaeldtinoco> device-map is gone in eoan =) there is a patch restoring it
17:47 <rafaeldtinoco> let me check previous versions :\
17:47 <rafaeldtinoco> https://www.irccloud.com/pastebin/h3c9ZFYP/
17:48 <andreas> these are the disk/by-id links in both cases: https://people.canonical.com/~andreas/disk-by-id.png
17:48 <andreas> cyphermox: ^
17:48 <andreas> there is no symlink to the whole vda device
17:48 <andreas> which is what you meant probably
17:49 <rafaeldtinoco> andreas: this is coming from udevadm DEVLINKS most likely
17:49 <andreas> but this is on a bionic host, where qemu/libvirt doesn't generate a serial for you automatically
17:50 <rafaeldtinoco> #ifdef __linux__
17:50 <rafaeldtinoco> {
17:50 <rafaeldtinoco> DIR *dir = opendir ("/dev/disk/by-id");
17:50 <andreas> in the straight-to-virtio case it worked flawlessly, bootdev was /dev/vda
17:51 <rafaeldtinoco> from deviceiter.c -> grub_util_iterate_devices
17:51 <andreas> and there is no symlink in /dev/disk/by-id
17:51 <rafaeldtinoco> used by grub-mkdevicemap.c
17:51 <rafaeldtinoco> which is inexistent in recent grub2 upstream
17:51 <rafaeldtinoco> we basically recreate the grub-mkdevicemap again
17:51 <rafaeldtinoco> so we can rely on it for our scripts
17:51 <rafaeldtinoco> so deviceiter.c explains the "sort" for that list
17:52 <rafaeldtinoco> /* Now add all the devices in sorted order. */
17:52 <rafaeldtinoco> for (dev = 0; dev < devs_len; ++dev)
17:52 <rafaeldtinoco> =)
17:52 <andreas> so if you look at this comparison: https://people.canonical.com/~andreas/disk-by-id.png
17:52 <andreas> that matches what that d-i prompt is asking us, and the output of grub-mkdevicemap with some removed/filtered
17:53 <rafaeldtinoco> you are getting dm-0 first
17:53 <andreas> that one is filtered
17:53 <rafaeldtinoco> instead of sda (left side)
17:53 <andreas> by "something"
17:53 <rafaeldtinoco> ah ok
17:53 <andreas> left side, sda, then sda1
17:53 <andreas> right side, I get vda1
17:53 <rafaeldtinoco> we have no vda
17:53 <rafaeldtinoco> at the right side
17:53 <andreas> yet it's in the menu: https://people.canonical.com/~andreas/which-dev-is-first.png
17:54 <andreas> but with no "(extra description)"
17:54 <rafaeldtinoco> i mean in /dev/disk/by-id
17:54 <rafaeldtinoco> where is vda only
17:54 <andreas> right, no vda there, probably because it has no serial
17:54 <andreas> I don't have screenshots for the case with a serial now
17:54 <andreas> so many combinations to try
17:54 <rafaeldtinoco> no serial because its not SCSI
17:54 <andreas> yes
17:54 <rafaeldtinoco> no VPDs
17:54 <rafaeldtinoco> now it all makes sense
17:54 <andreas> qemu on eoan assigns a serial
17:55 <rafaeldtinoco> let me copy this to the public bug =)
17:55 <andreas> 00001 or something
17:55 <andreas> so y...


Andreas Hasenack (ahasenack) wrote :

Looks like we are narrowing this down to a virtio device with no serial number. On a bionic host, if I manually add a serial number to the virtio device (in virt-manager, just before starting the installation, for example) then this works.

Andreas Hasenack (ahasenack) wrote :

/dev/disk/by-id contents in three cases: lvm without virtio, lvm with virtio, and lvm with virtio and a serial number for vda

Andreas Hasenack (ahasenack) wrote :

boot device prompt in three cases: lvm without virtio, lvm with virtio, and lvm with virtio and a serial number for vda

Andreas Hasenack (ahasenack) wrote :

The lvm with virtio and a serial number case worked, i.e, the installation finished, and grub was installed on /dev/vda.

SUMMARY of the problem (or a HUGE TL;DR):

- Installer depends on the "grub-mkdevicemap --no-floppy -m -" command to get the ordering of bootable devices.
- grub-mkdevicemap was dropped upstream and is included in grub2 through a quilt patch.
- grub-mkdevicemap orders everything that is in /dev/disk/by-id/*, excluding everything containing "-part", "dm-" and "md-" (checked in this order).
- LVM partitions are added to /dev/disk/by-id, but not the entire disk (as the PV is the partition itself).
- UDEV creates the /dev/disk/by-id entries according to 60-persistent-storage.rules:

# virtio-blk
KERNEL=="vd*[!0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}"
KERNEL=="vd*[0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}-part%n"

- So, LVM puts ID_SERIAL in LVM partitions, they get added to /dev/disk/by-id, and the installer gets lost when trying to order them: the LVM partition ends up in 1st position instead of the full disk (for the hd0, hd1, ... grub setup).
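
The name filtering described in this summary can be approximated in shell as follows. This is a rough sketch, not the actual C code behind grub-mkdevicemap: the function name is hypothetical, "-part" is matched anywhere in the name while "dm-" and "md-" are assumed to be prefixes, and the optional "fixed" mode only illustrates what additionally skipping lvm-pv-uuid-* entries would do.

```shell
#!/bin/sh
# Hypothetical approximation of the by-id name filter. Reads one name per
# line on stdin and prints the names that would be kept.
# With "fixed" as the first argument, lvm-pv-uuid-* names are skipped too.
filter_by_id_names() {
    mode="${1:-current}"
    while IFS= read -r name; do
        case "$name" in
            *-part*|dm-*|md-*)
                continue ;;
            lvm-pv-uuid-*)
                if [ "$mode" = fixed ]; then
                    continue
                fi ;;
        esac
        printf '%s\n' "$name"
    done
}
```

For instance, `ls /dev/disk/by-id | filter_by_id_names` and `ls /dev/disk/by-id | filter_by_id_names fixed` can be compared on an affected guest to see which entries remain candidates for (hd0).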

description: updated
description: updated

xenial - does not ask device to grub-install, finds /dev/vda correctly

~ # pvs
  PV VG Fmt Attr PSize PFree
  /dev/vda5 ubuntu-vg lvm2 a-- 29.28g 40.00m

~ # vgs
  VG #PV #LV #SN Attr VSize VFree
  ubuntu-vg 1 2 0 wz--n- 29.28g 40.00m

~ # lvs
  LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  root ubuntu-vg -wi-ao---- 28.29g
  swap_1 ubuntu-vg -wi-ao---- 976.00m

~ # ls -lah1 /dev/disk/by-id/
drwxr-xr-x 2 120 Oct 2 17:54 .
drwxr-xr-x 5 100 Oct 2 18:05 ..
lrwxrwxrwx 1 10 Oct 2 18:07 dm-name-ubuntu--vg-root -> ../../dm-0
lrwxrwxrwx 1 10 Oct 2 18:05 dm-name-ubuntu--vg-swap_1 -> ../../dm-1
lrwxrwxrwx 1 10 Oct 2 18:07 dm-uuid-LVM-rKxFFLzSxoQE5R09qa8ztKdfGKZIxktySnE1WunTpz1cKQt8elMkWyzXF25Dhqgt -> ../../dm-0
lrwxrwxrwx 1 10 Oct 2 18:05 dm-uuid-LVM-rKxFFLzSxoQE5R09qa8ztKdfGKZIxktyoOA5IYfjtokmG0cVkmlrLRZfr0gbBZCh -> ../../dm-1

/lib/udev/rules.d # grep -ril lvm *
55-dm.rules
60-persistent-storage-dm.rules

----

bionic - does not ask device to grub-install, finds /dev/vda correctly

~ # pvs
  PV VG Fmt Attr PSize PFree
  /dev/vda1 ubuntu-vg lvm2 a-- <30.00g 12.00m

~ # vgs
  VG #PV #LV #SN Attr VSize VFree
  ubuntu-vg 1 2 0 wz--n- <30.00g 12.00m

~ # lvs
  LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  root ubuntu-vg -wi-ao---- 29.03g
  swap_1 ubuntu-vg -wi-ao---- 976.00m

~ # ls -lah1 /dev/disk/by-id/
drwxr-xr-x 2 root root 120 Oct 2 18:10 .
drwxr-xr-x 6 root root 120 Oct 2 18:27 ..
lrwxrwxrwx 1 root root 10 Oct 2 18:27 dm-name-ubuntu--vg-root -> ../../dm-0
lrwxrwxrwx 1 root root 10 Oct 2 18:27 dm-name-ubuntu--vg-swap_1 -> ../../dm-1
lrwxrwxrwx 1 root root 10 Oct 2 18:27 dm-uuid-LVM-tuUboqieQsJtu1wB4OyTbAT5pRwXxZj7C8q7wogJHSUaOrIBRvJ2GL7ZcRFz4nCe -> ../../dm-0
lrwxrwxrwx 1 root root 10 Oct 2 18:27 dm-uuid-LVM-tuUboqieQsJtu1wB4OyTbAT5pRwXxZj7d1U7ckfbgo1bRtQnDa5XQsaoXSpjCvfV -> ../../dm-1

/lib/udev/rules.d # grep -ril lvm *
55-dm.rules
60-persistent-storage-dm.rules
95-dm-notify.rules

----

disco - does not ask, but has only /dev/vda to choose

~ # pvs
  PV VG Fmt Attr PSize PFree
  /dev/vda1 ubuntu-vg lvm2 a-- <30.00g 12.00m

~ # vgs
  VG #PV #LV #SN Attr VSize VFree
  ubuntu-vg 1 2 0 wz--n- <30.00g 12.00m

~ # lvs
  LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  root ubuntu-vg -wi-ao---- 29.03g
  swap_1 ubuntu-vg -wi-ao---- 976.00m

~ # ls -lah1 /dev/disk//by-id/
drwxr-xr-x 2 root root 120 Oct 2 18:09 .
drwxr-xr-x 6 root root 120 Oct 2 18:27 ..
lrwxrwxrwx 1 root root 10 Oct 2 18:27 dm-name-ubuntu--vg-root -> ../../dm-0
lrwxrwxrwx 1 ...


Now we will dig into:

lvm2: /lib/udev/rules.d/69-lvm-metad.rules

and check if that udev rule is really needed (what for ?) and then we'll know what to fix: the installer (grub-mkdevicemap logic) or lvm2 itself (udev rules).

no longer affects: grub-installer (Ubuntu Eoan)
no longer affects: debian-installer (Ubuntu Eoan)
Changed in grub-installer (Ubuntu):
status: Triaged → In Progress
Changed in lvm2 (Ubuntu):
status: New → In Progress
Changed in debian-installer (Ubuntu):
status: New → In Progress
Changed in lvm2 (Ubuntu):
importance: Undecided → Critical
Changed in debian-installer (Ubuntu):
importance: Undecided → Critical
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in grub-installer (Ubuntu):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in lvm2 (Ubuntu):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
no longer affects: grub-installer (Ubuntu)
Changed in grub2 (Ubuntu):
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)

The following commit:

commit 417e52c13a8156b11c25c411d44bda8b32bf87e4
Author: Peter Rajnoha <email address hidden>
Date: Tue Feb 18 07:27:21 2014

    udev: create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for a PV

    We already have /dev/disk/by-id/dm-uuid-... (which encompasses the
    VG UUID and LV UUID in case of LVs since the mapping's UUID is
    VG+LV UUID together) and /dev/disk/by-id/dm-name-... (which encompasses
    the VG and LV name in case of LVs).

    This patch addds /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> that completes
    this scheme and makes navigation a bit easier using PV UUIDs since
    one can navigate using PV UUIDs only and there's no need to do extra
    PV UUID <--> kernel name matching (the PV UUID is stable across reboots).
    This may come in handy in various scripts.

    Since we already have the PV UUID stored in udev database (as a result
    of blkid call - returned in ID_FS_UUID blkid's variable), this operation
    is very cheap indeed, just creating the extra one symlink.

diff --git a/WHATS_NEW b/WHATS_NEW
index 39e8b886a..5bb37d8ad 100644
--- a/WHATS_NEW
+++ b/WHATS_NEW
@@ -1,5 +1,6 @@
 Version 2.02.106 -
 ====================================
+ Create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for each PV via udev.
   lvcreate computes RAID4/5/6 stripes if not given from # of allocatable PVs.
   Fix merging of old snapshot into thin volume origin.
   Use --ignoreskippedcluster in lvm2-monitor initscript/systemd unit.
diff --git a/udev/69-dm-lvm-metad.rules.in b/udev/69-dm-lvm-metad.rules.in
index e8304b5e0..bd75fc8ef 100644
--- a/udev/69-dm-lvm-metad.rules.in
+++ b/udev/69-dm-lvm-metad.rules.in
@@ -34,6 +34,9 @@ ENV{DM_MULTIPATH_DEVICE_PATH}=="1", GOTO="lvm_end"
 # Inform lvmetad about any PV that is gone.
 ACTION=="remove", GOTO="lvm_scan"

+# Create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for each PV
+ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-id/lvm-pv-uuid-$env{ID_FS_UUID_ENC}"
+
 # If the PV is a special device listed below, scan only if the device is
 # properly activated. These devices are not usable after an ADD event,
 # but they require an extra setup and they are ready after a CHANGE event.

is the one that introduced the change causing

/dev/disk/by-id/ to be populated with the LVM PV (not the entire disk), breaking the current installer logic. There are 2 alternatives here:

 To revert this change for current release, since this rule was added to "make navigation a bit easier using PV UUIDs", as the commit says. We would worry about installer changes in the next release.

 Another possibility would be to change the logic inside "grub-mkdevicemap.c: make_device_map()->grub_util_iterate_devices()" to ignore all symlinks from /dev/disk/by-id/ containing lvm-pv-uuid-*. We would not have to worry about this in the next release if using debian-installer. I'm choosing this option because ubuntu foundations already faced a similar situation, when grub-mkdevicemap.c file was removed from grub2 code and they re-added it by using a quilt patch, assuming it was the easiest and better to maintain. I'm doing something similar, patching the patch that crea...


description: updated
description: updated

Verifying the fix from the PPA:

(k)rafaeldtinoco@keoan:~$ ls /dev/disk/by-id
ls: cannot access '/dev/disk/by-id': No such file or directory

(k)rafaeldtinoco@keoan:~$ sudo grub-mkdevicemap --no-floppy -m -
(hd0) /dev/vda
(hd1) /dev/vdb

This is the expected behavior (by installer).

(k)rafaeldtinoco@keoan:~$ sudo pvcreate /dev/vdb
  Physical volume "/dev/vdb" successfully created.

(k)rafaeldtinoco@keoan:~$ sudo vgcreate vgubuntu /dev/vdb
  Volume group "vgubuntu" successfully created

(k)rafaeldtinoco@keoan:~$ sudo lvcreate -l100%VG -n lvubuntu vgubuntu
  Logical volume "lvubuntu" created.

(k)rafaeldtinoco@keoan:~$ sudo grub-mkdevicemap --no-floppy -m -
(hd0) /dev/disk/by-id/lvm-pv-uuid-tdcf0A-qcPJ-1A1H-Oh2G-I1vE-GQPx-tjjeva
(hd1) /dev/vda

After installing the fix:

(k)rafaeldtinoco@keoan:~$ ls -1 /dev/disk/by-id
dm-name-vgubuntu-lvubuntu
dm-uuid-LVM-btjm85kImnUtxMqC6ObQmM64kO1Bop7Gv7v1qbZIaKsBfOB0uyFfC3cG37LaB3KN
lvm-pv-uuid-tdcf0A-qcPJ-1A1H-Oh2G-I1vE-GQPx-tjjeva

(k)rafaeldtinoco@keoan:~$ sudo grub-mkdevicemap --no-floppy -m -
(hd0) /dev/vda
(hd1) /dev/vdb

Back to same behavior!

description: updated
Paride Legovini (paride) wrote :

Thanks Rafael, Andreas and everybody for the great work done here! I successfully tested your fix as follows:

1. Followed the steps in the original description of this
   bug up the point where the installer tries to install
   grub to /dev/mapper and fails.

2. Replaced /target/usr/sbin/grub-mkdevicemap with the
   one extracted from the grub-common package in your PPA.

3. Retried the "install boot loader" installer step.

4. Success!

Funny how the original grub-mkdevicemap and your fixed version have the same size up to the byte.

Changed in grub2 (Ubuntu):
status: In Progress → Fix Committed
Changed in debian-installer (Ubuntu):
status: In Progress → Invalid
Changed in lvm2 (Ubuntu):
status: In Progress → Invalid
Changed in debian-installer (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in lvm2 (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in debian-installer (Ubuntu):
importance: Critical → Undecided
Changed in lvm2 (Ubuntu):
importance: Critical → Undecided
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.04-1ubuntu10

---------------
grub2 (2.04-1ubuntu10) eoan; urgency=medium

  * debian/patches/ubuntu-skip-disk-by-id-lvm-pvm-uuid-entries.patch:
    skip /dev/disk/by-id/lvm-pvm-uuid entries from device iteration.
    (LP: #1838525)

 -- Rafael David Tinoco <email address hidden> Mon, 07 Oct 2019 23:23:54 -0300

Changed in grub2 (Ubuntu):
status: Fix Committed → Fix Released
vmware-gos-Yuhua (yhzou) wrote :

Verified: this issue is fixed in GA iso
