grub_dpkg writes wrong device into debconf

Bug #1993503 reported by Daniel Krambrock
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| cloud-init | Expired | Medium | Unassigned | |
| subiquity | New | Undecided | Unassigned | |

Bug Description

After auto-installing Ubuntu 22.04 onto an LV on a two-disk mdraid RAID 1, cc_grub_dpkg overwrites the correct `grub-pc/install_devices` debconf entry with a wrong one on first boot:

```
~# debconf-show grub-pc | grep grub-pc/install_devices:
* grub-pc/install_devices: /dev/disk/by-id/dm-name-vg0-lv_root
```

This breaks grub updates.

Brett Holman (holmanb) wrote :

Hi Daniel,

Thanks for reporting this issue.

Could you please include logs[1] to help assist in resolving this bug?

I added the subiquity project to this bug since you mentioned autoinstall (subiquity passes a configuration to and runs cloud-init).

[1]https://cloudinit.readthedocs.io/en/latest/topics/bugs.html#collect-logs

Daniel Krambrock (danielky) wrote :

Hi Brett,

this involves subiquity since it seeds cloud-init through user-data. subiquity's fault is that it does not skip cc_grub_dpkg by adding 'grub_dpkg: {enabled: false}' to user-data. As a result, the user's choice in subiquity of where to install grub is overridden by cc_grub_dpkg on first boot.
From my point of view, subiquity could either skip the user choice and stick with auto-detection (as cc_grub_dpkg does by default) or let the user choose which disk grub is installed onto. In the auto-detection case, cc_grub_dpkg should be skipped, since the installer has already done auto-detection with a mechanism that is at least as good. In the other case, cc_grub_dpkg has to be skipped, since its auto-detection overrides the user's choice.

Daniel Krambrock (danielky) wrote :

Hi Brett.

As for cc_grub_dpkg:

When /boot is on an mdraid, cc_grub_dpkg is not fit to decide where to install grub. In my case the layout is:

```
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 79.9M 1 loop /snap/lxd/22923
loop1 7:1 0 62M 1 loop /snap/core20/1587
loop2 7:2 0 47M 1 loop /snap/snapd/16292
loop3 7:3 0 48M 1 loop /snap/snapd/17336
loop4 7:4 0 63.2M 1 loop /snap/core20/1623
loop5 7:5 0 103M 1 loop /snap/lxd/23541
sda 8:0 0 16G 0 disk
├─sda1 8:1 0 2G 0 part [SWAP]
└─sda2 8:2 0 14G 0 part
  └─md0 9:0 0 14G 0 raid1
    └─vg0-lv_root 253:0 0 10G 0 lvm /
sdb 8:16 0 16G 0 disk
├─sdb1 8:17 0 2G 0 part [SWAP]
└─sdb2 8:18 0 14G 0 part
  └─md0 9:0 0 14G 0 raid1
    └─vg0-lv_root 253:0 0 10G 0 lvm /
```

And grub is installed on both sda and sdb.
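To make the expected answer concrete, here is a minimal sketch of how the physical disks behind a mountpoint could be derived from the block-device tree. It is not what cc_grub_dpkg does. It assumes the tree comes from `lsblk --json` and has the `name`, `type`, `children` and `mountpoints` keys; the function name `disks_backing` and the trimmed sample data are hypothetical.

```python
from typing import Any, Dict, List


def disks_backing(mountpoint: str, blockdevices: List[Dict[str, Any]]) -> List[str]:
    """Return the top-level disks whose subtree contains `mountpoint`."""

    def subtree_has_mount(node: Dict[str, Any]) -> bool:
        # Assumption: each node lists its mountpoints under "mountpoints".
        if mountpoint in (node.get("mountpoints") or []):
            return True
        return any(subtree_has_mount(c) for c in node.get("children", []))

    return [
        "/dev/" + dev["name"]
        for dev in blockdevices
        if dev.get("type") == "disk" and subtree_has_mount(dev)
    ]


def raid_leg(part: str) -> Dict[str, Any]:
    """Build one RAID member partition as it appears in the lsblk output above."""
    lv = {"name": "vg0-lv_root", "type": "lvm", "mountpoints": ["/"]}
    md = {"name": "md0", "type": "raid1", "children": [lv]}
    return {"name": part, "type": "part", "children": [md]}


# Trimmed copy of the layout shown above (most loop devices omitted).
sample = [
    {"name": "loop0", "type": "loop", "mountpoints": ["/snap/lxd/22923"]},
    {"name": "sda", "type": "disk", "children": [
        {"name": "sda1", "type": "part", "mountpoints": ["[SWAP]"]},
        raid_leg("sda2"),
    ]},
    {"name": "sdb", "type": "disk", "children": [
        {"name": "sdb1", "type": "part", "mountpoints": ["[SWAP]"]},
        raid_leg("sdb2"),
    ]},
]

print(disks_backing("/", sample))
```

For this layout the sketch yields both RAID member disks, which matches where grub is actually installed.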

cc_grub_dpkg runs `grub-probe -t disk /boot`:
```
~# grub-probe -t disk /boot
/dev/mapper/vg0-lv_root
```

And then:
```
~# udevadm info --root --query=symlink /dev/mapper/vg0-lv_root
/dev/disk/by-id/dm-uuid-LVM-BindKhTTGAfz7SYEYNKBhYfDBDMFj1rUd0gseyG07zzER7dQAaTIie7dxnIpxQZP /dev/disk/by-dname/vg0-lv_root /dev/vg0/lv_root /dev/mapper/vg0-lv_root /dev/disk/by-id/dm-name-vg0-lv_root /dev/disk/by-uuid/8f1733e5-498d-4f1b-aab7-f64acc678e8f
```

It picks the "disk/by-id" item (/dev/disk/by-id/dm-name-vg0-lv_root) and sets the grub-pc debconf item "grub-pc/install_devices" to that. This is in almost any case not a good choice.
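The selection step can be sketched roughly as follows. This is an illustration of the behavior described here, not a copy of the cloud-init source: the function name `pick_by_id` is hypothetical, and sorting before taking the first match is an assumption that happens to reproduce the value observed in debconf.

```python
def pick_by_id(symlinks_output: str, fallback: str) -> str:
    """Pick a /dev/disk/by-id symlink from `udevadm info --query=symlink` output."""
    devices = sorted(symlinks_output.split())
    by_id = [dev for dev in devices if "disk/by-id" in dev]
    # Assumption: the first by-id entry (after sorting) wins.
    return by_id[0] if by_id else fallback


udevadm_output = (
    "/dev/disk/by-id/dm-uuid-LVM-BindKhTTGAfz7SYEYNKBhYfDBDMFj1rUd0gseyG07zzER7dQAaTIie7dxnIpxQZP "
    "/dev/disk/by-dname/vg0-lv_root /dev/vg0/lv_root /dev/mapper/vg0-lv_root "
    "/dev/disk/by-id/dm-name-vg0-lv_root "
    "/dev/disk/by-uuid/8f1733e5-498d-4f1b-aab7-f64acc678e8f"
)

print(pick_by_id(udevadm_output, "/dev/mapper/vg0-lv_root"))
```

Note that nothing in this step checks whether the chosen path is a whole disk, which is why a logical volume ends up as the install device.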

Detecting the raid-setup and the right disks might be out of scope for cc_grub_dpkg, so why not just skip setting the "grub-pc/install_devices" item if it is already set in debconf and there is no "grub-pc/install_devices" given in cc_grub_dpkg config?
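A minimal sketch of the "is it already set?" check, assuming the value is read from `debconf-show grub-pc` output in the format shown in this report. The function name `parse_install_devices` is hypothetical and not part of cloud-init.

```python
from typing import Optional

KEY = "grub-pc/install_devices:"


def parse_install_devices(debconf_show_output: str) -> Optional[str]:
    """Return the current grub-pc/install_devices value, or None if unset."""
    for line in debconf_show_output.splitlines():
        # debconf-show prefixes entries the user has seen with "* ".
        stripped = line.lstrip("* ").strip()
        # The trailing colon in KEY keeps similarly named keys such as
        # grub-pc/install_devices_empty from matching.
        if stripped.startswith(KEY):
            value = stripped[len(KEY):].strip()
            return value or None
    return None


output = "* grub-pc/install_devices: /dev/vda, /dev/vdb\n"
print(parse_install_devices(output))
```

If this returns a value and the module config does not name any install devices, the module could leave debconf untouched.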

Bests, daniel

Daniel Krambrock (danielky) wrote :

Here are the requested logfiles.

Alberto Contreras (aciba) wrote :

Hello Daniel,

It may be worth fully describing how you drive the autoinstallation. If you run subiquity manually, please describe the steps you execute.

Additionally, the installer logs might also be useful. They are located under /var/log/installer; make sure to redact any sensitive information.

> https://bugs.launchpad.net/cloud-init/+bug/1993503/comments/2

For the cases where grub_dpkg should be detected and it is not, a user can enforce that as workaround by adding the following in the autoinstall.yaml:

```
#cloud-config
autoinstall:
  version: 1
  # ...
  user-data:
    grub_dpkg:
      enabled: False
```

> https://bugs.launchpad.net/cloud-init/+bug/1993503/comments/3
> why not just skip setting the "grub-pc/install_devices" item if it is already set in debconf and there is no "grub-pc/install_devices" given in cc_grub_dpkg config?

That sounds like a reasonable behavior in this case, running an installer where grub was installed either by subiquity or by the user. But, in other scenarios, this might not fit or change how users expect grub_dpkg to behave. Per the docs [1], "This module should work correctly by default without any user configuration."

> It picks the "disk/by-id" item (/dev/disk/by-id/dm-name-vg0-lv_root) and sets the grub-pc debconf item "grub-pc/install_devices" to that. This is in almost any case not a good choice.

I agree, this is probably not a good choice in almost any case, but produces a bootable system. Assuming that, aside the change of behavior, I do not see any actionable point in this bug from cloud-init's perspective.

[1] https://cloudinit.readthedocs.io/en/latest/topics/modules.html#grub-dpkg

Daniel Krambrock (danielky) wrote : Re: [Bug 1993503] Re: grub_dpkg writes wrong device into debconf

Hello Alberto,

> Maybe it is worth fully describing how do you drive the
> autoinstallation. If you run subiquity manually, describing what steps
> do you execute.

As an example, I did a manual install:
- stick to defaults but:
   - update to subiquity 22.10.1
   - keymap german
   - custom storage layout:
     - use vda and vdb as boot devices
     - raid-1 over vda2 vdb2
     - / on /dev/md0

This leaves /boot on /dev/md0. Within the installer, after the install has
finished, open a shell:
```
root@ubuntu-server:/# chroot /target/ debconf-show grub-pc | grep grub-pc/install_devices:
* grub-pc/install_devices: /dev/vda, /dev/vdb
```
Everything looks good.

Reboot after install, in a shell:
```
root@danielk:/home/danielk# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
fd0 2:0 1 4K 0 disk
loop0 7:0 0 62M 1 loop /snap/core20/1587
loop1 7:1 0 47M 1 loop /snap/snapd/16292
loop2 7:2 0 79.9M 1 loop /snap/lxd/22923
sr0 11:0 1 1024M 0 rom
vda 252:0 0 10G 0 disk
├─vda1 252:1 0 1M 0 part
└─vda2 252:2 0 10G 0 part
   └─md0 9:0 0 10G 0 raid1
     └─md0p1 259:0 0 10G 0 part /
vdb 252:16 0 10G 0 disk
├─vdb1 252:17 0 1M 0 part
└─vdb2 252:18 0 10G 0 part
   └─md0 9:0 0 10G 0 raid1
     └─md0p1 259:0 0 10G 0 part /

root@danielk:/home/danielk# debconf-show grub-pc | grep grub-pc/install_devices:
* grub-pc/install_devices: /dev/disk/by-id/md-name-ubuntu-server:0
root@danielk:/home/danielk# ls -l /dev/disk/by-id/md-name-ubuntu-server\:0
lrwxrwxrwx 1 root root 9 Oct 31 10:35 /dev/disk/by-id/md-name-ubuntu-server:0 -> ../../md0
```
cloud-init changes 'grub-pc/install_devices' to
'/dev/disk/by-id/md-name-ubuntu-server:0' which is a symlink to '/dev/md0'.

> For the cases where grub_dpkg should be detected and it is not, a user
> can enforce that as workaround by adding the following in the
> autoinstall.yaml:
>
> ```
> #cloud-config
> autoinstall:
>   version: 1
>   # ...
>   user-data:
>     grub_dpkg:
>       enabled: False
> ```

This is only possible for autoinstall. This bug report is partly about
why disabling grub_dpkg in user-data should be the default for both
manual and automatic installs (see
https://bugs.launchpad.net/cloud-init/+bug/1993503/comments/2).

> But, in other scenarios, this might not fit or change how users expect
> grub_dpkg to behave. Per the docs [1], "This module should work
> correctly by default without any user configuration."

From my point of view it does not work correctly.

> I agree, this is probably not a good choice in almost any case, but
> produces a bootable system. Assuming that, aside the change of behavior,
> I do not see any actionable point in this bug from cloud-init's
> perspective.

This is not about whether a bootable system is produced. It simply sets the
debconf entry to a wrong device. On the next grub update, debconf prompts
you to choose the correct devices. For example, running 'dpkg-reconfigure
grub-pc' with the dialog frontend on the described setup asks you to choose a disk
to install grub onto, with no device preselected since '/d...


Brett Holman (holmanb) wrote :

> This is only possible for autoinstall. This bugreport is partly about why disabling grub_dpkg in user-data better be the default for both manual- and auto-install.

Since the requested behavior change can be driven by different user-data from subiquity, I think this doesn't require cloud-init changes.

Since this doesn't require changes to cloud-init, and cloud-init will not run this module unless configured to do so, I'm setting this bug to invalid for cloud-init. I believe this to be a subiquity change.

Please update this bug if you think this is incorrect.

Changed in cloud-init:
status: New → Invalid
Daniel Krambrock (danielky) wrote :

Hi Brett,
you are right, this is to be fixed in subiquity. For subiquity there is no need to run cc_grub_dpkg as shown.

On the other hand, if you choose to run cc_grub_dpkg, it doesn't work correctly in at least the two cases shown (LVM and md-raid). This is why I think this is also a cloud-init bug.

Chad Smith (chad.smith) wrote :

Thanks for the detailed discussions here @danielky.

I think I agree with your last statement in comment #3

> Detecting the raid-setup and the right disks might be out of scope for cc_grub_dpkg, so why not just skip setting the "grub-pc/install_devices" item if it is already set in debconf and there is no "grub-pc/install_devices" given in cc_grub_dpkg config?

In the case of the live installer, subiquity (via curtin) ends up setting grub debconf selections. I do think cloud-init's cc_grub_dpkg::fetch_idevs function[1] could probably grow awareness of debconf selections, as those should be honored with precedence over local discovery using grub-probe.
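As a rough sketch of that precedence (explicit module config first, then an existing debconf selection, then local discovery), assuming the debconf value has already been read and the probe is passed in as a callable. The name `resolve_install_devices` and its signature are hypothetical and do not describe the existing fetch_idevs function.

```python
from typing import Callable, Mapping, Optional, Tuple

KEY = "grub-pc/install_devices"


def resolve_install_devices(
    cfg: Mapping[str, str],
    debconf_value: Optional[str],
    probe: Callable[[], str],
) -> Tuple[str, str]:
    """Return (devices, source) using config > debconf > probe precedence."""
    if cfg.get(KEY):
        return cfg[KEY], "config"
    if debconf_value:
        return debconf_value, "debconf"
    # The probe only runs when nothing else supplied a value.
    return probe(), "probe"


def fake_probe() -> str:
    """Stand-in for discovery via grub-probe and udevadm."""
    return "/dev/disk/by-id/dm-name-vg0-lv_root"


print(resolve_install_devices({}, "/dev/vda, /dev/vdb", fake_probe))
```

Returning the source alongside the value makes it easy to log why a given device was chosen.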

I'll set this to triaged and confirm with the team tomorrow that this is a reasonable approach/consensus for cloud-init.

As for the subiquity aspect of this bug, I agree it seems like providing either:

grub_dpkg:
  enabled: false

-- or --
# reflecting the specific devices that were chosen during auto install would make the most sense to ensure the system boots properly configured and cloud-init does not overwrite values set up by the installer (via curtin)

grub_dpkg:
  grub-pc/install_devices: <dev1> <dev2>

Changed in cloud-init:
status: Invalid → Triaged
importance: Undecided → Medium
Michael Hudson-Doyle (mwhudson) wrote :

Ugh. What a mess. I think I agree that subiquity could disable cc_grub_dpkg by default but also I really think cc_grub_dpkg should not overwrite an existing debconf entry for grub-pc/install_devices.

Alexander Birkner (tyrola) wrote :

"but also I really think cc_grub_dpkg should not overwrite an existing debconf entry for grub-pc/install_devices".

What about pre-built images? We build our images with Packer, and since they are built on real hardware, grub-pc/install_devices is already set. virt-sysprep does not remove these entries either (although that would make sense).

So currently cloud-init takes care of removing the old entry from the image build host and fixing it to the correct one on the "real" hardware. If cloud-init no longer changed existing entries, everyone who uses pre-built images would have to remove these entries before taking the image, which could end up causing more confusion.

James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired