Update-grub fails with a second encrypted, but locked rpool

Bug #2016778 reported by Danny
This bug affects 1 person
Affects: grub2 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello,

I've updated my Ubuntu 20.04 to 22.04 by creating a new root pool and installing Ubuntu. I kept the old root pool around in case something went wrong. Both root pools use ZFS encryption.

I ran into a problem when running update-grub with only the new root pool unlocked. The old pool was not imported, so its key had not been loaded.

Running update-grub produced a grub.cfg file that contained only the boilerplate and no boot stanzas, and there was no obvious error message.

After a few hours of debugging, I realised that a step in the middle of the script fails when it tries to mount a dataset from the inactive root pool, because no key is loaded for it. Despite that failure, the script keeps running and writes out a grub.cfg file with no stanzas.

Only later did I realise that a message had been logged, which I hadn't paid much attention to because the script appeared to finish without problems: zfs_mount_at() failed: encryption key not loaded

Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Some pools couldn't be imported and will be ignored:
cannot import 'zfs_data_old': one or more devices is currently unavailable
Found linux image: vmlinuz-5.15.0-69-generic in rpool/ROOT/ubuntu_2204
Found initrd image: initrd.img-5.15.0-69-generic in rpool/ROOT/ubuntu_2204
Found linux image: vmlinuz-5.15.0-58-generic in rpool/ROOT/ubuntu_2204
Found initrd image: initrd.img-5.15.0-58-generic in rpool/ROOT/ubuntu_2204
zfs_mount_at() failed: encryption key not loadedWarning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
Adding boot menu entry for UEFI Firmware Settings ...
done

This is terrible UI: the script finishes as if it had succeeded, but it leaves the system unbootable. The error message is easy to miss, and it's unclear where it comes from, especially since I hadn't imported the old root pool myself (update-grub still finds it and tries to go through it).
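For anyone trying to confirm the same cause, a rough sketch of how locked datasets could be spotted: it filters the tab-separated output of `zfs get -H -o name,value keystatus` for entries whose key status is `unavailable`. The function name `locked_datasets` is made up for illustration, and the check only sees pools that are currently imported, so a pool that update-grub imports on its own would first have to be imported by hand.

```shell
#!/bin/sh
# Hypothetical helper, not part of grub2 or 10_linux_zfs.
# Reads "name<TAB>keystatus" lines on stdin and prints the names of
# datasets whose encryption key is not loaded.
locked_datasets() {
    awk -F '\t' '$2 == "unavailable" { print $1 }'
}

# Intended use (assumes the pools of interest are already imported):
#   zfs get -H -o name,value keystatus | locked_datasets
```

If this prints anything that belongs to a root pool, a mount of that dataset would be expected to fail with the "encryption key not loaded" message shown above.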

Can you please ensure that the error message is more explicit and that the script fails instead of leaving the system unbootable?
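Until that is fixed, a possible stopgap is to generate the configuration into a temporary file and only install it if it contains at least one kernel entry. The sketch below is an assumption about what such a guard could look like, not how update-grub works today; `has_linux_entry` and `safe_update_grub` are hypothetical names. It looks for a `linux` line rather than any `menuentry`, because the UEFI Firmware Settings entry is a menu entry too and would otherwise hide the problem.

```shell
#!/bin/sh
# Hypothetical guard around grub-mkconfig; not part of grub2.

# Return 0 if the file contains at least one "linux ..." kernel line.
has_linux_entry() {
    grep -Eq '^[[:space:]]*linux(efi)?[[:space:]]' "$1"
}

# Generate into a temporary file and replace the live config only if
# generation succeeded and the result contains a kernel entry.
safe_update_grub() {
    target=${1:-/boot/grub/grub.cfg}
    tmp=$(mktemp) || return 1
    if ! grub-mkconfig -o "$tmp"; then
        echo "grub-mkconfig failed; keeping $target" >&2
        rm -f "$tmp"
        return 1
    fi
    if ! has_linux_entry "$tmp"; then
        echo "no kernel entries generated; keeping $target" >&2
        rm -f "$tmp"
        return 1
    fi
    # Copy the contents instead of moving, since /tmp and /boot may be
    # on different filesystems.
    cat "$tmp" > "$target" && rm -f "$tmp"
}
```

Run as root, `safe_update_grub` would leave the existing grub.cfg untouched in the situation described here, instead of replacing it with a file that has no stanzas.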

I believe the issue is in get_dataset_info() of 10_linux_zfs, but my shell scripting skills are poor at best.

Thanks,
Danny
