Update-grub fails with a second encrypted, but locked rpool
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
grub2 (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
Hello,
I upgraded from Ubuntu 20.04 to 22.04 by creating a new root pool and installing Ubuntu 22.04 on it. I kept the old root pool around in case something went wrong. Both root pools use ZFS encryption.
I ran into a problem when running update-grub with only the new root pool unlocked. The old pool was not imported, so its key was not loaded.
Running update-grub produced an effectively empty grub.cfg (only the boilerplate, no boot stanzas), and update-grub showed no obvious error message.
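For anyone hitting the same symptom, one way to notice the empty configuration before rebooting is to count the kernel lines in the generated file. The following is only a minimal sketch: the helper names count_kernel_lines and grub_cfg_has_kernels are made up for illustration and are not part of grub2. It counts `linux` lines rather than `menuentry` lines, because the log indicates that a UEFI Firmware Settings entry is still added even when no kernel stanzas are generated.

```shell
#!/bin/sh
# Hypothetical helpers, not part of grub2.

# Print the number of kernel ("linux ...") lines in the given GRUB config.
count_kernel_lines() {
    # grep -c prints 0 (and exits non-zero) when nothing matches; for a
    # missing file it prints nothing, so fall back to 0 in that case.
    n=$(grep -cE '^[[:space:]]*linux(efi|16)?[[:space:]]' "$1" 2>/dev/null || true)
    echo "${n:-0}"
}

# Return 0 if the config contains at least one kernel line, 1 otherwise.
grub_cfg_has_kernels() {
    [ "$(count_kernel_lines "$1")" -gt 0 ]
}
```

Assuming the configuration lives at the usual Ubuntu location, a call such as `grub_cfg_has_kernels /boot/grub/grub.cfg || echo "grub.cfg has no kernel entries" >&2` right after update-grub would flag the problem while the system is still running.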
After a few hours of debugging, I realised that the ZFS part of the script fails partway through: it tries to mount a dataset from the inactive root pool, and the mount fails because no key is loaded. update-grub nevertheless keeps running and writes out a grub.cfg with no boot entries.
Only later did I realise that a message had been logged, which I hadn't paid much attention to because the script appeared to finish without problems: zfs_mount_at() failed: encryption key not loaded
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/
Generating grub configuration file ...
Some pools couldn't be imported and will be ignored:
cannot import 'zfs_data_old': one or more devices is currently unavailable
Found linux image: vmlinuz-
Found initrd image: initrd.
Found linux image: vmlinuz-
Found initrd image: initrd.
zfs_mount_at() failed: encryption key not loadedWarning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_
Adding boot menu entry for UEFI Firmware Settings ...
done
This is terrible UI: the script finishes, apparently successfully, but leaves the system unbootable. The error message is easy to miss, and it's unclear where it comes from, especially since I hadn't imported the old root pool (yet update-grub still finds it and tries to process it).
Can you please ensure that the error message is more explicit and that the script fails instead of leaving the system unbootable?
I believe the issue is in get_dataset_info() of 10_linux_zfs, but my shell scripting skills are poor at best.
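As a rough illustration of the requested behaviour (not a patch against 10_linux_zfs, whose internals are not reproduced here), a guard could query the dataset's keystatus property before attempting a mount and report the affected dataset by name. The function names dataset_key_missing and check_dataset_key, the ZFS_BIN override, and the dataset name in the usage note are assumptions made for this sketch. The only ZFS facts relied on are that `zfs get -H -o value keystatus <dataset>` prints the property value and that a locked encrypted dataset reports `unavailable`.

```shell
#!/bin/sh
# Hypothetical guard; not the actual code of 10_linux_zfs.
# ZFS_BIN may be overridden (e.g. for testing); it defaults to the zfs command.

# Return 0 if the dataset is encrypted and its key is NOT loaded.
dataset_key_missing() {
    status=$("${ZFS_BIN:-zfs}" get -H -o value keystatus "$1" 2>/dev/null) || return 1
    [ "$status" = "unavailable" ]
}

# Print an explicit error naming the dataset and return non-zero when the
# key is missing, so the caller can abort instead of continuing silently.
check_dataset_key() {
    if dataset_key_missing "$1"; then
        echo "ERROR: encryption key for '$1' is not loaded; cannot inspect this dataset" >&2
        return 1
    fi
    return 0
}
```

A caller could then run `check_dataset_key rpool_old/ROOT/ubuntu || exit 1` (the dataset name is only an example) before mounting, which would turn the easily missed zfs_mount_at() message into a clear failure. Whether aborting or skipping the locked pool is the better policy is a design decision for the maintainers.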
Thanks,
Danny