grub-mkconfig fails with separate ZFS /boot on EFI systems: "Warning: didn't find any valid initrd or kernel"

Bug #2110042 reported by Louis Sautier
Affects: grub2 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello,
/etc/grub.d/10_linux_zfs fails to detect kernels when the partitions have the following layout:
root@test ~ $ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme1n1 259:0 0 476.9G 0 disk
├─nvme1n1p1 259:2 0 511M 0 part
└─nvme1n1p2 259:3 0 20.5G 0 part
nvme0n1 259:1 0 476.9G 0 disk
├─nvme0n1p1 259:4 0 511M 0 part /boot/efi
├─nvme0n1p2 259:5 0 20.5G 0 part
└─nvme0n1p3 259:6 0 2M 0 part
root@test ~ $ lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
nvme1n1
├─nvme1n1p1 vfat FAT16 EFI_SYSPART 59A4-242A
└─nvme1n1p2 zfs_member 5000 zp0 7865821403877262216
nvme0n1
├─nvme0n1p1 vfat FAT16 EFI_SYSPART 599B-27B0 906M 12% /boot/efi
├─nvme0n1p2 zfs_member 5000 zp0 7865821403877262216
└─nvme0n1p3 iso9660 Joliet Extension config-2 2025-04-30-19-10-22-00
root@test ~ $ cat /etc/fstab
LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1
root@test ~ $ zfs list
NAME USED AVAIL REFER MOUNTPOINT
zp0 6.66G 13.2G 24K none
zp0/zd0 118M 906M 118M /boot
zp0/zd1 6.55G 13.0G 6.55G /

This happens because /boot/efi is mounted before /boot, which causes a /boot/efi directory to be created in the root dataset (zp0/zd1); I deleted it and it came back at the next boot:
root@test ~ $ mount -o noatime,zfsutil -t zfs zp0/zd1 /mnt/
root@test ~ $ ls -la /mnt/boot/
total 3
drwxr-xr-x 3 root root 3 May 5 19:22 .
drwxr-xr-x 18 root root 24 May 6 11:38 ..
drwxr-xr-x 2 root root 2 May 5 19:22 efi
root@test ~ $ stat /mnt/boot/efi/
  File: /mnt/boot/efi/
  Size: 2 Blocks: 1 IO Block: 131072 directory
Device: 0,29 Inode: 99053 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-05-05 19:22:21.489078537 +0000
Modify: 2025-05-05 19:22:21.489078537 +0000
Change: 2025-05-05 19:22:21.489078537 +0000
 Birth: 2025-05-05 19:22:21.489078537 +0000
root@test ~ $ journalctl -b -o short-iso-precise --grep boot-efi.mount
2025-05-05T19:22:21.493367+00:00 test systemd[1]: Mounting boot-efi.mount - /boot/efi...
2025-05-05T19:22:21.540543+00:00 test systemd[1]: Mounted boot-efi.mount - /boot/efi.
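Before the fix there is no systemd unit for /boot at all (it is mounted later by zfs-mount.service), so nothing orders boot-efi.mount after it. A quick way to check this, assuming the unit names implied by the layout above:
# Only boot-efi.mount (generated from the fstab entry) exists; boot.mount does not.
systemctl list-units --all 'boot.mount' 'boot-efi.mount'
systemctl cat boot-efi.mount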

The fact that the root dataset's /boot is not empty makes this condition true:
https://git.launchpad.net/~ubuntu-core-dev/grub/+git/ubuntu/tree/debian/patches/ubuntu-zfs-enhance-support.patch?h=debian/2.12-5ubuntu5.3#n277
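In essence, a dataset is treated as holding /boot as soon as its boot/ directory is non-empty; an illustrative sketch of that kind of check (variable names assumed, not a verbatim copy of the script):
# The stray efi/ directory makes the root dataset's boot/ non-empty,
# so it wrongly passes this test.
candidate="${mntdir}/boot"
if [ -d "${candidate}" ] && [ -n "$(ls "${candidate}" 2>/dev/null)" ]; then
    boot_dir="${candidate}"
fi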
Because of this, the script assumes that the boot dataset is zp0/zd1 (the root dataset) instead of zp0/zd0:
root@test ~ $ grub-mkconfig
Sourcing file `/etc/default/grub'
Generating grub configuration file ...
[…]
### BEGIN /etc/grub.d/10_linux_zfs ###
Warning: didn't find any valid initrd or kernel.
### END /etc/grub.d/10_linux_zfs ###
[…]

Revision history for this message
Louis Sautier (lesbraz) wrote (last edit):

The following patch fixes this by excluding the efi directory from the "ls" call. It applies on top of https://git.launchpad.net/~ubuntu-core-dev/grub/+git/ubuntu/tag/?h=debian/2.12-5ubuntu10
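The idea is roughly as follows (an illustrative sketch of the approach, not the literal attachment; note that --ignore is GNU ls):
# Skip a stray, empty efi/ mountpoint directory when checking whether the
# dataset's boot/ directory actually contains anything.
[ -n "$(ls --ignore=efi "${mntdir}/boot" 2>/dev/null)" ]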

I'm not sure this is the best solution; maybe upstream's 10_linux should be used instead? Are there still good reasons to maintain a separate file specifically for Ubuntu? Debian, for instance, seems to handle ZFS properly with 10_linux.

EDIT: please disregard this and read my next comment.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "0001-Fix-grub-mkconfig-with-separate-ZFS-dataset-for-boot.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Revision history for this message
Louis Sautier (lesbraz) wrote :

Actually, the root cause was the fact that /boot/efi was mounted before /boot: "ls /boot/efi" returned nothing.

This can be solved by properly activating zfs-mount-generator, which creates one unit per mountpoint, as mentioned in bug #1904764.

To do this, I ran the following script (there is also an example in the manpage that relies on ZED: https://openzfs.github.io/openzfs-docs/man/master/8/zfs-mount-generator.8.html#EXAMPLES).
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec\
,readonly,setuid,nbmand,encroot,keylocation\
,org.openzfs.systemd:requires,org.openzfs.systemd:requires-mounts-for\
,org.openzfs.systemd:before,org.openzfs.systemd:after\
,org.openzfs.systemd:wanted-by,org.openzfs.systemd:required-by\
,org.openzfs.systemd:nofail,org.openzfs.systemd:ignore"
mkdir /etc/zfs/zfs-list.cache/
zpool list -H -o name | while IFS= read -r pool; do
   zfs list -H -t filesystem -o "$PROPS" -r "$pool" > "/etc/zfs/zfs-list.cache/$pool"
done
At the next boot, there was a separate boot.mount unit, which was activated before boot-efi.mount, so the extraneous "boot/efi" directory inside the root dataset (zp0/zd1) was no longer present.
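One way to confirm the new ordering after reboot (a sketch relying on standard systemd tooling; for nested mount points systemd adds the ordering dependency automatically):
# boot.mount now exists, and boot-efi.mount is implicitly ordered after it,
# so /boot is mounted before /boot/efi.
systemctl list-units --all 'boot.mount' 'boot-efi.mount'
systemctl show -p After boot-efi.mount    # the list should include boot.mount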

tags: removed: patch