(focal update-grub) 10_linux_zfs: causes lots of warnings (invalid version number, bpool does not exist)

Bug #1898177 reported by Walter
26
This bug affects 5 people
Affects Status Importance Assigned to Milestone
grub2 (Ubuntu)
Fix Released
Medium
Jean-Baptiste Lallement
Focal
Triaged
Medium
Unassigned
Groovy
Won't Fix
Medium
Unassigned
grubzfs-testsuite (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Confirmed
Undecided
Unassigned
Groovy
Won't Fix
Undecided
Unassigned

Bug Description

On our upgrade Focal system (which had ZFS-on-root _before_ this was an installer option in Ubuntu) we're seeing these non-fatal warnings when running update-grub:

```
# update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
cannot open 'bpool/BOOT/ubuntu': dataset does not exist
Found linux image: vmlinuz-5.4.0-48-generic in rpool/ROOT/ubuntu
Found initrd image: initrd.img-5.4.0-48-generic in rpool/ROOT/ubuntu
Found linux image: vmlinuz-4.15.0-118-generic in rpool/ROOT/ubuntu
Found initrd image: initrd.img-4.15.0-118-generic in rpool/ROOT/ubuntu

(start loop)

cannot open 'bpool/BOOT/ubuntu@2019-12-06-zabbix-3.4': dataset does not exist
dpkg: warning: version '12-06-zabbix-3.4/boot/vmlinuz-4.4.0-170-generic' has bad syntax: invalid character in version number
dpkg: warning: version '12-06-zabbix-3.4/boot/vmlinuz-4.4.0-151-generic' has bad syntax: invalid character in version number
Found linux image: vmlinuz-4.4.0-170-generic in rpool/ROOT/ubuntu@2019-12-06-zabbix-3.4
Found initrd image: initrd.img-4.4.0-170-generic in rpool/ROOT/ubuntu@2019-12-06-zabbix-3.4
Found linux image: vmlinuz-4.4.0-151-generic in rpool/ROOT/ubuntu@2019-12-06-zabbix-3.4
Found initrd image: initrd.img-4.4.0-151-generic in rpool/ROOT/ubuntu@2019-12-06-zabbix-3.4

(repeat loop for a number of times)

Adding boot menu entry for UEFI Firmware Settings
done
```

There are two problems here:

- checking of files on 'bpool', even though I only have an 'rpool' zpool; having that error appear for all snapshots is awkward

- the invalid character in version number (caused by the version_find_latest not coping with full path names)

First error:
```
$ update-grub 2>&1 | grep 'does not'
cannot open 'bpool/BOOT/ubuntu': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2019-12-06-zabbix-3.4': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2019-12-06-zabbix-4.4': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@save-2020-02-14-a': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2020-09-30-pre-focal': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0500Z': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0600Z': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0700Z': dataset does not exist
cannot open 'bpool/BOOT/ubuntu': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2019-12-06-zabbix-3.4': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2019-12-06-zabbix-4.4': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@save-2020-02-14-a': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@2020-09-30-pre-focal': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0500Z': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0600Z': dataset does not exist
cannot open 'bpool/BOOT/ubuntu@planb-20201002T0700Z': dataset does not exist
```

This is really confusing if you haven't set up a bpool zpool at all.

Simple fix:
```
--- a/grub.d/10_linux_zfs
+++ b/grub.d/10_linux_zfs
@@ -369,6 +369,7 @@ is_secure_boot_enabled() {
 get_dataset_info() {
     local dataset="$1"
     local mntdir="$2"
+ local has_bpool="$3"

     local base_dataset="${dataset}"
     local etc_dir="${mntdir}/etc"
@@ -405,7 +406,10 @@ get_dataset_info() {
     mountpoint -q "${mntdir}/etc" && umount "${mntdir}/etc" || true

     # read available kernels from /boot
- boot_dir="$(try_default_layout_bpool "${dataset}" "${mntdir}")"
+ local boot_dir=
+ if [ $has_bpool -eq 1 ]; then
+ boot_dir="$(try_default_layout_bpool "${dataset}" "${mntdir}")"
+ fi
     if [ -z "${boot_dir}" ]; then
         boot_dir=$(get_system_directory "${dataset}" "boot" "false" "${mntdir}" "${etc_dir}")
     fi
@@ -534,14 +538,15 @@ get_dataset_info() {
 bootlist() {
     local mntdir="$1"
     local boot_list=""
+ local has_bpool=$(zpool list -Honame bpool >/dev/null 2>&1 && echo 1 || echo 0)

     for dataset in $(get_root_datasets); do
         # get information from current root dataset
- boot_list="${boot_list}$(get_dataset_info ${dataset} ${mntdir})\n"
+ boot_list="${boot_list}$(get_dataset_info ${dataset} ${mntdir} ${has_bpool})\n"

         # get information from snapshots of this root dataset
         for snapshot_dataset in $(zfs list -H -o name -t snapshot "${dataset}"); do
- boot_list="${boot_list}$(get_dataset_info ${snapshot_dataset} ${mntdir})\n"
+ boot_list="${boot_list}$(get_dataset_info ${snapshot_dataset} ${mntdir} ${has_bpool})\n"
         done
     done
     echo "${boot_list}"
```
Now it will skip trying the bpool if no such pool exists.

The second error is:
```
# update-grub 2>&1 | grep 'invalid character'
dpkg: warning: version '12-06-zabbix-3.4/boot/vmlinuz-4.4.0-170-generic' has bad syntax: invalid character in version number
dpkg: warning: version '12-06-zabbix-3.4/boot/vmlinuz-4.4.0-151-generic' has bad syntax: invalid character in version number
dpkg: warning: version '12-06-zabbix-4.4/boot/vmlinuz-4.4.0-170-generic' has bad syntax: invalid character in version number
...
dpkg: warning: version '20201002T0600Z/boot/vmlinuz-4.15.0-118-generic' has bad syntax: invalid character in version number
dpkg: warning: version '20201002T0700Z/boot/vmlinuz-5.4.0-48-generic' has bad syntax: invalid character in version number
dpkg: warning: version '20201002T0700Z/boot/vmlinuz-4.15.0-118-generic' has bad syntax: invalid character in version number
```

They are caused by the dash (hyphen) in the directory name, and passing the full path to the version_find_latest function.

See this:
```
# . /usr/share/grub/grub-mkconfig_lib

# version_find_latest app1-1.9.3.4 app1-1foo-bar-baz app1-1.10
app1-1.10

# version_find_latest /path/to/app1-1.9.3.4 /other/path/app1-1foo-bar-baz /somewhere/app1-1.10
/somewhere/app1-1.10

# version_find_latest /path/to/.zfs/snapshot-X/app1-1.9.3.4 /somewhere/app1-1.10
dpkg: warning: version 'X/app1-1.9.3.4' has bad syntax: version number does not start with digit
/path/to/.zfs/snapshot-X/app1-1.9.3.4
```

We could fix the grub-mkconfig_lib version_find_latest, or simply pass the basename of the files to that function, effectively turning that last call into:
```
# version_find_latest app1-1.9.3.4 app1-1.10
app1-1.10
```

Changes required:
```
--- a/grub.d/10_linux_zfs
+++ b/grub.d/10_linux_zfs
@@ -422,7 +422,9 @@ get_dataset_info() {
     kernel_list=""
     list=$(find "${boot_dir}" -maxdepth 1 -type f -regex '.*/\(vmlinuz\|vmlinux\|kernel\)-.*')
     while [ "x$list" != "x" ] ; do
- linux=`version_find_latest $list`
+ list_basename=$(echo $list | tr ' ' '\n' | sed -e 's#.*/##')
+ linux=`version_find_latest $list_basename`
+ linux=$(for l in $list; do test "${l##*/}" = "${linux}" && echo "$l" && break; done)
         list=`echo $list | tr ' ' '\n' | fgrep -vx "$linux" | tr '\n' ' '`
         if ! grub_file_is_not_garbage "${linux}" ; then
             continue
```

With both patches in place, the output of `update-grub` is not 113 lines, but 37. And all warnings are gone.

Patch attached!

Cheers,
Walter Doekes
OSSO B.V.

Tags: patch
Revision history for this message
Walter (wdoekes) wrote :
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "grub2-fix-10_linux_zfs-warnings.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Changed in grub2 (Ubuntu):
importance: Undecided → Low
status: New → Triaged
assignee: nobody → Jean-Baptiste Lallement (jibel)
importance: Low → Medium
status: Triaged → In Progress
Changed in grub2 (Ubuntu Focal):
status: New → Triaged
Changed in grub2 (Ubuntu Groovy):
status: New → Triaged
Changed in grub2 (Ubuntu Focal):
importance: Undecided → Medium
Changed in grub2 (Ubuntu Groovy):
importance: Undecided → Medium
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grubzfs-testsuite (Ubuntu Focal):
status: New → Confirmed
Changed in grubzfs-testsuite (Ubuntu Groovy):
status: New → Confirmed
Changed in grubzfs-testsuite (Ubuntu):
status: New → Confirmed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grubzfs-testsuite - 0.4.14

---------------
grubzfs-testsuite (0.4.14) hirsute; urgency=medium

  * Revert "Add testcases to cover names with spaces and dashes", as
    matching grub upload is not in hirsute.

grubzfs-testsuite (0.4.13) hirsute; urgency=medium

  * Switch to upstream branch of go-libzfs with 2.0.x support.

grubzfs-testsuite (0.4.12) hirsute; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Added test cases to cover snapshot names with spaces or dashes
    (LP: #1898177, #1903524)

 -- Dimitri John Ledkov <email address hidden> Fri, 05 Feb 2021 19:18:25 +0000

Changed in grubzfs-testsuite (Ubuntu):
status: Confirmed → Fix Released
Changed in grubzfs-testsuite (Ubuntu):
status: Fix Released → Triaged
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.04-1ubuntu40

---------------
grub2 (2.04-1ubuntu40) hirsute; urgency=medium

  * Revert: rhboot-f34-tcp-add-window-scaling-support.patch,
    rhboot-f34-support-non-ethernet.patch,
    ubuntu-fixup-rhboot-f34-support-non-ethernet.patch,
    ubuntu-fixup-rhboot-f34-support-non-ethernet-2.patch: these break MAAS
    LXD KVM pod deployments. LP: #1915288

grub2 (2.04-1ubuntu39) hirsute; urgency=medium

  * Cherrypick a bunch of patches:
    - fix crash in http LP: #1915288
    - add bootp6 documentation
    - add support for UEFI boot protocols
    - use UEFI protocols for http & https networking
    - make netboot search for by-mac/by-uuid/by-ip for grub.cfg
    - update documentation for netboot search paths of grub.cfg
  * Make prebuilt netboot image look for MAAS grub.cfg
  * Fix grub-initrd-fallback.service thanks to JawnSmith LP: #1910815

grub2 (2.04-1ubuntu38) hirsute; urgency=medium

  [ Jean-Baptiste Lallement ]
  [ Didier Roche ]
  * Fix warnings during grub menu generation. Thanks wdoekes for the patch
    (LP: #1898177)
    - Fix warnings when bpool doesn't exist.
    - Fix warnings when snapshot name contains dashes.
  * Do not fail to generate grub menu when name of the snapshot contains
    spaces. (LP: #1903524)

 -- Dimitri John Ledkov <email address hidden> Fri, 12 Feb 2021 20:29:16 +0000

Changed in grub2 (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grubzfs-testsuite - 0.4.15build1

---------------
grubzfs-testsuite (0.4.15build1) hirsute; urgency=medium

  * No change rebuild with fixed ownership.

 -- Dimitri John Ledkov <email address hidden> Tue, 16 Feb 2021 15:15:19 +0000

Changed in grubzfs-testsuite (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote :

The Groovy Gorilla has reached end of life, so this bug will not be fixed for that release

Changed in grub2 (Ubuntu Groovy):
status: Triaged → Won't Fix
Changed in grubzfs-testsuite (Ubuntu Groovy):
status: Confirmed → Won't Fix
Revision history for this message
Alexander Yesipov (satter) wrote :

Will the fix be released for ubuntu focal?

Revision history for this message
Tim K. (tkubnt) wrote :

Can the fix be backported to focal please?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.