grub-probe zfs bug: failed to get canonical path

Bug #1687664 reported by Jens Elkner on 2017-05-02
This bug affects 2 people
Affects             Importance  Assigned to
grub2 (Ubuntu)      Undecided   Unassigned
zfs-linux (Ubuntu)  Undecided   Unassigned

Bug Description

grub-probe /

fails with grub-probe: error: failed to get canonical path of `/dev/HDD0p2'. The cause is that grub wrongly assumes that "zpool status $pool" lists each vdev in use with only the '/dev/' prefix stripped off. It probably expects names like /dev/sda, whose use is discouraged.

Instead, grub should use "zpool status -P $pool" to get the full device path. This would probably yield a symlink, e.g. /dev/disk/by-id/$bla. If this is not sufficient, grub should use realpath() to get the final blockdev entry, like /dev/sda1.
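The lookup proposed above can be sketched in shell. This is only an illustration of the idea, not what grub does: the function names zpool_vdev_paths and resolve_paths are made up here, and the sketch assumes that with -P every leaf vdev name starts with "/" while pool names and mirror/raidz groups do not. It also does not distinguish log or spare devices from data vdevs.

```shell
#!/bin/sh
# Hypothetical helpers, not part of grub or ZoL.

# Print the full vdev paths found in `zpool status -P` output read from stdin.
# Assumption: only leaf vdev lines have a first field starting with "/".
zpool_vdev_paths() {
    awk '$1 ~ /^\// { print $1 }'
}

# Resolve each path read from stdin to its final target (e.g. /dev/sda2).
# readlink -f follows the whole symlink chain, similar to realpath(3).
resolve_paths() {
    while IFS= read -r path; do
        readlink -f -- "$path"
    done
}

# Intended use on a live system (not run here):
#   zpool status -P rpool | zpool_vdev_paths | resolve_paths
```

With the setup described in this report, the pipeline would be expected to turn a symlink under /dev/chassis/SYS into the /dev/sdXN node it points to.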

The current setup I use is:

433 0 drwxr-xr-x 4 root root 80 Apr 28 20:22 /dev/chassis/SYS
434 0 drwxr-xr-x 2 root root 200 Apr 28 23:06 /dev/chassis/SYS/HDD0
435 0 lrwxrwxrwx 1 root root 12 Apr 28 23:16 /dev/chassis/SYS/HDD0/HDD0 -> ../../../sda
417 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/HDD0p1 -> ../../../sda1
442 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/HDD0p2 -> ../../../sda2
423 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/HDD0p9 -> ../../../sda9
436 0 lrwxrwxrwx 1 root root 12 Apr 28 23:16 /dev/chassis/SYS/HDD0/disk -> ../../../sda
418 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/p1 -> ../../../sda1
443 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/p2 -> ../../../sda2
424 0 lrwxrwxrwx 1 root root 13 Apr 28 23:16 /dev/chassis/SYS/HDD0/p9 -> ../../../sda9
437 0 drwxr-xr-x 2 root root 80 Apr 28 20:22 /dev/chassis/SYS/HDD1
438 0 lrwxrwxrwx 1 root root 12 Apr 28 23:16 /dev/chassis/SYS/HDD1/HDD1 -> ../../../sdb
439 0 lrwxrwxrwx 1 root root 12 Apr 28 23:16 /dev/chassis/SYS/HDD1/disk -> ../../../sdb

with 'zpool create ... rpool /dev/chassis/SYS/HDD0/HDD0p2'

PS: The full versions are grub2 2.02~beta2-36ubuntu3.9 and ZoL 0.6.5.6-0ubuntu15 (Ubuntu xenial 16.04).

To reproduce this, one may set up a VirtualBox VM with Storage == Controller SAS (Type: LsiLogic SAS, Port Count: 8) and, e.g., one VDI attached to it (/dev/sda), then netboot an install image. When it comes to the 'partitioning disks' dialog, start a shell and fetch/install ksh93 (i.e. ksh-udeb).
Then the script http://iks.cs.ovgu.de/~elkner/tmp/ubuntu/bayLinks.sh can be used to create the required dev symlinks (e.g. bayLinks.sh -mru). For convenience, http://iks.cs.ovgu.de/~elkner/tmp/ubuntu/ubuntu-part.sh can be used to create the script for [re-]partitioning the drive[s] and setting up the zfs related stuff (whereby the 'zpool create' and the subsequent steps will fail because of the bug). E.g. /tmp/ubuntu-part.sh /dev/chassis/SYS/HDD0/HDD0 .

Jens Elkner (jelmd) wrote :

FWIW: Found another case where grub2 fails:

May 3 15:37:28 in-target: grub-common is already the newest version (2.02~beta2-36ubuntu3.9).
May 3 15:37:28 in-target: 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
May 3 15:37:28 main-menu[422]: (process:4764): grub-probe: error: failed to get canonical path of `rpool/ROOT/linux'.
May 3 15:37:28 main-menu[422]: (process:4764): grub-probe: error: failed to get canonical path of `rpool/ROOT/linux'.
May 3 15:37:28 main-menu[422]: WARNING **: Configuring 'grub-installer' failed with error code 1
May 3 15:37:28 main-menu[422]: WARNING **: Menu item 'grub-installer' failed.
May 3 15:37:40 main-menu[422]: INFO: Falling back to the package description for brltty-udeb
^C

~ # zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 163G 760G 19K none
rpool/ROOT 2.55G 760G 19K none
rpool/ROOT/linux 2.55G 760G 1.61G /target
rpool/ROOT/linux/home 19K 760G 19K legacy
rpool/ROOT/linux/var 963M 760G 963M legacy
rpool/VARSHARE 19K 760G 19K /target/var/share
rpool/dump 64.0G 824G 8K -
rpool/swap 96.0G 856G 8K -

~ # zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   nvme0n1p2 ONLINE 0 0 0

~ # zpool status -P
  pool: rpool
 state: ONLINE
  scan: none requested
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   /dev/nvme0n1p2 ONLINE 0 0 0

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Changed in zfs-linux (Ubuntu):
status: New → Confirmed
no longer affects: zfs
no longer affects: grub
Jens Elkner (jelmd) wrote :

BTW: I just found that netinstall fails when it tries to install grub, apparently because /sys is not mounted at /target/sys:

# zpool status
The ZFS modules are not loaded.
Try running '/sbin/modprobe zfs' as root to load them.

After a mount --rbind /sys /target/sys :
# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   HDD0p2 ONLINE 0 0 0

errors: No known data errors
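The workaround above can be made repeatable with a small check before the bind mount. This is a minimal sketch: is_mounted is a hypothetical helper, and it assumes the mount table has the /proc/mounts layout with the mount point in the second field.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given directory appears as a mount
# point in a /proc/mounts-style table read from stdin, 1 otherwise.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }'
}

# Intended use as root in the installer shell (not run here):
#   is_mounted /target/sys < /proc/mounts || mount --rbind /sys /target/sys
```

The check keeps the bind mount from being stacked a second time if the commands are run more than once.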

Jens Elkner (jelmd) wrote :

Bug or not? At the very least I have mixed feelings when I see that the log device gets passed to grub-probe! E.g.:

+ zpool status rpool
  pool: rpool
 state: ONLINE
  scan: resilvered 547M in 0h0m with 0 errors on Thu May 18 03:49:08 2017
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   mirror-0 ONLINE 0 0 0
     HDD0p2 ONLINE 0 0 0
     HDD12p2 ONLINE 0 0 0
 logs
   HDD2p3 ONLINE 0 0 0
 spares
   HDD1p2 AVAIL

errors: No known data errors

+ zpool status -P rpool
  pool: rpool
 state: ONLINE
  scan: resilvered 547M in 0h0m with 0 errors on Thu May 18 03:49:08 2017
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   mirror-0 ONLINE 0 0 0
     /dev/chassis/SYS/HDD0p2 ONLINE 0 0 0
     /dev/chassis/SYS/HDD12p2 ONLINE 0 0 0
 logs
   /dev/chassis/SYS/HDD2p3 ONLINE 0 0 0
 spares
   /dev/chassis/SYS/HDD1p2 AVAIL

errors: No known data errors

and then on update-grub I see:

...
prepare_grub_to_access_device /dev/sda2 /dev/sdf2 /dev/sdb3
+ old_ifs=

+ IFS=

+ /usr/sbin/grub-probe --device /dev/sda2 /dev/sdf2 /dev/sdb3 --target=partmap
/usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/sda2. Check your device.map.
...
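If only the data vdevs should reach grub-probe, the "logs" and "spares" sections could be filtered out of the zpool status -P output first. The following is a sketch under assumptions, not grub's actual logic: zpool_data_vdevs is a made-up name, section headers are assumed to be single words on their own line (logs, spares, cache), and leaf vdevs are assumed to start with "/".

```shell
#!/bin/sh
# Hypothetical helper: print only the data vdevs from `zpool status -P`
# output read from stdin, skipping the logs, spares and cache sections.
zpool_data_vdevs() {
    awk '
        # A section header is a single word on its own line.
        NF == 1 && $1 ~ /^(logs|spares|cache)$/ { aux = 1; next }
        # The trailing "errors:" line ends the config listing.
        /^errors:/                              { aux = 0 }
        # Outside auxiliary sections, leaf vdevs start with "/".
        !aux && $1 ~ /^\//                      { print $1 }
    '
}

# Intended use on a live system (not run here):
#   zpool status -P rpool | zpool_data_vdevs
```

For the pool shown above this would leave the two mirror members and drop the log and spare devices. Whether grub-probe should really ignore those devices is the open question of this comment.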

Michael Schnaitter (stereolame) wrote :

There is definitely a bug here: grub-probe still looks for /dev/[device-id] instead of /dev/disk/by-id/[device-id], which breaks everything on every kernel update unless you create those symlinks. This was supposedly fixed, but you and I are both still having the issue.

Jens Elkner (jelmd) wrote :

I wonder whether there is an official repository where one may look into the source of the shipped version of the package. I found https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-36ubuntu3.9, but even the grub2_2.02~beta2.orig.tar.xz differs from what is tagged as grub-2.02-beta2 in the official repo (https://git.savannah.gnu.org/git/grub.git) ... grrrr ...

Richard Laager (rlaager) wrote :

This is a duplicate of 1527727. It has been fixed in 16.10 and newer, but has not been backported to 16.04.

Jens Elkner (jelmd) wrote :

Is there any ETA for when the backport will be available for the LTS release?
