failed boot mount, and systemd ordering cycle on zfs-import.target

Bug #1904764 reported by Christopher Hill
24
This bug affects 5 people
Affects Status Importance Assigned to Milestone
zfs-linux (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

On a new Ubuntu 20.10 ZoL root I frequently get boot failures, and I see these errors:

[ 11.353930] systemd[1]: boot.mount: Mount process exited, code=exited, status=1/FAILURE
[ 11.353934] systemd[1]: boot.mount: Failed with result 'exit-code'.
[ 11.354267] systemd[1]: Failed to mount /boot.
[ 11.354307] systemd[1]: Dependency failed for /boot/efi.
[ 11.354327] systemd[1]: Dependency failed for /boot/grub.
[ 11.354344] systemd[1]: Dependency failed for Local File Systems.
[ 11.354365] systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
[ 11.354367] systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
[ 11.355088] systemd[1]: var-lib.mount: Found ordering cycle on zfs-import.target/start
[ 11.355090] systemd[1]: var-lib.mount: Found dependency on zfs-import-cache.service/start
[ 11.355091] systemd[1]: var-lib.mount: Found dependency on zfs-load-module.service/start
[ 11.355093] systemd[1]: var-lib.mount: Found dependency on cryptsetup.target/start
[ 11.355094] systemd[1]: var-lib.mount: Found dependency on <email address hidden>/start
[ 11.355096] systemd[1]: var-lib.mount: Found dependency on systemd-random-seed.service/start
[ 11.355097] systemd[1]: var-lib.mount: Found dependency on var-lib.mount/start
[ 11.355099] systemd[1]: var-lib.mount: Job zfs-import.target/start deleted to break ordering cycle starting with var-lib.mount/start
[ 11.355819] systemd[1]: boot-grub.mount: Job boot-grub.mount/start failed with result 'dependency'.
[ 11.355823] systemd[1]: boot-efi.mount: Job boot-efi.mount/start failed with result 'dependency'.

Journalctl shows "filesystem 'bpool/BOOT/ubuntu_o9gk00' cannot be mounted, unable to open the dataset" for boot.mount

I can manually start boot.mount from the console, which then lets the boot process complete.

N.B. "zfs list" is

NAME USED AVAIL REFER MOUNTPOINT
bpool 402M 1.36G 96K /boot
bpool/BOOT 401M 1.36G 96K none
bpool/BOOT/ubuntu_o9gk00 401M 1.36G 201M /boot
rpool 171G 720G 192K /
rpool/ROOT 9.26G 720G 192K none
rpool/ROOT/ubuntu_ 1.57G 720G 5.16G /
rpool/ROOT/ubuntu_o9gk00 7.69G 720G 4.24G /
rpool/ROOT/ubuntu_o9gk00/srv 192K 720G 192K /srv
rpool/ROOT/ubuntu_o9gk00/usr 1.02M 720G 192K /usr
rpool/ROOT/ubuntu_o9gk00/usr/local 852K 720G 388K /usr/local
rpool/ROOT/ubuntu_o9gk00/var 2.09G 720G 192K /var
rpool/ROOT/ubuntu_o9gk00/var/games 192K 720G 192K /var/games
rpool/ROOT/ubuntu_o9gk00/var/lib 1.85G 720G 1.42G /var/lib
rpool/ROOT/ubuntu_o9gk00/var/lib/AccountsService 1.39M 720G 496K /var/lib/AccountsService
rpool/ROOT/ubuntu_o9gk00/var/lib/NetworkManager 3.39M 720G 292K /var/lib/NetworkManager
rpool/ROOT/ubuntu_o9gk00/var/lib/apt 87.2M 720G 78.0M /var/lib/apt
rpool/ROOT/ubuntu_o9gk00/var/lib/dpkg 148M 720G 63.6M /var/lib/dpkg
rpool/ROOT/ubuntu_o9gk00/var/log 223M 720G 94.2M /var/log
rpool/ROOT/ubuntu_o9gk00/var/mail 1.19M 720G 224K /var/mail
rpool/ROOT/ubuntu_o9gk00/var/snap 940K 720G 500K /var/snap
rpool/ROOT/ubuntu_o9gk00/var/spool 26.7M 720G 5.52M /var/spool
rpool/ROOT/ubuntu_o9gk00/var/www 192K 720G 192K /var/www
rpool/USERDATA 162G 720G 192K /
rpool/USERDATA/chris_hwz0bd 162G 720G 156G /home/chris
rpool/USERDATA/root_hwz0bd 2.91M 720G 804K /root
rpool/keystore 518M 720G 48.1M -

description: updated
Revision history for this message
Christopher Hill (ch6574) wrote :

BTW, I ran the following and that seems to have solved the missing bpool.

zpool set cachefile=/etc/zfs/zpool.cache bpool
zpool set cachefile=/etc/zfs/zpool.cache rpool
update-initramfs -u -k all

I have a hunch the zpool.cache file is getting out of sync with the initramfs one..?

Revision history for this message
Christopher Hill (ch6574) wrote :
Download full text (4.2 KiB)

Hmm, issue still happens, and I'm now suspicious of the dependencies in systemd.

$ systemctl list-dependencies boot.mount
boot.mount
● ├─-.mount
● ├─system.slice
● └─zfs-import.target
● └─zfs-import-cache.service

I see "boot.mount" depends on "zfs-import-cache.service", however the timestamps show otherwise

$ journalctl -f -u boot.mount
-- Logs begin at Sat 2020-11-14 20:41:30 EST. --
Nov 19 22:20:45 test systemd[1]: Mounting /boot...
Nov 19 22:20:45 test systemd[1]: Mounted /boot.
-- Reboot --
Nov 19 22:21:34 test systemd[1]: Mounting /boot...
Nov 19 22:21:35 test systemd[1]: Mounted /boot.
-- Reboot --
Nov 20 16:19:07 test mount[1265]: filesystem 'bpool/BOOT/ubuntu_o9gk00' cannot be mounted, unable to open the dataset
Nov 20 16:19:07 test systemd[1]: boot.mount: Mount process exited, code=exited, status=1/FAILURE
Nov 20 16:19:07 test systemd[1]: boot.mount: Failed with result 'exit-code'.
Nov 20 16:19:07 test systemd[1]: Failed to mount /boot.
Nov 20 16:19:52 test systemd[1]: Mounting /boot...
Nov 20 16:19:52 test systemd[1]: Mounted /boot.

$ systemctl status zfs-import-cache.service
● zfs-import-cache.service - Import ZFS pools by cache file
     Loaded: loaded (/lib/systemd/system/zfs-import-cache.service; enabled; vendor preset: enabled)
     Active: active (exited) since Fri 2020-11-20 16:19:08 EST; 16min ago
       Docs: man:zpool(8)
   Main PID: 2084 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 18936)
     Memory: 0B
     CGroup: /system.slice/zfs-import-cache.service

Nov 20 16:19:08 test systemd[1]: Starting Import ZFS pools by cache file...
Nov 20 16:19:08 test systemd[1]: Finished Import ZFS pools by cache file.

So mount ran at 16:19:07 (and failed) yet zfs-import-cache didn't complete until 16:19:08. That's rather odd...

$ journalctl -f -u zfs-import-cache.service
-- Logs begin at Sat 2020-11-14 20:41:30 EST. --
Nov 19 22:21:34 test systemd[1]: Finished Import ZFS pools by cache file.
-- Reboot --
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found ordering cycle on zfs-import.target/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found dependency on zfs-import-cache.service/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found dependency on cryptsetup.target/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found dependency on <email address hidden>/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found dependency on systemd-random-seed.service/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Found dependency on var-lib.mount/start
Nov 20 16:19:07 test systemd[1]: var-lib.mount: Job zfs-import.target/start deleted to break ordering cycle starting with var-lib.mount/start
Nov 20 16:19:08 test systemd[1]: Starting Import ZFS pools by cache file...
Nov 20 16:19:08 test systemd[1]: Finished Import ZFS pools by cache file.

Now I see zfs-import-cache was "deleted" by systemd at 16:19:07 to break an ordering cycle, which I think explains the above - but this seems wrong?

If I then look at the prior zfs-import-cache log, I can see <email address hidden> was "deleted", thus enabling /boot to mount correctly:

Nov 19 22:21:34 test systemd[1]: cryptsetup.target:...

Read more...

Revision history for this message
Christopher Hill (ch6574) wrote :

So, long story short, I believe I have run into this issue https://github.com/zfsonlinux/pkg-zfs/issues/205

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zfs-linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Christopher Hill (ch6574) wrote :

Adding this bug as that seems to be the exact issue here - https://github.com/openzfs/zfs/issues/10140

Solution is "Use zfs-mount-generator instead of zfs-mount.service" ...

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.