zfs-initramfs wont mount rpool

Bug #1838278 reported by Ryan Harper on 2019-07-29
This bug affects 1 person
Affects             Status        Importance  Assigned to  Milestone
curtin (Ubuntu)     Fix Released  Medium      Unassigned
zfs-linux (Ubuntu)  Won't Fix     Undecided   Unassigned

Bug Description

1. Eoan

2. http://archive.ubuntu.com/ubuntu eoan/main amd64 zfs-initramfs amd64 0.8.1-1ubuntu7 [23.1 kB]

3. ZFS rootfs rpool is mounted at boot

4. Booting an image with a rootfs rpool:

[ 0.000000] Linux version 5.2.0-8-generic (buildd@lgw01-amd64-015) (gcc version 9.1.0 (Ubuntu 9.1.0-6ubuntu2)) #9-Ubuntu SMP Mon Jul 8 13:07:27 UTC 2019 (Ubuntu 5.2.0-8.9-generic 5.2.0)
[ 0.000000] Command line: BOOT_IMAGE=/ROOT/zfsroot@/boot/vmlinuz-5.2.0-8-generic root=ZFS=rpool/ROOT/zfsroot ro console=ttyS0

Command: /sbin/zpool import -N 'rpool'
Message: cannot import 'rpool': pool was previously in use from another system.
Last accessed by ubuntu (hostid=d24775ba) at Mon Jul 29 05:21:19 2019
The pool can be imported, use 'zpool import -f' to import the pool.
Error: 1

Failed to import pool 'rpool'.
Manually import the pool and exit.

Note that this works fine under Disco:

http://archive.ubuntu.com/ubuntu disco/main amd64 zfs-initramfs amd64 0.7.12-1ubuntu5 [22.2 kB]

[ 4.773077] spl: loading out-of-tree module taints kernel.
[ 4.777256] SPL: Loaded module v0.7.12-1ubuntu3
[ 4.779433] znvpair: module license 'CDDL' taints kernel.
[ 4.780333] Disabling lock debugging due to kernel taint
[ 5.713830] ZFS: Loaded module v0.7.12-1ubuntu5, ZFS pool version 5000, ZFS filesystem version 5
Begin: Sleeping for ... done.
Begin: Importing ZFS root pool 'rpool' ... Begin: Importing pool 'rpool' using defaults ... done.
Begin: Mounting 'rpool/ROOT/zfsroot' on '/root//' ... done.

tags: added: id-5d79bf699cc0451cfad6b4a9
Didier Roche (didrocks) wrote :

The issue seems to be related to a change in ZFS 0.8 initramfs script.

The initramfs script for ZFS does a normal ZFS import.
ZFS now requires a pool to be exported before it can be imported again on a different system. This is a security feature to ensure that the same pool isn't imported on two different systems at the same time.

I guess what happens is that your installation process doesn't export the pool before the reboot. You then reboot into the newly installed system (which has a different host ID), and so zpool import fails in the initramfs.

We were wondering whether we should use import -f in the initramfs to force the import; that's a question for Colin K., I think.
At the very least, we should make sure you find out what failed to export the pool properly.

Ryan Harper (raharper) wrote :

Curtin has never run zpool export on the pools, so either something else did this previously, or it wasn't a requirement.

I can check whether adding a zpool export on the pool works around the issue.

Ryan Harper (raharper) wrote :

A quick hack shows that exporting the pool after unmount works around the issue. I'd still like to understand whether we need to (or should) use import -f; in any case, curtin can now ensure that it exports the pools it has created at the end of the install.
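
As a rough illustration of the "export after unmount" step, the sketch below exports each named pool once the target has been unmounted. It is not curtin's actual implementation: the function name export_pools and the DRY_RUN switch are hypothetical, introduced only for this example. The only real command it relies on is zpool export.

```shell
#!/bin/sh
# Hypothetical helper (not curtin code): export every pool named on the
# command line so that the next system to boot can import it without -f.
# With DRY_RUN=1 the commands are printed instead of executed.
export_pools() {
    for pool in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "zpool export $pool"
        else
            # Stop at the first failure so the caller notices that a pool
            # is still marked as in use.
            zpool export "$pool" || return 1
        fi
    done
}
```

For example, DRY_RUN=1 export_pools rpool prints the command that would run, while export_pools rpool at the end of an install (after the datasets are unmounted) performs the export.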

Changed in curtin (Ubuntu):
importance: Undecided → Medium
status: New → In Progress
Didier Roche (didrocks) wrote :

Marking the zfs-linux task as Won't Fix after looking more deeply into the causes and consequences of forcing -f on every boot:
- As mentioned previously, ZFS 0.8 tags each pool with the system it was last associated with and refuses to import a pool that was not exported, since such a pool could still be attached to another (possibly running) system.
- There is a kernel option, zfs.force (the separator may also be _ or omitted: zfs_force, zfsforce), which can be set to on/yes/1 to force the import in the initramfs.

Upstream sees this option as a way to force the import on broken systems, where the pool was imported but not exported before reboot.
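
To make the kernel option concrete, here is a minimal sketch of how a boot script could detect it on the kernel command line. It is not the actual initramfs script: the function name zfs_force_requested is hypothetical, and the accepted spellings (zfs.force, zfs_force, zfsforce) and values (on, yes, 1) are taken from the description above rather than checked against the script itself.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the kernel command line passed as $1
# requests a forced pool import, 1 otherwise.
zfs_force_requested() {
    # Intentional word splitting: kernel arguments are space-separated.
    for arg in $1; do
        case "$arg" in
            zfs.force=*|zfs_force=*|zfsforce=*)
                # Strip everything up to the first '=' and check the value.
                case "${arg#*=}" in
                    on|yes|1) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}
```

A caller might then write something like zfs_force_requested "$(cat /proc/cmdline)" && zpool import -f -N rpool, falling back to the plain zpool import -N rpool otherwise.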

Note that this broken case only affects the following two scenarios:
- You install a new system (so the installer system's ID != the final system's ID) and then reboot into the newly installed system. This is the curtin (and ubiquity) case. I think it's fine to require them to properly export the pools before rebooting (which also causes a sync).
- You have two systems installed in parallel on the same pool, and the export failed at shutdown while switching between them. How frequent this is remains to be seen, and having ZFS marked as experimental for this cycle sounds like a good way to gather that data.

Marking the zfs task as won't fix for now.

Changed in zfs-linux (Ubuntu):
status: New → Won't Fix

This bug is fixed with commit 489c0145 to curtin on branch master.
To view that commit see the following URL:
https://git.launchpad.net/curtin/commit/?id=489c0145

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package curtin - 19.2-44-g8e618b34-0ubuntu1

---------------
curtin (19.2-44-g8e618b34-0ubuntu1) focal; urgency=medium

  * New upstream snapshot.
    - t/jenkins-runner: replace $EPOCHSECONDS with 'date +%s' [Paride Legovini]
    - curthooks: skip setup_kernel_img_conf on eoan and newer (LP: #1847257)
    - block_meta: use lookup for wwn, fix fallback from wwn, serial, path
      (LP: #1849322)
    - vmtest: Adjust TestScsiBasic to use dnames to find correct disk
    - schema: Add ptable value 'unsupported' (LP: #1848535)
    - tools/xkvm: add -nographic to speed up devopt query
    - test_block_dasd: fix random_device_id to only generate valid IDs
      (LP: #1849549)
    - vmtest: update skip_if_arch message
    - Add skip_by_date to eoan ipv6 vlan test
    - storage_config: interpret value, not presence, of
      DM_MULTIPATH_DEVICE_PATH [Michael Hudson-Doyle]
    - vmtest: Add skip_by_date for test_ip_output on eoan + vlans
    - block-schema: update raid schema for preserve and metadata
    - dasd: update partition table value to 'vtoc' (LP: #1847073)
    - clear-holders: increase the level for devices with holders by one
      (LP: #1844543)
    - tests: mock timestamp used in collect-log file creation (LP: #1847138)
    - ChrootableTarget: mount /run to resolve lvm/mdadm issues which
      require it.
    - block-discover: handle multipath disks (LP: #1839915)
    - Handle partial raid on partitions (LP: #1835091)
    - install: export zpools if present in the storage-config (LP: #1838278)
    - block-schema: allow 'mac' as partition table type (LP: #1845611)
    - jenkins-runner: disable the lockfile timeout by default [Paride Legovini]
    - curthooks: use correct grub-efi package name on i386 (LP: #1845914)
    - vmtest-sync-images: remove unused imports [Paride Legovini]
    - vmtests: use file locking on the images [Paride Legovini]
    - vmtest: enable arm64 [Paride Legovini]
    - Make the vmtests/test_basic test suite run on ppc64el [Paride Legovini]
    - vmtests: separate arch and target_arch in tests [Paride Legovini]
    - vmtests: new decorator: skip_if_arch [Paride Legovini]
    - vmtests: increase the VM memory for Bionic
    - vmtests: Skip Eoan ZFS Root tests until bug fix is complete
    - Merge branch 'fix_merge_conflicts'
    - util: add support for 'tbz', 'txz' tar format types to sanitize_source
      (LP: #1843266)
    - net: ensure eni helper tools install if given netplan config
      (LP: #1834751)
    - d/control: update Depends for new probert package names
      [Dimitri John Ledkov]

 -- Ryan Harper <email address hidden> Fri, 01 Nov 2019 14:06:13 -0500

Changed in curtin (Ubuntu):
status: In Progress → Fix Released