zfs: importing zpool with vdev on zvol hangs kernel

Bug #1636517 reported by Fabian Grünbichler on 2016-10-25
This bug affects 1 person
Affects              Importance  Assigned to
linux (Ubuntu)       Medium      Colin Ian King
  Xenial             Undecided   Unassigned
  Yakkety            Undecided   Unassigned
  Zesty              Medium      Colin Ian King
zfs-linux (Ubuntu)   Medium      Colin Ian King
  Xenial             Undecided   Unassigned
  Yakkety            Undecided   Unassigned
  Zesty              Medium      Colin Ian King

Bug Description

[SRU Request][Xenial][Yakkety]

if a zvol of an existing, already imported zpool is a vdev of another zpool, a call to "zpool import" will hang everything zfs related. the stack trace is as follows:

[<ffffffffc038d374>] taskq_wait+0x74/0xe0 [spl]
[<ffffffffc038d42b>] taskq_destroy+0x4b/0x100 [spl]
[<ffffffffc04a4afd>] vdev_open_children+0x12d/0x180 [zfs]
[<ffffffffc04ae6cc>] vdev_root_open+0x3c/0xc0 [zfs]
[<ffffffffc04a45f5>] vdev_open+0xf5/0x4d0 [zfs]
[<ffffffffc048f11e>] spa_load+0x39e/0x1c60 [zfs]
[<ffffffffc049170d>] spa_tryimport+0xad/0x450 [zfs]
[<ffffffffc04c42d4>] zfs_ioc_pool_tryimport+0x64/0xa0 [zfs]
[<ffffffffc04c770b>] zfsdev_ioctl+0x44b/0x4e0 [zfs]
[<ffffffff8122124f>] do_vfs_ioctl+0x29f/0x490
[<ffffffff812214b9>] SyS_ioctl+0x79/0x90
[<ffffffff818318b2>] entry_SYSCALL_64_fastpath+0x16/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

[Fix]
zfsutils-linux:

Zesty: https://launchpadlibrarian.net/290907232/zfs-linux_0.6.5.8-0ubuntu4_0.6.5.8-0ubuntu5.diff.gz

Yakkety, likewise
Xenial, likewise

Sync'd fixes into kernel repos, patches in:
http://kernel.ubuntu.com/~cking/zfs-lp-1636517

[Regression Potential]

Minimal. This touches just one line in the zfs module, module/zfs/zvol.c, and a shim wrapper in include/linux/blkdev_compat.h

Tested and passes with the ubuntu kernel team autotest client zfs regression tests.

=================================================================

I traced this back to 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee (merged in 4.4.0-6.21), which added a second parameter to lookup_bdev without patching the zfs module (which needs to special case the vdev-on-zvol case, and uses this exact method only in this special casing code path).

attached you can find the output of running "zfs send -R" on such a zvol ("brokenvol.raw"). running "zfs receive POOL/TARGET < FILE" followed by "zpool import" should reproduce the hang.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-45-generic 4.4.0-45.66
ProcVersionSignature: Ubuntu 4.4.0-45.66-generic 4.4.21
Uname: Linux 4.4.0-45-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 25 15:46 seq
 crw-rw---- 1 root audio 116, 33 Oct 25 15:46 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Oct 25 15:49:51 2016
HibernationDevice: RESUME=/dev/mapper/xenial--vg-swap_1
InstallationDate: Installed on 2016-10-25 (0 days ago)
InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
PciMultimedia:

ProcFB: 0 qxldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-45-generic root=/dev/mapper/hostname--vg-root ro
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-45-generic N/A
 linux-backports-modules-4.4.0-45-generic N/A
 linux-firmware 1.157.4
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-2.7
dmi.modalias: dmi:bvnSeaBIOS:bvrrel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-2.7:cvnQEMU:ct1:cvrpc-i440fx-2.7:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-2.7
dmi.sys.vendor: QEMU

CVE References

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

attached patch with fix tested with 4.4.0-45.66

note that instead of hardcoding the patched variant of lookup_bdev, it might make sense to adapt the zfs automake files to autodetect and handle both one parameter and two parameter variants?

that way, all three variations of building the zfs module would work:

dkms source with either an Ubuntu or upstream kernel
Ubuntu kernel with zfs module source in module/zfs/
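
To make the autodetection idea concrete, here is a minimal userspace sketch of a compile-time shim, not the actual patch. Everything except the name lookup_bdev is an assumption made for illustration: the HAVE_2ARGS_LOOKUP_BDEV macro (which a configure-time compile test would define), the compat_lookup_bdev wrapper, the `int mask` parameter type, passing 0 for it, and the fake device table. The stand-in also returns NULL on failure to keep things short, which simplifies the kernel's error-pointer convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-ins so the sketch compiles outside a kernel tree.
 * In a real build, struct block_device and lookup_bdev() come from the
 * kernel headers. All other names here are hypothetical.
 */
struct block_device { const char *path; };

static struct block_device fake_zvol = { "/dev/zd0" };

#ifdef HAVE_2ARGS_LOOKUP_BDEV
/* Assumed two-argument variant: an extra permission mask parameter. */
static struct block_device *lookup_bdev(const char *path, int mask)
{
    (void)mask; /* unused in this stand-in */
    return strcmp(path, fake_zvol.path) == 0 ? &fake_zvol : NULL;
}
#define compat_lookup_bdev(path) lookup_bdev((path), 0)
#else
/* One-argument variant. */
static struct block_device *lookup_bdev(const char *path)
{
    return strcmp(path, fake_zvol.path) == 0 ? &fake_zvol : NULL;
}
#define compat_lookup_bdev(path) lookup_bdev(path)
#endif

/* Callers use only the shim, so they build against either variant. */
int path_is_known_bdev(const char *path)
{
    assert(path != NULL);
    return compat_lookup_bdev(path) != NULL;
}
```

Compiling the same file with and without -DHAVE_2ARGS_LOOKUP_BDEV exercises both branches. The calling code stays identical in either case, which is the point of keeping the difference inside one wrapper.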

Colin Ian King (colin-king) wrote :

Thanks Fabian for tracking this bug down and for the patch. I think the most flexible approach is to detect which of the two variants of the call is present and make ZFS build the correct way accordingly. I'll see what I can do on that.

The attachment "zfs-fix-zpool-import-bug-with-nested-pools.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Hajo Möller (dasjoe) wrote :

I proposed a patch making use of autoconf upstream at https://github.com/zfsonlinux/zfs/pull/5336

Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Changed in zfs-linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Colin Ian King (colin-king) wrote :

Thanks Hajo, I was working on the fix last night, with detection of the two different lookup_bdev calls, and have been regression testing it overnight for Zesty, Yakkety and Xenial. I've got my fix already prepared...

So the way this kind of fix works with Ubuntu is as follows:

1. Change goes into zfsutils-linux
2. Sync the delta into the kernel package

I've put the zfsutils updates in ppa:

https://launchpad.net/~colin-king/+archive/ubuntu/zfs-lp-1636517
sudo add-apt-repository ppa:colin-king/zfs-lp-1636517
sudo apt-get update && sudo apt-get upgrade

And the fix in some test kernels:

http://kernel.ubuntu.com/~cking/zfs-lp-163651

These have been tested against our internal zfs regression tests and pass.

@Fabian can you test these and if they are OK I'll SRU these. Thanks!

description: updated
Colin Ian King (colin-king) wrote :
description: updated

can confirm that the test packages correctly allow importing of such pools. thanks for the quick reaction!

minor nitpick since you referenced me in the changelog, please either spell my last name "Grünbichler" (with 'ü'), or transcribed with 'ue', and not with an 'i' - thanks! :)

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zfs-linux - 0.6.5.8-0ubuntu5

---------------
zfs-linux (0.6.5.8-0ubuntu5) zesty; urgency=low

  * Fix zpool import with vdev on zvol hang (LP: #1636517)
    kernel commit 77adfdaff1901ced4f72496e61317779424ee653
    ("block_dev: Support checking inode permissions in lookup_bdev()")
    added a flags argument to block_dev which caused this breakage. Add
    detection of 1 or 2 arg block_dev and add a zfs_block_dev shim to
    abstract these differences away. Kudos to Fabian Grünbichler for
    the original fix that this fix is based on.

 -- Colin Ian King <email address hidden> Tue, 25 Oct 2016 16:58:11 +0100

Changed in zfs-linux (Ubuntu Zesty):
status: New → Fix Released

Hello Fabian, or anyone else affected,

Accepted zfs-linux into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/zfs-linux/0.6.5.8-0ubuntu4.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in zfs-linux (Ubuntu Yakkety):
status: New → Fix Committed
Changed in zfs-linux (Ubuntu Xenial):
status: New → Fix Committed
Andy Whitcroft (apw) wrote :

Hello Fabian, or anyone else affected,

Accepted zfs-linux into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/zfs-linux/0.6.5.6-0ubuntu15 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Colin Ian King (colin-king) wrote :

Actually, hold off on testing until the kernel updates land with this fix.

Hajo Möller (dasjoe) wrote :

Colin, thank you for the fix, I will switch to xenial-proposed now.

As a followup, upstream merged my PR so 1014-kernel-lookup-bdev.patch may be removed once the next ZFS on Linux release gets synced to us, hopefully in time for zesty.

Luis Henriques (henrix) on 2016-11-08
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Luis Henriques (henrix) on 2016-11-08
Changed in linux (Ubuntu Yakkety):
status: New → Fix Committed
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety

issue does not occur anymore for xenial (Ubuntu-4.4.0-49.70)

tags: added: verification-done-xenial
removed: verification-needed-xenial
Colin Ian King (colin-king) wrote :

looks OK for yakkety in -proposed too.

tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zfs-linux - 0.6.5.6-0ubuntu15

---------------
zfs-linux (0.6.5.6-0ubuntu15) xenial; urgency=medium

  * Fix zpool import with vdev on zvol hang (LP: #1636517)
    Xenial kernel commit 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee
    ("block_dev: Support checking inode permissions in lookup_bdev()")
    added a flags argument to block_dev which caused this breakage. Add
    detection of 1 or 2 arg block_dev and add a zfs_block_dev shim to
    abstract these differences away. Kudos to Fabian Grünbichler for
    the original fix that this fix is based on.

 -- Colin Ian King <email address hidden> Tue, 25 Oct 2016 16:58:11 +0100

Changed in zfs-linux (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for zfs-linux has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zfs-linux - 0.6.5.8-0ubuntu4.1

---------------
zfs-linux (0.6.5.8-0ubuntu4.1) yakkety; urgency=low

  * Fix zpool import with vdev on zvol hang (LP: #1636517)
    kernel commit 77adfdaff1901ced4f72496e61317779424ee653
    ("block_dev: Support checking inode permissions in lookup_bdev()")
    added a flags argument to block_dev which caused this breakage. Add
    detection of 1 or 2 arg block_dev and add a zfs_block_dev shim to
    abstract these differences away. Kudos to Fabian Grünbichler for
    the original fix that this fix is based on.

 -- Colin Ian King <email address hidden> Tue, 25 Oct 2016 16:58:11 +0100

Changed in zfs-linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-51.72

---------------
linux (4.4.0-51.72) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1644611

  * 4.4.0-1037-snapdragon #41: kernel panic on boot (LP: #1644596)
    - Revert "dma-mapping: introduce the DMA_ATTR_NO_WARN attribute"
    - Revert "powerpc: implement the DMA_ATTR_NO_WARN attribute"
    - Revert "nvme: use the DMA_ATTR_NO_WARN attribute"

linux (4.4.0-50.71) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1644169

  * xenial 4.4.0-49.70 kernel breaks LXD userspace (LP: #1644165)
    - Revert "UBUNTU: SAUCE: (namespace) fuse: Allow user namespace mounts by
      default"
    - Revert "UBUNTU: SAUCE: (namespace) fs: Don't remove suid for CAP_FSETID for
      userns root"
    - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Don't remove suid for
      CAP_FSETID in s_user_ns""
    - Revert "UBUNTU: SAUCE: (namespace) fs: Allow superblock owner to change
      ownership of inodes"
    - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Allow superblock owner to
      change ownership of inodes with unmappable ids""
    - Revert "UBUNTU: SAUCE: (namespace) security/integrity: Harden against
      malformed xattrs"
    - Revert "(namespace) Revert "UBUNTU: SAUCE: ima/evm: Allow root in s_user_ns
      to set xattrs""
    - Revert "(namespace) dquot: For now explicitly don't support filesystems
      outside of init_user_ns"
    - Revert "(namespace) quota: Handle quota data stored in s_user_ns in
      quota_setxquota"
    - Revert "(namespace) quota: Ensure qids map to the filesystem"
    - Revert "(namespace) Revert "UBUNTU: SAUCE: quota: Convert ids relative to
      s_user_ns""
    - Revert "(namespace) Revert "UBUNTU: SAUCE: quota: Require that qids passed
      to dqget() be valid and map into s_user_ns""
    - Revert "(namespace) vfs: Don't create inodes with a uid or gid unknown to
      the vfs"
    - Revert "(namespace) vfs: Don't modify inodes with a uid or gid unknown to
      the vfs"
    - Revert "UBUNTU: SAUCE: (namespace) fuse: Translate ids in posix acl xattrs"
    - Revert "UBUNTU: SAUCE: (namespace) posix_acl: Export
      posix_acl_fix_xattr_userns() to modules"
    - Revert "(namespace) vfs: Verify acls are valid within superblock's
      s_user_ns."
    - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Update posix_acl support to
      handle user namespace mounts""
    - Revert "(namespace) fs: Refuse uid/gid changes which don't map into
      s_user_ns"
    - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Refuse uid/gid changes which
      don't map into s_user_ns""
    - Revert "(namespace) mnt: Move the FS_USERNS_MOUNT check into sget_userns"

linux (4.4.0-49.70) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1640921

  * Infiniband driver (kernel module) needed for Azure (LP: #1641139)
    - SAUCE: RDMA Infiniband for Windows Azure
    - [Config] CONFIG_HYPERV_INFINIBAND_ND=m
    - SAUCE: Makefile RDMA infiniband driver for Windows Azure
    - [Config] Add hv_network_direct.ko to generic inclusion list
    - SAUCE: RDMA Infiniband for Windows Azure is dependent on amd64...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-28.30

---------------
linux (4.8.0-28.30) yakkety; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1641083

  * lxc-attach to malicious container allows access to host (LP: #1639345)
    - Revert "UBUNTU: SAUCE: (noup) ptrace: being capable wrt a process requires
      mapped uids/gids"
    - (upstream) mm: Add a user_ns owner to mm_struct and fix ptrace permission
      checks

  * [Feature] AVX-512 new instruction sets (avx512_4vnniw, avx512_4fmaps)
    (LP: #1637526)
    - x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features

  * zfs: importing zpool with vdev on zvol hangs kernel (LP: #1636517)
    - SAUCE: (noup) Update zfs to 0.6.5.8-0ubuntu4.1

  * Move some device drivers build from kernel built-in to modules
    (LP: #1637303)
    - [Config] CONFIG_TIGON3=m for all arches
    - [Config] CONFIG_VIRTIO_BLK=m, CONFIG_VIRTIO_NET=m

  * I2C touchpad does not work on AMD platform (LP: #1612006)
    - pinctrl/amd: Configure GPIO register using BIOS settings

  * guest experiencing Transmit Timeouts on CX4 (LP: #1636330)
    - powerpc/64: Re-fix race condition between going idle and entering guest
    - powerpc/64: Fix race condition in setting lock bit in idle/wakeup code

  * QEMU throws failure msg while booting guest with SRIOV VF (LP: #1630554)
    - KVM: PPC: Always select KVM_VFIO, plus Makefile cleanup

  * [Feature] KBL - New device ID for Kabypoint(KbP) (LP: #1591618)
    - SAUCE: mfd: lpss: Fix Intel Kaby Lake PCH-H properties

  * hio: SSD data corruption under stress test (LP: #1638700)
    - SAUCE: hio: set bi_error field to signal an I/O error on a BIO
    - SAUCE: hio: splitting bio in the entry of .make_request_fn

  * cleanup primary tree for linux-hwe layering issues (LP: #1637473)
    - [Config] switch Vcs-Git: to yakkety repository
    - [Packaging] handle both linux-lts* and linux-hwe* as backports
    - [Config] linux-tools-common and linux-cloud-tools-common are one per series
    - [Config] linux-source-* is in the primary linux namespace
    - [Config] linux-tools -- always suggest the base package

  * SRU: sync zfsutils-linux and spl-linux changes to linux (LP: #1635656)
    - SAUCE: (noup) Update spl to 0.6.5.8-2, zfs to 0.6.5.8-0ubuntu4 (LP:
      #1635656)

  * [Feature] SKX: perf uncore PMU support (LP: #1591810)
    - perf/x86/intel/uncore: Add Skylake server uncore support
    - perf/x86/intel/uncore: Remove hard-coded implementation for Node ID mapping
      location
    - perf/x86/intel/uncore: Handle non-standard counter offset

  * [Feature] Purley: Memory Protection Keys (LP: #1591804)
    - x86/pkeys: Add fault handling for PF_PK page fault bit
    - mm: Implement new pkey_mprotect() system call
    - x86/pkeys: Make mprotect_key() mask off additional vm_flags
    - x86/pkeys: Allocation/free syscalls
    - x86: Wire up protection keys system calls
    - generic syscalls: Wire up memory protection keys syscalls
    - pkeys: Add details of system call use to Documentation/
    - x86/pkeys: Default to a restrictive init PKRU
    - x86/pkeys: Allow configuration of init_pkru
    - x86/pkeys: Add self-tests

  * kernel invalid ...

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-30.32

---------------
linux (4.8.0-30.32) yakkety; urgency=low

  * CVE-2016-8655 (LP: #1646318)
    - packet: fix race condition in packet_set_ring

 -- Brad Figg <email address hidden> Thu, 01 Dec 2016 08:02:53 -0800

Changed in linux (Ubuntu Zesty):
status: Triaged → Fix Released