Xenial: ZFS deadlock in shrinker path with xattrs

Bug #1839521 reported by Mauricio Faria de Oliveira
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
zfs-linux (Ubuntu)
Fix Released
Medium
Unassigned
Xenial
Fix Released
Medium
Mauricio Faria de Oliveira

Bug Description

[Impact]

 * Xenial's ZFS can deadlock in the memory shrinker path
   after removing files with extended attributes (xattr).

 * Extended attributes are enabled by default, but are
   _not_ used by default, which reduces the likelyhood.

 * It's very difficult/rare to reproduce this problem,
   due to file/xattr/remove/shrinker/lru order/timing
   circumstances required. (weeks for a reporter user)
   but a synthetic test-case has been found for tests.

[Test Case]

 * A synthetic reproducer is available for this LP,
   with a few steps to touch/setfattr/rm/drop_caches
   plus a kernel module to massage the disposal list.
   (comment #8)

 * In the original ZFS module:
   the xattr dir inode is not purged immediately on
   file removal, but possibly purged _two_ shrinker
   invocations later. This allows for other thread
   started before file remove to call zfs_zget() on
   the xattr child inode and iput() it, so it makes
   to the same disposal list as the xattr dir inode.
   (comment #3)

 * In the modified ZFS module:
   the xattr dir inode is purged immediately on file
   removal not possibly later on shrinker invocation,
   so the problem window above doesn't exist anymore.
   (comment #12)

[Regression Potential]

 * Low. The patches are confined to extended attributes
   in ZFS, specifically node removal/purge, and another
   change how an xattr child inode tracks its xattr dir
   (parent) inode, so that it can be purged immediately
   on removal.

 * The ZFS test-suite has been run on original/modified
   zfs-dkms package/kernel modules, with no regressions.
   (comment #11)

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Download full text (8.6 KiB)

[Original Description]

One LXC user reported lots of processes stuck in D state:
several threads waiting in the memory shrinker semaphore
(this symptom was thought to be/fixed via LP bug 1817628).

After some time, a provided crashdump revealed the issue
in ZFS's evict node path running in memory shrinker path
(thus locking the semaphore as observed previously/above).

The stack trace shows the inode memory shrinker entered
ZFS and is looping in zfs_zget().

PID: 42105 TASK: ffff881169f3d400 CPU: 36 COMMAND: "lxcfs"
 #0 [ffff88103ea88e38] crash_nmi_callback at ffffffff810518a7
 #1 [ffff88103ea88e48] nmi_handle at ffffffff810323ae
 #2 [ffff88103ea88ea0] default_do_nmi at ffffffff810328f4
 #3 [ffff88103ea88ec0] do_nmi at ffffffff81032aa2
 #4 [ffff88103ea88ee8] end_repeat_nmi at ffffffff8185a587
    [exception RIP: _raw_spin_lock+20]
    RIP: ffffffff81857464 RSP: ffff881a23bab138 RFLAGS: 00000246
    RAX: 0000000000000000 RBX: ffff8810a11afb78 RCX: ffff881e7ad76858
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8810a11afb78
    RBP: ffff881a23bab138 R8: 000000000001a6a0 R9: ffffffffc05e384a
    R10: ffffea0070071400 R11: ffff88014e96d340 R12: 0000000000000000
    R13: ffff8810a11afb50 R14: ffff88014e96d340 R15: ffff8810a11afaf8
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
 #5 [ffff881a23bab138] _raw_spin_lock at ffffffff81857464
 #6 [ffff881a23bab140] dbuf_read at ffffffffc08c141a [zfs]
 #7 [ffff881a23bab1e8] dnode_hold_impl at ffffffffc08db218 [zfs]
 #8 [ffff881a23bab250] dnode_hold at ffffffffc08db659 [zfs]
 #9 [ffff881a23bab260] dmu_bonus_hold at ffffffffc08ca2b6 [zfs]
#10 [ffff881a23bab2a0] sa_buf_hold at ffffffffc09023fe [zfs]
#11 [ffff881a23bab2b0] zfs_zget at ffffffffc095cb47 [zfs]
#12 [ffff881a23bab350] zfs_purgedir at ffffffffc093be54 [zfs]
#13 [ffff881a23bab558] zfs_rmnode at ffffffffc093c212 [zfs]
#14 [ffff881a23bab5a0] zfs_zinactive at ffffffffc095d2f8 [zfs]
#15 [ffff881a23bab5d8] zfs_inactive at ffffffffc0956671 [zfs]
#16 [ffff881a23bab628] zpl_evict_inode at ffffffffc096dc03 [zfs]
#17 [ffff881a23bab650] evict at ffffffff81233d81
#18 [ffff881a23bab678] dispose_list at ffffffff81233e86
#19 [ffff881a23bab690] prune_icache_sb at ffffffff81234fea
#20 [ffff881a23bab6c8] super_cache_scan at ffffffff8121b862
#21 [ffff881a23bab720] shrink_slab at ffffffff811a8e0d
#22 [ffff881a23bab800] shrink_zone at ffffffff811ad488
#23 [ffff881a23bab880] do_try_to_free_pages at ffffffff811ad5fb
#24 [ffff881a23bab900] try_to_free_pages at ffffffff811ad91e
#25 [ffff881a23bab980] __alloc_pages_slowpath.constprop.88 at ffffffff8119ee92
#26 [ffff881a23baba60] __alloc_pages_nodemask at ffffffff8119f908
#27 [ffff881a23babb00] alloc_pages_current at ffffffff811ea47c
#28 [ffff881a23babb48] alloc_kmem_pages at ffffffff8119d4d9
#29 [ffff881a23babb70] kmalloc_order_trace at ffffffff811bb04e
#30 [ffff881a23babbb0] __kmalloc at ffffffff811f6e90
#31 [ffff881a23babbf8] seq_buf_alloc at ffffffff8123ca00
#32 [ffff881a23babc10] single_open_size at ffffffff8123dc1a
#33 [ffff881a23babc50] stat_open at ffffffff8128fc76
#34 [ffff881a23babc68] proc_reg_open at ffffffff81286011
#35 [ffff881a23babca0] do_dentry_open ...

Read more...

Changed in zfs-linux (Ubuntu Eoan):
status: New → Invalid
Changed in zfs-linux (Ubuntu Disco):
status: New → Invalid
Changed in zfs-linux (Ubuntu Bionic):
status: New → Invalid
Changed in zfs-linux (Ubuntu Xenial):
status: New → In Progress
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Problem Summary:
---------------

If an xattr directory inode and its xattr child inode are on the _same_ disposal list,
and the xattr directory inode is _before_ its xattr child inode in this disposal list...

Then zfs_purgedir() of the xattr directory calls zfs_zget() for the xattr child inode
and it loops forever -- it can only stop if the xattr child inode is disposed/evicted,
but it could only occur _after_ in the disposal list and current list node is looping...

Because zfs_zget() gets non-NULL from dmu_buf_get_user() (which could go NULL only in
the ZFS evict path later in disposal list) so it goes to igrab() but that returns NULL
(because the inode.i_state got I_FREEING), then 'goto again:', which repeats that over.

Function path:

shrink_slab
- do_shrink_slab
  - shrinker->scan_objects == super_cache_scan
    - prune_icache_sb
      - list_sru_shrink_walk
        (creates disposal list with xattr dir&child inodes)
        - inode_lru_isolate(inode)
          - inode->i_state |= I_FREEING
            (problem for igrab of xattr child inode, below)
      - dispose_list
        - evict(xattr dir inode)
          - op->evict_inode == zpl_evict_inode
            - zfs_inactive
              - zfs_zinactive
                - zfs_rmnode
                  - zfs_purgedir
                    - zfs_zget (xattr child nodes)
                      - dmu_buf_get_user (non-NULL)
                      - igrab (NULL)
                      - goto again;
        ... thus never reaching ...
        - evict(xattr child inode)
          - op->evict_inode == zpl_evict_inode
            - zfs_inactive
              - zfs_zinactive
                - zfs_znode_dmu_fini
                  - sa_handle_destroy
                    - dmu_buf_remove_user
                      (not calling this yet is a problem for dmu_buf_get_user, above)
                      (this would make it return NULL and not go into the igrab call)

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Download full text (10.0 KiB)

Problem Details:
---------------

Helper scripts
(create /zfs.img and mount it on /zfs; and setup kprobe events for debug)

$ sudo ./zfs-mount.sh
$ sudo ./zfs-kprobes.sh

Print kprobe events to screen as we go:

$ sudo cat /sys/kernel/debug/tracing/trace_pipe &

Create file:
- allocates normal/file znode (flag=0x0)
- - its object number is obj=0x7
- - its znode pointer is zpp=0xffff8800a65f8000

$ touch /zfs/file
           <...>-20059 [000] d... 6718.949684: p_zfs_mknode_0: (zfs_mknode+0x0/0xe10 [zfs]) flag=0x0 dzp=0xffff8802115b0000
           touch-20059 [000] d... 6718.949791: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x560 [zfs]) obj=0x7
           touch-20059 [000] d... 6718.949806: r_zfs_znode_alloc_0: (zfs_mknode+0x8ae/0xe10 [zfs] <- zfs_znode_alloc) zpp=0xffff8800a65f8000

Set extended attribute on the file:
- allocates xattr directory znode (flag=0x2)
- - its parent znode is file znode (dzp=0xffff8800a65f8000)
- - its object number is obj=0x8
- - its znode pointer is zpp=0xffff8802111a8000

- allocates xattr znode (flag=0x0, inherits xattr bit from parent node)
- - its parent znode is xattr dir znode (dzp=0xffff8802111a8000)
- - its object number is obj=0x9
- - its znode pointer is zpp=0xffff8802111a8448

$ setfattr -n user.debug -v 1 /zfs/file
           <...>-31701 [004] d... 6770.933127: p_zfs_mknode_0: (zfs_mknode+0x0/0xe10 [zfs]) flag=0x2 dzp=0xffff8800a65f8000
           <...>-31701 [004] d... 6770.933287: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x560 [zfs]) obj=0x8
           <...>-31701 [004] d... 6770.933312: r_zfs_znode_alloc_0: (zfs_mknode+0x8ae/0xe10 [zfs] <- zfs_znode_alloc) zpp=0xffff8802111a8000

           <...>-31701 [004] d... 6770.933414: p_zfs_mknode_0: (zfs_mknode+0x0/0xe10 [zfs]) flag=0x0 dzp=0xffff8802111a8000
           <...>-31701 [004] d... 6770.933436: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x560 [zfs]) obj=0x9
        setfattr-31701 [004] d... 6770.933441: r_zfs_znode_alloc_0: (zfs_mknode+0x8ae/0xe10 [zfs] <- zfs_znode_alloc) zpp=0xffff8802111a8448

Remove file:
- Nothing more than zfs_zget() (i.e., "load to memory/get znode and inode for object number")
  on the file and xattr dir.
- No node removal yet (zfs_rmnode), nor its descendent functions.

$ rm /zfs/file
           <...>-5240 [000] d... 6796.826938: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8802353a2000 obj=0x7
           <...>-5240 [000] d... 6796.826967: r_zfs_zget_0: (zfs_dirent_lock+0x56c/0x6c0 [zfs] <- zfs_zget)

              rm-5240 [000] d... 6796.827023: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8802353a2000 obj=0x8
              rm-5240 [000] d... 6796.827030: r_zfs_zget_0: (zfs_remove+0x22b/0x4c0 [zfs] <- zfs_zget)

When dropping caches (e.g., inode LRU list)
- In one disposal list (i.e., call to dispose_list())
  - Evict/Dispose the xattr node (obj 0x9)
  - This iput()s its parent node (obj 0x8, the xattr dir node)
    thus dropping its last reference (allows it to be evicted)
    with zfs_iput_async().
- In another disposal list, before ZFS's async iput() task runs.
  - Evict/Dispose the xattr dir node (obj 0x8)
  - This iput()s its parent node (obj 0x7, the file node)
    thus dropping its la...

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Fix Details
-----------

Problem report (GH Issue) upstream in ZFS on Linux:
 https://github.com/zfsonlinux/zfs/issues/4816 ("Looping in zfs_zget()")

With very similar stack trace (syscall -> shrink -> zfs_{inactive/rmnode/purgedir/zget})
 https://github.com/zfsonlinux/zfs/issues/4816#issuecomment-229532796

Proposed patch in the Issue (not the right/final solution):
 https://github.com/zfsonlinux/zfs/issues/4816#issuecomment-229537558
 https://github.com/zfsonlinux/zfs/pull/4822 ("Allow zfs_purgedir() to skip inodes undergoing eviction")

However, that proposed patch is not the right/final solution:
 https://github.com/zfsonlinux/zfs/pull/4822#issuecomment-230938382
 https://github.com/zfsonlinux/zfs/pull/4822#issuecomment-232208224

Instead, these proposed patches are actually the right/final solution:
 https://github.com/zfsonlinux/zfs/pull/4827 ("xattr dir doesn't get purged during iput")

The two patches are merged upstream:
 https://github.com/zfsonlinux/zfs/pull/4827#issuecomment-232207013

 1) ddae16a xattr dir doesn't get purged during ipu
 2) 31b6111 Kill zp->z_xattr_parent to prevent pinning

Available in the 0.7.0 release:

 $ git describe --contains ddae16a 31b6111
 zfs-0.7.0-rc1~102
 zfs-0.7.0-rc1~101

Thus Ubuntu only needs these patches on Xenial (zfs-linux < 0.7.0):

 $ rmadison zfs-linux
  zfs-linux | 0.6.5.6-0ubuntu8 | xenial/universe | source
  zfs-linux | 0.6.5.6-0ubuntu27 | xenial-updates | source
  zfs-linux | 0.7.5-1ubuntu15 | bionic | source
  zfs-linux | 0.7.5-1ubuntu16.6 | bionic-updates | source
  zfs-linux | 0.7.12-1ubuntu5 | disco | source
  zfs-linux | 0.8.1-1ubuntu8 | eoan | source

There's a later commit that changes/"reverts" the
commit ddae16a ("xattr dir doesn't get purged during ipu"),
but it's not a bug fix, but rather a refactor/feature.

 https://github.com/zfsonlinux/zfs/pull/4838

With commit dfbc86309fd8 (below).

The commit ids are not mentioned in further commits upstream.

 $ git log --oneline --grep ddae16a ddae16a^..origin/master
 dfbc86309fd8 Use native inode->i_nlink instead of znode->z_links

 $ git log --oneline --grep 31b6111 ddae16a^..origin/master
 $

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Test-case for the problem (kmod-zget.c).
Header comments with steps to reproduce.

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

debdiff for zfs-linux (zfs-linux_shrinker-deadlock-xattr.debdiff)
based on ubuntu/xenial-devel branch from LP source pkg git repo [1].

[1] https://git.launchpad.net/ubuntu/+source/zfs-linux

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

zfs-test test suite on xenial
-----------------------------

Build/install the latest zfs-test suite from zfs.git:

$ git clone https://github.com/zfsonlinux/zfs.git
$ cd zfs
$ git log --oneline -1
c81f179 Metaslab max_size should be persisted while unloaded

$ sudo apt-get build-dep zfs-linux
$ sudo apt-get install libssl-dev alien libpython3-dev python3-setuptools python3-cffi

$ ./autogen.sh
$ ./configure
$ make -j$(nproc) pkg-utils
$ sudo dpkg -i zfs-test_0.8.0-170_amd64.deb

$ /usr/share/zfs/zfs-tests.sh # hit kernel errors in the tests below, removed them from 'linux.run'

 mixed_create_failure
  [ 390.511557] VERIFY(!RW_LOCK_HELD(&l->l_rwlock)) failed
  [ 390.516303] PANIC at zap.c:395:zap_leaf_pageout()

 zfs_clone_deeply_nested << too.
  very long stack traces detected in scheduler.

 zfs_upgrade_007_neg << hung tasks.
  shows twice in the file?!
  [ 1715.988411] VERIFY3(newds == os->os_dsl_dataset) failed (ffff8800b4e2a000 == ffff880064b83000)
  [ 1715.995255] PANIC at dmu_objset.c:618:dmu_objset_refresh_ownership()

 zpool_create_024_pos
  [ 572.926873] BUG: unable to handle kernel NULL pointer dereference at (null)

 import_cachefile_device_added
  [ 524.079638] PANIC: blkptr at ffff88009e9f8048 DVA 1 has invalid VDEV 1

 zpool_upgrade_007_pos (not zfs_upgrade_007_neg above)
  [ 480.801356] VERIFY3(newds == os->os_dsl_dataset) failed (ffff880203a98000 == ffff880203f75000)
  [ 480.808916] PANIC at dmu_objset.c:618:dmu_objset_refresh_ownership()

 enospc_003_pos
  [ 1201.448045] INFO: task txg_sync:1353 blocked for more than 120 seconds.
  [ 1201.456365] Tainted: P OE 4.4.0-159-generic #187-Ubuntu
  [ 1201.459650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1201.464831] INFO: task file_write:1385 blocked for more than 120 seconds.
  [ 1201.468134] Tainted: P OE 4.4.0-159-generic #187-Ubuntu
  [ 1201.471387] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

RUN_FILE="/usr/share/zfs/runfiles/linux.run"
TMP_RUN_FILE="/tmp/$(basename $RUN_FILE)"
cp $RUN_FILE $TMP_RUN_FILE

for TEST in \
  mixed_create_failure \
  zfs_clone_deeply_nested \
  zfs_upgrade_007_neg \
  zpool_create_024_pos \
  import_cachefile_device_added \
  zpool_upgrade_007_pos \
  enospc_003_pos
do
  sed \
    -e "s:'$TEST',::" \
    -e "s:\( ,\)\?'$TEST'::" \
    -i $TMP_RUN_FILE
done

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

zfs-test results w/ original & modified zfs-dkms
------------------------------------------------

original
--------

$ uname -rv
4.4.0-159-generic #187-Ubuntu SMP Thu Aug 1 16:28:06 UTC 2019

$ modinfo zfs | head
filename: /lib/modules/4.4.0-159-generic/updates/dkms/zfs.ko
version: 0.6.5.6-0ubuntu27
...
srcversion: 0968F94158D646E259D86B5
...

$ /usr/share/zfs/zfs-tests.sh -r $TMP_RUN_FILE 2>&1 | tee zfs-tests.log.orig-kernel.orig-dkms

modified
--------

$ uname -rv
4.4.0-159-generic #187-Ubuntu SMP Thu Aug 1 16:28:06 UTC 2019

$ modinfo zfs | head
filename: /lib/modules/4.4.0-159-generic/updates/dkms/zfs.ko
version: 0.6.5.6-0ubuntu28
...
srcversion: 99F1D0FED2F291CA7AED0C6
...

$ /usr/share/zfs/zfs-tests.sh -r $TMP_RUN_FILE 2>&1 | tee zfs-tests.log.orig-kernel.test-dkms

Results Comparison
------------------

Filter test name and result only.

 $ awk '/^Test:/ { print $2, $7 }' zfs-tests.log.orig-kernel.orig-dkms > zfs-tests.log.orig-kernel.orig-dkms.awk
 $ awk '/^Test:/ { print $2, $7 }' zfs-tests.log.orig-kernel.test-dkms > zfs-tests.log.orig-kernel.test-dkms.awk

Example:

 $ head -n3 zfs-tests.log.orig-kernel.orig-dkms.awk
 /usr/share/zfs/zfs-tests/tests/functional/acl/posix/setup [PASS]
 /usr/share/zfs/zfs-tests/tests/functional/acl/posix/posix_001_pos [PASS]
 /usr/share/zfs/zfs-tests/tests/functional/acl/posix/posix_002_pos [PASS]

No differences/regressions between original zfs-dkms and modified zfs-dkms.

 $ diff zfs-tests.log.orig-kernel.{orig,test}-dkms.awk
 $

 $ grep -A3 Summary zfs-tests.log.orig-kernel.orig-dkms
 Results Summary
 FAIL 463
 PASS 739
 SKIP 184

 $ grep -A3 Summary zfs-tests.log.orig-kernel.test-dkms
 Results Summary
 SKIP 184
 PASS 739
 FAIL 463

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Download full text (5.4 KiB)

Fix Change Details
------------------

(Compare with comment #3)

The xattr dir/child inodes are evicted/disposed
immediately on file removal, not on shrinker.

$ touch /zfs/file
           <...>-23940 [006] d... 10808.746798: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff88011aa94400
           <...>-23940 [006] d... 10808.746925: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xa
           <...>-23940 [006] d... 10808.746938: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88011aa90880

$ setfattr -n user.debug -v 1 /zfs/file
           <...>-23941 [006] d... 10818.945733: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x2 dzp=0xffff88011aa90880
           <...>-23941 [006] d... 10818.945892: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xb
           <...>-23941 [006] d... 10818.945905: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88011aa961c0
           <...>-23941 [006] d... 10818.945925: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xa
        setfattr-23941 [006] d... 10818.945929: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-23941 [006] d... 10818.945933: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xa
        setfattr-23941 [006] d... 10818.945936: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-23941 [006] d... 10818.945948: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xa
        setfattr-23941 [006] d... 10818.945951: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-23941 [006] d... 10818.945966: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff88011aa961c0
        setfattr-23941 [006] d... 10818.945990: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xc
        setfattr-23941 [006] d... 10818.945996: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88011aa90440

$ rm /zfs/file
           <...>-23942 [006] d... 10822.453456: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xa
           <...>-23942 [006] d... 10822.453469: r_zfs_zget_0: (zfs_dirent_lock+0x56c/0x6c0 [zfs] <- zfs_zget)
              rm-23942 [006] d... 10822.453494: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xb
              rm-23942 [006] d... 10822.453498: r_zfs_zget_0: (zfs_remove+0x22b/0x4c0 [zfs] <- zfs_zget)
              rm-23942 [006] d... 10822.453597: p_zpl_evict_inode_0: (zpl_evict_inode+0x0/0x60 [zfs]) inode=0xffff88011aa90a90
              rm-23942 [006] d... 10822.453602: p_zfs_inactive_0: (zfs_inactive+0x0/0x270 [zfs]) inode=0xffff88011aa90a90
              rm-23942 [006] d... 10822.453603: p_zfs_zinactive_0: (zfs_zinactive+0x0/0xe0 [zfs]) znode=0xffff88011aa90880 obj=0xa
              rm-23942 [006] d... 10822.453607: p_zfs_rmnode_0: (zfs_rmnode+0x0/0x360 [zfs]) znode=0xffff88011aa90880
              rm-23942 [006] d... 10822.453619: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff880016f00000 obj=0xb
              rm-23942 [006] d... 10822.453623: r_zfs_zget_0: (zfs_rmnode+0x25...

Read more...

description: updated
Changed in linux (Ubuntu Eoan):
status: New → Invalid
Changed in linux (Ubuntu Disco):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: New → Invalid
Changed in linux (Ubuntu Xenial):
status: New → In Progress
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks Mauricio, I've sponsored this package, and sent the kernel related changes to the kernel-team mailing list: https://lists.ubuntu.com/archives/kernel-team/2019-August/102938.html

I also sanity checked the code with the ubuntu kernel team ZFS regression tests and they all pass successfully.

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Colin, thanks for sponsoring and additional testing!

Is this ZFS regression test suite available to other people?
It sounds great for future work with ZFS.

Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Mauricio, or anyone else affected,

Accepted zfs-linux into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/zfs-linux/0.6.5.6-0ubuntu28 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in zfs-linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Download full text (6.1 KiB)

Verification done for zfs-linux on xenial-proposed.

The inodes for file, xattr dir, and xattr child are all evicted at file removal time, not making it to any disposal list after file removal.

So the window/scenario for the problem to occur is not present anymore.

Log
---

$ echo 'deb http://archive.ubuntu.com/ubuntu xenial-proposed main restricted universe' | sudo tee /etc/apt/sources.list.d/xenial-proposed-universe.list
$ sudo apt-get update

$ apt-cache madison zfs-dkms | grep proposed
  zfs-dkms | 0.6.5.6-0ubuntu28 | http://archive.ubuntu.com/ubuntu xenial-proposed/universe amd64 Packages

$ sudo apt-get install -y spl-dkms zfs-dkms

$ modinfo zfs | head
filename: /lib/modules/4.4.0-157-generic/updates/dkms/zfs.ko
version: 0.6.5.6-0ubuntu28
...

$ sudo ./zfs-mount.sh

$ echo 2 | sudo tee /proc/sys/vm/drop_caches
2

$ sudo ./zfs-kprobes.sh

$ touch /zfs/file
           <...>-26587 [001] d... 1196.781161: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff88004ccfc000
           <...>-26587 [001] d... 1196.781306: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xa
           <...>-26587 [001] d... 1196.781336: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88004cd04cc0

$ setfattr -n user.debug -v 1 /zfs/file
        setfattr-26588 [000] d... 1199.095554: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x2 dzp=0xffff88004cd04cc0
        setfattr-26588 [000] d... 1199.095937: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xb
        setfattr-26588 [000] d... 1199.096145: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88004c567740
        setfattr-26588 [000] d... 1199.096539: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff88004d1c8000 obj=0xa
        setfattr-26588 [000] d... 1199.096779: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-26588 [000] d... 1199.097139: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff88004d1c8000 obj=0xa
        setfattr-26588 [000] d... 1199.097352: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-26588 [000] d... 1199.097661: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff88004d1c8000 obj=0xa
        setfattr-26588 [000] d... 1199.097892: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-26588 [000] d... 1199.098217: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff88004c567740
        setfattr-26588 [000] d... 1199.098449: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xc
        setfattr-26588 [000] d... 1199.098652: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff88004c567300

$ rm /zfs/file
           <...>-26589 [001] d... 1201.376695: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff88004d1c8000 obj=0xa
           <...>-26589 [001] d... 1201.376749: r_zfs_zget_0: (zfs_dirent_lock+0x56c/0x6c0 [zfs] <- zfs_zget)
              rm-26589 [001] d... 1201.376772: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff88004d1c8000 obj=0xb
              rm-26589 [001] d... 1201.376778: r_zfs_zget_0: (zfs_remove+0x22b/0x4c0 [...

Read more...

tags: added: verification-done-xenial
Mathew Hodson (mhodson)
no longer affects: linux (Ubuntu)
no longer affects: linux (Ubuntu Xenial)
no longer affects: linux (Ubuntu Bionic)
no longer affects: linux (Ubuntu Disco)
no longer affects: linux (Ubuntu Eoan)
no longer affects: zfs-linux (Ubuntu Bionic)
no longer affects: zfs-linux (Ubuntu Disco)
no longer affects: zfs-linux (Ubuntu Eoan)
Changed in zfs-linux (Ubuntu):
status: Invalid → Fix Released
importance: Undecided → Medium
Changed in zfs-linux (Ubuntu Xenial):
importance: Undecided → Medium
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Hi @mathew-hodson,

This fix is needed in the Linux kernel package for Xenial as well (it duplicates the zfs-linux/dkms source).

Shouldn't the 'linux' source package still be tracked?

Thanks,
Mauricio

Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :
Download full text (6.0 KiB)

Verification done for linux on xenial-proposed.

The inodes for file, xattr dir, and xattr child are all evicted at file removal time, not making it to any disposal list after file removal.

So the window/scenario for the problem to occur is not present anymore.

Log
---

$ uname -rv
4.4.0-160-generic #188-Ubuntu SMP Wed Aug 14 04:21:43 UTC 2019

$ modinfo zfs | head
filename: /lib/modules/4.4.0-160-generic/kernel/zfs/zfs/zfs.ko
version: 0.6.5.6-0ubuntu28
...
srcversion: 99F1D0FED2F291CA7AED0C6

$ sudo apt-get install zfsutils-linux attr

$ sudo ./zfs-mount.sh

$ echo 2 | sudo tee /proc/sys/vm/drop_caches
2

$ sudo ./zfs-kprobes.sh

$ sudo cat /sys/kernel/debug/tracing/trace_pipe &

$ touch /zfs/file
           touch-10656 [001] d... 359.615887: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff8800b9875940
           touch-10656 [001] d... 359.616184: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xa
           touch-10656 [001] d... 359.616339: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff880036f48440

$ setfattr -n user.debug -v 1 /zfs/file
        setfattr-10657 [000] d... 361.507063: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x2 dzp=0xffff880036f48440
        setfattr-10657 [000] d... 361.507265: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xb
        setfattr-10657 [000] d... 361.507402: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff880139d09980
        setfattr-10657 [000] d... 361.507665: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8800bb12c000 obj=0xa
        setfattr-10657 [000] d... 361.507792: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-10657 [000] d... 361.507981: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8800bb12c000 obj=0xa
        setfattr-10657 [000] d... 361.508104: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-10657 [000] d... 361.508692: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8800bb12c000 obj=0xa
        setfattr-10657 [000] d... 361.508821: r_zfs_zget_0: (zfs_zaccess+0x12b/0x220 [zfs] <- zfs_zget)
        setfattr-10657 [000] d... 361.509022: p_zfs_mknode_0: (zfs_mknode+0x0/0xe00 [zfs]) flag=0x0 dzp=0xffff880139d09980
        setfattr-10657 [000] d... 361.509170: p_zfs_znode_alloc_0: (zfs_znode_alloc+0x0/0x520 [zfs]) obj=0xc
        setfattr-10657 [000] d... 361.509302: r_zfs_znode_alloc_0: (zfs_mknode+0x8a3/0xe00 [zfs] <- zfs_znode_alloc) zpp=0xffff880139d09100

$ rm /zfs/file
              rm-10658 [001] d... 363.216716: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8800bb12c000 obj=0xa
              rm-10658 [001] d... 363.216882: r_zfs_zget_0: (zfs_dirent_lock+0x56c/0x6c0 [zfs] <- zfs_zget)
              rm-10658 [001] d... 363.217130: p_zfs_zget_0: (zfs_zget+0x0/0x230 [zfs]) zsb=0xffff8800bb12c000 obj=0xb
              rm-10658 [001] d... 363.217271: r_zfs_zget_0: (zfs_remove+0x22b/0x4c0 [zfs] <- zfs_zget)
              rm-10658 [001] d... 363.217567: p_zpl_evict_inode_0: (zpl_evict_inode+0x0/0x60 [zfs]) inode=0xffff880036f48650
              rm-106...

Read more...

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zfs-linux - 0.6.5.6-0ubuntu28

---------------
zfs-linux (0.6.5.6-0ubuntu28) xenial; urgency=medium

  * Fix ZFS shrinker deadlock with xattrs (LP: #1839521)
    - Upstream ZFS fix 31b6111fd92a ("Kill zp->z_xattr_parent to prevent pinning")
      and ddae16a9cf0b ("xattr dir doesn't get purged during iput") fix a deadlock
      in shrinker path when a xattr directory inode and its xattr inode are in the
      same disposal list and the xattr dir inode is evicted before the xattr inode.

 -- Mauricio Faria de Oliveira <email address hidden> Wed, 07 Aug 2019 11:01:11 -0300

Changed in zfs-linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for zfs-linux has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.