Explanatory note on bionic-proposed (see comment #17):

The flaky fail set had a long tail (tests that failed just a few times).

Symptom:
---

The logs show that the failure cause is an output mismatch, because '/target' failed to umount (busy), and this was logged; e.g.,

btrfs/027 [failed, exit status 1]- output mismatch (see /home/ubuntu/xfstests-dev/results//btrfs/027.out.bad)
    --- tests/btrfs/027.out 2020-10-26 22:23:54.626341276 +0000
    +++ /home/ubuntu/xfstests-dev/results//btrfs/027.out.bad 2020-11-17 06:34:12.623548481 +0000
    @@ -1,2 +1,3 @@
     QA output created by 027
    -Silence is golden
    +umount: /mnt/scratch: target is busy.
    +failed to unmount /dev/loop2
    ...

The number of failed tests increases with the number of umount failures.
Comparing both numbers per log file:
(line breaks added for clarity.)

$ (grep ^Failed xfstests.log.2020-11-1*; \
   grep -c 'umount: /mnt/scratch: target is busy' xfstests.log.2020-11-1* | grep -v :0) \
  | sort
xfstests.log.2020-11-16-21-03-32:Failed 31 of 823 tests // 30ish is normal failure count
xfstests.log.2020-11-16-22-16-38:Failed 34 of 823 tests
xfstests.log.2020-11-16-23-25-56:Failed 34 of 823 tests
xfstests.log.2020-11-17-00-36-27:Failed 36 of 823 tests
xfstests.log.2020-11-17-01-44-04:Failed 37 of 823 tests
xfstests.log.2020-11-17-02-52-41:Failed 37 of 823 tests // things go bad after this. umount hung here.
xfstests.log.2020-11-17-04-07-11:15
xfstests.log.2020-11-17-04-07-11:Failed 68 of 823 tests // umount hung here too.
xfstests.log.2020-11-17-05-21-34:76
xfstests.log.2020-11-17-05-21-34:Failed 158 of 823 tests
xfstests.log.2020-11-17-06-29-11:148
xfstests.log.2020-11-17-06-29-11:Failed 278 of 823 tests
xfstests.log.2020-11-17-07-29-57:31
xfstests.log.2020-11-17-07-29-57:Failed 104 of 823 tests
xfstests.log.2020-11-17-08-37-16:33
xfstests.log.2020-11-17-08-37-16:Failed 87 of 823 tests
xfstests.log.2020-11-17-09-50-43:36
xfstests.log.2020-11-17-09-50-43:Failed 101 of 823 tests
xfstests.log.2020-11-17-11-04-16:43
xfstests.log.2020-11-17-11-04-16:Failed 111 of 823 tests
xfstests.log.2020-11-17-12-08-14:68
xfstests.log.2020-11-17-12-08-14:Failed 143 of 823 tests
xfstests.log.2020-11-17-13-14-01:93
xfstests.log.2020-11-17-13-14-01:Failed 200 of 823 tests

And the kernel log files show that 'umount' hung at the timestamps where the increase started.

$ sudo grep umount kern.log.for.2020-11-1*
kern.log.for.2020-11-17-02-52-41:Nov 17 03:52:55 mfo-s390x-bionic kernel: [24748.572259] CPU: 4 PID: 10862 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
kern.log.for.2020-11-17-02-52-41:Nov 17 03:53:10 mfo-s390x-bionic kernel: [24762.822089] CPU: 4 PID: 11705 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
kern.log.for.2020-11-17-02-52-41:Nov 17 03:53:16 mfo-s390x-bionic kernel: [24768.881926] CPU: 4 PID: 12110 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
kern.log.for.2020-11-17-04-07-11:Nov 17 03:52:55 mfo-s390x-bionic kernel: [24748.572259] CPU: 4 PID: 10862 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
kern.log.for.2020-11-17-04-07-11:Nov 17 03:53:10 mfo-s390x-bionic kernel: [24762.822089] CPU: 4 PID: 11705 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
kern.log.for.2020-11-17-04-07-11:Nov 17 03:53:16 mfo-s390x-bionic kernel: [24768.881926] CPU: 4 PID: 12110 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu

This is the stack trace (s390x kernel pointer addresses.)

[24748.572081] ------------[ cut here ]------------
[24748.572083] BTRFS: Transaction aborted (error -28) // i.e., -ENOSPC
[24748.572217] WARNING: CPU: 4 PID: 10862 at /build/linux-9nQaV0/linux-4.15.0/fs/btrfs/extent-tree.c:3097 btrfs_run_delayed_refs+0x208/0x260 [btrfs]
...
[24748.572259] CPU: 4 PID: 10862 Comm: umount Tainted: G W 4.15.0-125-generic #128-Ubuntu
[24748.572259] Hardware name: IBM 2964 N63 400 (KVM/Linux)
...
[24748.572297] Call Trace:
[24748.572309] ([<000003ff8039710c>] btrfs_run_delayed_refs+0x204/0x260 [btrfs])
[24748.572322] [<000003ff803b0b46>] btrfs_commit_transaction+0x66/0x9e8 [btrfs]
[24748.572330] [<00000000003cdf9e>] __sync_filesystem+0x56/0x80
[24748.572333] [<000000000038f088>] generic_shutdown_super+0x48/0x168
[24748.572334] [<000000000038f50e>] kill_anon_super+0x2e/0x40
[24748.572344] [<000003ff80375b60>] btrfs_kill_super+0x30/0x148 [btrfs]
[24748.572346] [<000000000038fa90>] deactivate_locked_super+0x70/0xa0
[24748.572348] [<00000000003b6b94>] cleanup_mnt+0x64/0xa8
[24748.572351] [<000000000019a542>] task_work_run+0xba/0x100
[24748.572361] [<0000000000108a6e>] do_notify_resume+0x4e/0x60
[24748.572365] [<00000000008ff3be>] system_call+0xe2/0x2c8

After these errors the failure rate increased, though not at all times or for all tests, since it seems that at some point the umount finishes, unblocking tests.

Cause:
---

Since this issue didn't happen with the test/patched kernel, it would seem to originate from other/new patches.
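
One way to cross-check which commits in a release range have no counterpart in a directory of test-kernel patches is sketched below. This is only a rough sketch: the helper names slug and new_commits are hypothetical, and the slug rule (non-alphanumeric runs become '-', cut at 52 characters) merely approximates how 'git format-patch' names its files, so any commit it flags should be confirmed by hand.

```shell
#!/bin/sh
# slug SUBJECT: approximate the file-name stem that `git format-patch`
# derives from a commit subject (assumption: 52-character limit).
slug() {
    printf '%s\n' "$1" \
        | sed -E 's/[^A-Za-z0-9._]+/-/g; s/^-+//' \
        | cut -c1-52 \
        | sed -E 's/[-.]+$//'
}

# new_commits PATCH_DIR: read `git log --oneline` lines on stdin and
# print those with no matching NNNN-<slug>.patch file in PATCH_DIR.
new_commits() {
    while read -r hash subject; do
        s=$(slug "$subject")
        ls "$1"/[0-9][0-9][0-9][0-9]-"$s".patch >/dev/null 2>&1 \
            || echo "$hash $subject"
    done
}
```

Assuming the patch files sit in the current directory, usage would be along the lines of: git log --oneline Ubuntu-4.15.0-122.124..Ubuntu-4.15.0-125.128 -- fs/btrfs/ | new_commits .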
There are indeed 2 new ones not used in the test kernel:

6f64bf0873eb btrfs: qgroup: fix data leak caused by race between writeback and truncate <<< NEW
d14eef17dcf2 btrfs: don't force read-only after error in drop snapshot <<< NEW

$ ls -1r *.patch # test kernel on top of Ubuntu-4.15.0-122.124
0017-btrfs-ctree-check-key-order-before-merging-tree-bloc.patch
0016-btrfs-extent-tree-kill-the-BUG_ON-in-insert_inline_e.patch
0015-btrfs-extent-tree-kill-BUG_ON-in-__btrfs_free_extent.patch
0014-btrfs-extent_io-do-extra-check-for-extent-buffer-rea.patch
0013-btrfs-drop-unnecessary-offset_in_page-in-extent-buff.patch
0012-btrfs-use-BUG-instead-of-BUG_ON-1.patch
0011-btrfs-use-offset_in_page-instead-of-open-coding-it.patch
0010-btrfs-fix-wrong-address-when-faulting-in-pages-in-th.patch
0009-btrfs-fix-lockdep-splat-in-add_missing_dev.patch
0008-btrfs-require-only-sector-size-alignment-for-parent-.patch
0007-btrfs-fix-potential-deadlock-in-the-search-ioctl.patch
0006-uaccess-Add-non-pagefault-user-space-write-function.patch
0005-uaccess-Add-non-pagefault-user-space-read-functions.patch
0004-btrfs-set-the-lockdep-class-for-log-tree-extent-buff.patch
0003-btrfs-Remove-extraneous-extent_buffer_get-from-tree_.patch
0002-btrfs-Remove-redundant-extent_buffer_get-in-get_old_.patch
0001-btrfs-drop-path-before-adding-new-uuid-tree-entry.patch

$ git log --oneline Ubuntu-4.15.0-122.124..Ubuntu-4.15.0-125.128 -- fs/btrfs/
a0a951aaa63f btrfs: ctree: check key order before merging tree blocks
0db62fbc1d08 btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref()
4f0e0482c42a btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent()
36b1133f57e3 btrfs: extent_io: do extra check for extent buffer read write functions
82e64a4e1ff8 btrfs: drop unnecessary offset_in_page in extent buffer helpers
3393e02f4100 btrfs: use BUG() instead of BUG_ON(1)
10679c69e0a8 btrfs: use offset_in_page instead of open-coding it
6f64bf0873eb btrfs: qgroup: fix data leak caused by race between writeback and truncate <<< NEW
d14eef17dcf2 btrfs: don't force read-only after error in drop snapshot <<< NEW
2f54a800e2ae btrfs: fix wrong address when faulting in pages in the search ioctl
71ef83d6ab86 btrfs: fix lockdep splat in add_missing_dev
cb1483e19a40 btrfs: require only sector size alignment for parent eb bytenr
d738658f503d btrfs: fix potential deadlock in the search ioctl
cca53f849a53 btrfs: set the lockdep class for log tree extent buffers
bb11dbfbc405 btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind
9a4e81ea6a44 btrfs: Remove redundant extent_buffer_get in get_old_root
725ce0e39e6d btrfs: drop path before adding new uuid tree entry

The commit message in 6f64bf0873eb explicitly describes the umount path (now in return to userspace, but still.)
So it does look like a match:

commit 6f64bf0873ebc15a4e710cb04c7630576637a845
Author: Qu Wenruo