Backport ZoL pull request 9203 into the official packages.
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
zfs-linux (Ubuntu) | Fix Released | High | Colin Ian King |
Eoan | Fix Released | High | Unassigned |
Bug Description
== SRU Justification, EOAN ==
ZFS can deadlock; this can sometimes be triggered by a zfs rollback: "the zfs_resume_fs() code path may cause zfs to spawn new threads as it reinstantiates the suspended fs's zil. When a new thread is spawned, the kernel may attempt to free memory for that thread by freeing some unreferenced inodes. If it happens to select inodes that are a part of the suspended fs a deadlock will occur because freeing inodes requires holding the fs's z_teardown_
== The Fix ==
Backport of ZFS upstream commit e7a2fa70c3b0d8c
The backport is a relatively simple context wiggle.
== Test Case ==
This is hard to trigger, so testing is non-trivial. To check for regressions we run the entire Ubuntu ZFS regression test suite. Without the fix, rollbacks can very occasionally trip this issue. With the fix, they no longer can.
== Regression Potential ==
The fix adds an extra z_suspended flag to track suspended state and adds an extra reference to stop the kernel from freeing inodes on a suspended file system. The changes are small and are well-used in upstream ZFS, so I believe that if a regression had occurred it would have been found by the regression testing.
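To make the mechanism concrete, here is a toy Python model of the deadlock and of the extra-reference fix. It is only a sketch under stated assumptions: `Inode`, `ToyFS`, `pin_inodes`, `teardown_held` and `reclaim()` are hypothetical names invented for illustration, and none of this is the actual ZFS or kernel code. The only names taken from the bug are the z_suspended flag and the idea of holding an extra reference on inodes while the fs is suspended.

```python
class Inode:
    """Hypothetical inode with a plain reference count."""

    def __init__(self, ino):
        self.ino = ino
        self.refs = 0  # zero references means the reclaimer may free it


class ToyFS:
    """Hypothetical stand-in for a filesystem; not real ZFS code."""

    def __init__(self, inode_numbers):
        self.inodes = {n: Inode(n) for n in inode_numbers}
        self.suspended = False      # plays the role of the z_suspended flag
        self.teardown_held = False  # plays the role of the teardown lock
        self.pinned = []            # inodes holding the extra reference

    def suspend(self, pin_inodes):
        self.teardown_held = True
        self.suspended = True
        if pin_inodes:
            # The fix: take an extra reference on every inode still present.
            for inode in self.inodes.values():
                inode.refs += 1
                self.pinned.append(inode)

    def resume(self):
        self.suspended = False
        self.teardown_held = False
        # Drop the extra references taken at suspend time.
        for inode in self.pinned:
            inode.refs -= 1
        self.pinned.clear()

    def reclaim(self):
        """Model memory pressure: try to free unreferenced inodes.

        Returns "deadlock" if freeing would need the teardown lock while
        the suspend still holds it, otherwise the number of inodes freed.
        """
        victims = [i for i in self.inodes.values() if i.refs == 0]
        if victims and self.teardown_held:
            return "deadlock"
        for victim in victims:
            del self.inodes[victim.ino]
        return len(victims)


# Without the fix, reclaim during a suspend hits the held lock.
unfixed = ToyFS([1, 2, 3])
unfixed.suspend(pin_inodes=False)
print(unfixed.reclaim())

# With the fix, the pinned inodes are never selected during the suspend.
fixed = ToyFS([1, 2, 3])
fixed.suspend(pin_inodes=True)
print(fixed.reclaim())
fixed.resume()
print(fixed.reclaim())
```

In this model the reclaimer only ever touches inodes with a zero reference count, so pinning every inode for the duration of the suspend means it never needs the lock that the suspend is holding.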
-------
Hopefully this is the correct bug tracker to report this on.
Recently, I ran into a bug in ZFS as shipped in Ubuntu 19.10 that caused the kernel to deadlock and the system to eventually hang.
I was advised by a ZFS on Linux project maintainer that this was a bug that was fixed in 0.8.2.
The relevant pull request is here: https:/
It would probably be a good idea to backport that pull request into 19.10's build of ZFS.
description: updated
Changed in zfs-linux (Ubuntu):
status: In Progress → Fix Released
Changed in zfs-linux (Ubuntu Eoan):
importance: Undecided → High
The fix in question is:
commit e7a2fa70c3b0d8c8cee2b484038bb5623c7c1ea9
Author: Tom Caputi <email address hidden>
Date: Tue Aug 27 12:55:51 2019 -0400
Fix deadlock in 'zfs rollback'
Currently, the 'zfs rollback' code can end up deadlocked due to
the way the kernel handles unreferenced inodes on a suspended fs.
Essentially, the zfs_resume_fs() code path may cause zfs to spawn
new threads as it reinstantiates the suspended fs's zil. When a
new thread is spawned, the kernel may attempt to free memory for
that thread by freeing some unreferenced inodes. If it happens to
select inodes that are a part of the suspended fs a deadlock
will occur because freeing inodes requires holding the fs's
z_teardown_inactive_lock which is still held from the suspend.
This patch corrects this issue by adding an additional reference
to all inodes that are still present when a suspend is initiated.
This prevents them from being freed by the kernel for any reason.
Reviewed-by: Alek Pinchuk <email address hidden>
Reviewed-by: Brian Behlendorf <email address hidden>
Signed-off-by: Tom Caputi <email address hidden>
Closes #9203