Kernel error "task zfs:pid blocked for more than 120 seconds"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linux | Fix Released | Unknown | |
linux (Ubuntu) | | High | Colin Ian King |
linux (Ubuntu Xenial) | | High | Colin Ian King |
linux (Ubuntu Bionic) | | High | Colin Ian King |
linux (Ubuntu Cosmic) | | High | Colin Ian King |
zfs-linux (Ubuntu) | | High | Colin Ian King |
zfs-linux (Ubuntu Xenial) | | High | Colin Ian King |
zfs-linux (Ubuntu Bionic) | | High | Colin Ian King |
zfs-linux (Ubuntu Cosmic) | | High | Colin Ian King |
Bug Description
== SRU Justification, XENIAL, BIONIC ==
Exercising ZFS with lxd with many mount/umounts can cause lockups and 120 second timeout messages.
== How to reproduce bug ==
In a VM, 2 CPUs, 16GB of memory running Bionic:
sudo apt update
sudo apt install lxd lxd-client lxd-tools zfsutils-linux
sudo lxd init
(and with the default init options)
then run:
lxd-benchmark launch --count 96 --parallel 96
This will reliably show the lockup every time without the fix. With the fix (detailed below) one cannot reproduce the lockup.
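The reproducer above can be wrapped in a small driver so that a hung run is reported instead of blocking the terminal. This is only a sketch: the helper names are hypothetical, the deadline is an arbitrary value that does not come from this report, and it assumes the lxd-benchmark client itself can still be killed when the daemon is stuck.

```python
import subprocess


def launch_command(count: int, parallel: int) -> list[str]:
    """Build the lxd-benchmark invocation used in the reproduction steps."""
    if count < 1 or parallel < 1:
        raise ValueError("count and parallel must be positive")
    return ["lxd-benchmark", "launch",
            "--count", str(count), "--parallel", str(parallel)]


def run_with_deadline(cmd: list[str], deadline_s: float) -> str:
    """Run cmd and classify the outcome.

    Returns 'completed', 'failed', 'timed out' or 'not installed'.
    A 'timed out' result is only a hint that the lockup occurred; the
    kernel log still has to be checked for the 120 second messages.
    """
    try:
        result = subprocess.run(cmd, timeout=deadline_s, check=False)
    except FileNotFoundError:
        return "not installed"
    except subprocess.TimeoutExpired:
        return "timed out"
    return "completed" if result.returncode == 0 else "failed"


# The command from the SRU test case: 96 containers, all launched in parallel.
print(" ".join(launch_command(96, 96)))
```

Calling run_with_deadline(launch_command(96, 96), deadline_s=600) on a test VM would then return "timed out" if the benchmark never finishes within ten minutes.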
== Fix ==
Upstream ZFS commit
commit ac09630d8b0bf6c
Author: Brian Behlendorf <email address hidden>
Date: Wed Jul 11 15:49:10 2018 -0700
Fix zpl_mount() deadlock
== Regression Potential ==
This just changes the locking in the mount path of ZFS and will only affect ZFS mount/unmounts. The regression potential is small, as the fix touches a very small code path that has been exhaustively exercised under multi-thread/CPU contention and shown not to break.
------------------
ZFS bug report: https:/
"I am using LXD containers that are configured to use a ZFS storage backend.
I create many containers using a benchmark tool, which probably stresses the use of ZFS.
In two out of four attempts, I got
[ 725.970508] INFO: task lxd:4455 blocked for more than 120 seconds.
[ 725.976730] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 725.983551] "echo 0 > /proc/sys/
[ 725.991624] INFO: task txg_sync:4202 blocked for more than 120 seconds.
[ 725.998264] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.005071] "echo 0 > /proc/sys/
[ 726.013313] INFO: task lxd:99919 blocked for more than 120 seconds.
[ 726.019609] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.026418] "echo 0 > /proc/sys/
[ 726.034560] INFO: task zfs:100513 blocked for more than 120 seconds.
[ 726.040936] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.047746] "echo 0 > /proc/sys/
[ 726.055791] INFO: task zfs:100584 blocked for more than 120 seconds.
[ 726.062170] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.068979] "echo 0 > /proc/sys/
Describe how to reproduce the problem
Start an Ubuntu 18.04 LTS server.
Install LXD if not already installed.
sudo apt update
sudo apt install lxd lxd-client lxd-tools zfsutils-linux
Configure LXD with sudo lxd init. When prompted for the storage backend, select ZFS and specify an empty disk.
$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]:
Do you want to configure a new storage pool? (yes/no) [default=yes]:
Name of the new storage pool [default=default]:
Name of the storage backend to use (dir, zfs) [default=zfs]:
Create a new ZFS pool? (yes/no) [default=yes]:
Would you like to use an existing block device? (yes/no) [default=no]: yes
Path to the existing block device: /dev/sdb
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to create a new local network bridge? (yes/no) [default=yes]: no
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: no
Would you like LXD to be available over the network? (yes/no) [default=no]:
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
Now run the following to launch 48 containers in batches of 12.
lxd-benchmark launch --count 48 --parallel 12
In two out of four attempts, I got the kernel errors.
I also tried
echo 1 >/sys/module/
but did not manage to continue.
Include any warning/
dmesg output
[ 725.970508] INFO: task lxd:4455 blocked for more than 120 seconds.
[ 725.976730] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 725.983551] "echo 0 > /proc/sys/
[ 725.991408] lxd D 0 4455 1 0x00000000
[ 725.991412] Call Trace:
[ 725.991424] __schedule+
[ 725.991428] schedule+0x2c/0x80
[ 725.991429] rwsem_down_
[ 725.991460] ? dbuf_rele_
[ 725.991465] call_rwsem_
[ 725.991468] ? call_rwsem_
[ 725.991469] down_write+
[ 725.991472] grab_super+
[ 725.991501] ? zpl_create+
[ 725.991504] sget_userns+
[ 725.991507] ? get_anon_
[ 725.991534] ? zpl_create+
[ 725.991537] sget+0x7d/0xa0
[ 725.991540] ? get_anon_
[ 725.991567] zpl_mount+
[ 725.991570] mount_fs+0x37/0x150
[ 725.991574] vfs_kern_
[ 725.991576] do_mount+
[ 725.991577] ? copy_mount_
[ 725.991578] SyS_mount+0x98/0xe0
[ 725.991582] do_syscall_
[ 725.991583] entry_SYSCALL_
[ 725.991585] RIP: 0033:0x4dbd5a
[ 725.991586] RSP: 002b:000000c428
[ 725.991588] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004dbd5a
[ 725.991589] RDX: 000000c421a04b7c RSI: 000000c426f94f40 RDI: 000000c4274ceaa0
[ 725.991590] RBP: 000000c428be6930 R08: 000000c425521a90 R09: 0000000000000000
[ 725.991590] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
[ 725.991591] R13: 000000000000003e R14: 000000000000003d R15: 0000000000000080
[ 725.991624] INFO: task txg_sync:4202 blocked for more than 120 seconds.
[ 725.998264] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.005071] "echo 0 > /proc/sys/
[ 726.012928] txg_sync D 0 4202 2 0x80000000
[ 726.012930] Call Trace:
[ 726.012933] __schedule+
[ 726.012939] schedule+0x2c/0x80
[ 726.012945] cv_wait_
[ 726.012948] ? wait_woken+
[ 726.012954] __cv_wait+0x15/0x20 [spl]
[ 726.012981] rrw_enter_
[ 726.013006] rrw_enter+0x13/0x20 [zfs]
[ 726.013033] spa_sync+
[ 726.013062] txg_sync_
[ 726.013089] ? txg_quiesce_
[ 726.013093] thread_
[ 726.013098] kthread+0x121/0x140
[ 726.013101] ? __thread_
[ 726.013103] ? kthread_
[ 726.013104] ret_from_
[ 726.013313] INFO: task lxd:99919 blocked for more than 120 seconds.
[ 726.019609] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.026418] "echo 0 > /proc/sys/
[ 726.034272] lxd D 0 99919 99626 0x00000000
[ 726.034274] Call Trace:
[ 726.034277] __schedule+
[ 726.034283] ? __wake_
[ 726.034286] schedule+0x2c/0x80
[ 726.034290] cv_wait_
[ 726.034293] ? wait_woken+
[ 726.034297] __cv_wait+0x15/0x20 [spl]
[ 726.034322] txg_wait_
[ 726.034349] zil_create+
[ 726.034376] zil_commit_
[ 726.034401] zil_commit.
[ 726.034429] zil_commit+
[ 726.034457] zfs_sync+0x6e/0xb0 [zfs]
[ 726.034484] zpl_sync_
[ 726.034490] __sync_
[ 726.034493] sync_filesystem
[ 726.034495] generic_
[ 726.034496] kill_anon_
[ 726.034518] zpl_kill_
[ 726.034524] deactivate_
[ 726.034529] deactivate_
[ 726.034532] cleanup_
[ 726.034534] __cleanup_
[ 726.034535] task_work_
[ 726.034537] exit_to_
[ 726.034539] do_syscall_
[ 726.034542] entry_SYSCALL_
[ 726.034550] RIP: 0033:0x7fb553b3e8c7
[ 726.034551] RSP: 002b:00007fff42
[ 726.034553] RAX: 0000000000000000 RBX: 000000000000000f RCX: 00007fb553b3e8c7
[ 726.034553] RDX: 00007fb55476eb9f RSI: 0000000000000002 RDI: 00007fb554770b6b
[ 726.034554] RBP: 000000000000000c R08: 0000000000000000 R09: 00007fb553b8ae67
[ 726.034555] R10: 0000000000084000 R11: 0000000000000246 R12: 00007fff426c64a0
[ 726.034555] R13: 0000000003176690 R14: 0000000003177810 R15: 00000000031741f0
[ 726.034560] INFO: task zfs:100513 blocked for more than 120 seconds.
[ 726.040936] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.047746] "echo 0 > /proc/sys/
[ 726.055600] zfs D 0 100513 2626 0x00000000
[ 726.055602] Call Trace:
[ 726.055606] __schedule+
[ 726.055609] schedule+0x2c/0x80
[ 726.055613] cv_wait_
[ 726.055615] ? wait_woken+
[ 726.055619] __cv_wait+0x15/0x20 [spl]
[ 726.055642] rrw_enter_
[ 726.055666] rrw_enter+0x1c/0x20 [zfs]
[ 726.055691] dsl_pool_
[ 726.055713] dmu_objset_
[ 726.055740] zfs_ioc_
[ 726.055766] zfsdev_
[ 726.055771] do_vfs_
[ 726.055774] ? handle_
[ 726.055776] ? __do_page_
[ 726.055777] SyS_ioctl+0x79/0x90
[ 726.055779] do_syscall_
[ 726.055781] entry_SYSCALL_
[ 726.055782] RIP: 0033:0x7fd4adc795d7
[ 726.055782] RSP: 002b:00007ffe35
[ 726.055783] RAX: ffffffffffffffda RBX: 00007ffe356b8740 RCX: 00007fd4adc795d7
[ 726.055784] RDX: 00007ffe356b8740 RSI: 0000000000005a12 RDI: 0000000000000003
[ 726.055785] RBP: 000055632e278660 R08: 000000000000ffff R09: 00007fd4adcd1ed0
[ 726.055785] R10: 2f746c7561666564 R11: 0000000000000246 R12: 000055632e278660
[ 726.055788] R13: 00007ffe356beec0 R14: 00007fd4af1756e0 R15: 00007ffe356bbe30
[ 726.055791] INFO: task zfs:100584 blocked for more than 120 seconds.
[ 726.062170] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 726.068979] "echo 0 > /proc/sys/
[ 726.076835] zfs D 0 100584 15513 0x00000000
[ 726.076837] Call Trace:
[ 726.076840] __schedule+
[ 726.076845] schedule+0x2c/0x80
[ 726.076851] cv_wait_
[ 726.076854] ? wait_woken+
[ 726.076859] __cv_wait+0x15/0x20 [spl]
[ 726.076881] rrw_enter_
[ 726.076905] rrw_enter+0x1c/0x20 [zfs]
[ 726.076929] dsl_pool_
[ 726.076951] dmu_objset_
[ 726.076977] zfs_ioc_
[ 726.077001] zfsdev_
[ 726.077005] do_vfs_
[ 726.077006] ? handle_
[ 726.077008] ? __do_page_
[ 726.077010] SyS_ioctl+0x79/0x90
[ 726.077011] do_syscall_
[ 726.077013] entry_SYSCALL_
[ 726.077014] RIP: 0033:0x7fc2734075d7
[ 726.077014] RSP: 002b:00007fff65
[ 726.077015] RAX: ffffffffffffffda RBX: 00007fff653a4b30 RCX: 00007fc2734075d7
[ 726.077016] RDX: 00007fff653a4b30 RSI: 0000000000005a12 RDI: 0000000000000003
[ 726.077017] RBP: 000055f3576e9660 R08: 000000000000ffff R09: 00007fc27345fed0
[ 726.077017] R10: 2f746c7561666564 R11: 0000000000000246 R12: 000055f3576e9660
[ 726.077018] R13: 00007fff653aaec0 R14: 00007fc2749036e0 R15: 00007fff653a8220
[ 846.801124] INFO: task lxd:4455 blocked for more than 120 seconds.
[ 846.807352] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 846.814170] "echo 0 > /proc/sys/
[ 846.822028] lxd D 0 4455 1 0x00000000
[ 846.822031] Call Trace:
[ 846.822042] __schedule+
[ 846.822045] schedule+0x2c/0x80
[ 846.822047] rwsem_down_
[ 846.822078] ? dbuf_rele_
[ 846.822083] call_rwsem_
[ 846.822086] ? call_rwsem_
[ 846.822087] down_write+
[ 846.822091] grab_super+
[ 846.822118] ? zpl_create+
[ 846.822121] sget_userns+
[ 846.822123] ? get_anon_
[ 846.822150] ? zpl_create+
[ 846.822153] sget+0x7d/0xa0
[ 846.822156] ? get_anon_
[ 846.822181] zpl_mount+
[ 846.822183] mount_fs+0x37/0x150
[ 846.822188] vfs_kern_
[ 846.822189] do_mount+
[ 846.822190] ? copy_mount_
[ 846.822192] SyS_mount+0x98/0xe0
[ 846.822195] do_syscall_
[ 846.822196] entry_SYSCALL_
[ 846.822198] RIP: 0033:0x4dbd5a
[ 846.822199] RSP: 002b:000000c428
[ 846.822201] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004dbd5a
[ 846.822203] RDX: 000000c421a04b7c RSI: 000000c426f94f40 RDI: 000000c4274ceaa0
[ 846.822205] RBP: 000000c428be6930 R08: 000000c425521a90 R09: 0000000000000000
[ 846.822206] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
[ 846.822206] R13: 000000000000003e R14: 000000000000003d R15: 0000000000000080
[ 846.822239] INFO: task txg_sync:4202 blocked for more than 120 seconds.
[ 846.828882] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 846.835692] "echo 0 > /proc/sys/
[ 846.843549] txg_sync D 0 4202 2 0x80000000
[ 846.843551] Call Trace:
[ 846.843554] __schedule+
[ 846.843560] schedule+0x2c/0x80
[ 846.843566] cv_wait_
[ 846.843570] ? wait_woken+
[ 846.843574] __cv_wait+0x15/0x20 [spl]
[ 846.843603] rrw_enter_
[ 846.843629] rrw_enter+0x13/0x20 [zfs]
[ 846.843654] spa_sync+
[ 846.843682] txg_sync_
[ 846.843708] ? txg_quiesce_
[ 846.843713] thread_
[ 846.843717] kthread+0x121/0x140
[ 846.843720] ? __thread_
[ 846.843721] ? kthread_
[ 846.843723] ret_from_
[ 846.843931] INFO: task lxd:99919 blocked for more than 120 seconds.
[ 846.850227] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 846.857040] "echo 0 > /proc/sys/
[ 846.864892] lxd D 0 99919 99626 0x00000000
[ 846.864894] Call Trace:
[ 846.864897] __schedule+
[ 846.864903] ? __wake_
[ 846.864906] schedule+0x2c/0x80
[ 846.864910] cv_wait_
[ 846.864912] ? wait_woken+
[ 846.864917] __cv_wait+0x15/0x20 [spl]
[ 846.864942] txg_wait_
[ 846.864971] zil_create+
[ 846.864998] zil_commit_
[ 846.865023] zil_commit.
[ 846.865051] zil_commit+
[ 846.865080] zfs_sync+0x6e/0xb0 [zfs]
[ 846.865107] zpl_sync_
[ 846.865111] __sync_
[ 846.865113] sync_filesystem
[ 846.865114] generic_
[ 846.865116] kill_anon_
[ 846.865138] zpl_kill_
[ 846.865140] deactivate_
[ 846.865143] deactivate_
[ 846.865145] cleanup_
[ 846.865147] __cleanup_
[ 846.865148] task_work_
[ 846.865150] exit_to_
[ 846.865152] do_syscall_
[ 846.865153] entry_SYSCALL_
[ 846.865154] RIP: 0033:0x7fb553b3e8c7
[ 846.865155] RSP: 002b:00007fff42
[ 846.865156] RAX: 0000000000000000 RBX: 000000000000000f RCX: 00007fb553b3e8c7
[ 846.865158] RDX: 00007fb55476eb9f RSI: 0000000000000002 RDI: 00007fb554770b6b
[ 846.865159] RBP: 000000000000000c R08: 0000000000000000 R09: 00007fb553b8ae67
[ 846.865162] R10: 0000000000084000 R11: 0000000000000246 R12: 00007fff426c64a0
[ 846.865162] R13: 0000000003176690 R14: 0000000003177810 R15: 00000000031741f0
[ 846.865167] INFO: task zfs:100513 blocked for more than 120 seconds.
[ 846.871546] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 846.878357] "echo 0 > /proc/sys/
[ 846.886214] zfs D 0 100513 2626 0x00000000
[ 846.886215] Call Trace:
[ 846.886218] __schedule+
[ 846.886223] schedule+0x2c/0x80
[ 846.886230] cv_wait_
[ 846.886232] ? wait_woken+
[ 846.886237] __cv_wait+0x15/0x20 [spl]
[ 846.886261] rrw_enter_
[ 846.886284] rrw_enter+0x1c/0x20 [zfs]
[ 846.886309] dsl_pool_
[ 846.886331] dmu_objset_
[ 846.886357] zfs_ioc_
[ 846.886383] zfsdev_
[ 846.886388] do_vfs_
[ 846.886391] ? handle_
[ 846.886394] ? __do_page_
[ 846.886396] SyS_ioctl+0x79/0x90
[ 846.886397] do_syscall_
[ 846.886399] entry_SYSCALL_
[ 846.886400] RIP: 0033:0x7fd4adc795d7
[ 846.886402] RSP: 002b:00007ffe35
[ 846.886405] RAX: ffffffffffffffda RBX: 00007ffe356b8740 RCX: 00007fd4adc795d7
[ 846.886406] RDX: 00007ffe356b8740 RSI: 0000000000005a12 RDI: 0000000000000003
[ 846.886406] RBP: 000055632e278660 R08: 000000000000ffff R09: 00007fd4adcd1ed0
[ 846.886407] R10: 2f746c7561666564 R11: 0000000000000246 R12: 000055632e278660
[ 846.886408] R13: 00007ffe356beec0 R14: 00007fd4af1756e0 R15: 00007ffe356bbe30
[ 846.886410] INFO: task zfs:100584 blocked for more than 120 seconds.
[ 846.892790] Tainted: P O 4.15.0-20-generic #21-Ubuntu
[ 846.899598] "echo 0 > /proc/sys/
[ 846.907454] zfs D 0 100584 15513 0x00000000
[ 846.907456] Call Trace:
[ 846.907459] __schedule+
[ 846.907465] schedule+0x2c/0x80
[ 846.907470] cv_wait_
[ 846.907471] ? wait_woken+
[ 846.907477] __cv_wait+0x15/0x20 [spl]
[ 846.907499] rrw_enter_
[ 846.907523] rrw_enter+0x1c/0x20 [zfs]
[ 846.907547] dsl_pool_
[ 846.907568] dmu_objset_
[ 846.907594] zfs_ioc_
[ 846.907618] zfsdev_
[ 846.907622] do_vfs_
[ 846.907624] ? handle_
[ 846.907625] ? __do_page_
[ 846.907627] SyS_ioctl+0x79/0x90
[ 846.907628] do_syscall_
[ 846.907630] entry_SYSCALL_
[ 846.907631] RIP: 0033:0x7fc2734075d7
[ 846.907631] RSP: 002b:00007fff65
[ 846.907633] RAX: ffffffffffffffda RBX: 00007fff653a4b30 RCX: 00007fc2734075d7
[ 846.907633] RDX: 00007fff653a4b30 RSI: 0000000000005a12 RDI: 0000000000000003
[ 846.907634] RBP: 000055f3576e9660 R08: 000000000000ffff R09: 00007fc27345fed0
[ 846.907635] R10: 2f746c7561666564 R11: 0000000000000246 R12: 000055f3576e9660
[ 846.907637] R13: 00007fff653aaec0 R14: 00007fc2749036e0 R15: 00007fff653a8220
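The dmesg output above repeats the same five tasks at the 725 second and 846 second marks. A short script can make that visible by counting the hung-task reports per task. This is a sketch, not part of the original report; the regular expression is written against the message format shown above, and the sample lines are copied from this log.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 725.970508] INFO: task lxd:4455 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(dmesg_text: str) -> Counter:
    """Count hung-task reports per (command, pid) pair."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match["comm"], int(match["pid"]))] += 1
    return counts


SAMPLE = """\
[ 725.970508] INFO: task lxd:4455 blocked for more than 120 seconds.
[ 725.991624] INFO: task txg_sync:4202 blocked for more than 120 seconds.
[ 726.034560] INFO: task zfs:100513 blocked for more than 120 seconds.
[ 846.801124] INFO: task lxd:4455 blocked for more than 120 seconds.
"""

for (comm, pid), n in sorted(blocked_tasks(SAMPLE).items()):
    print(f"{comm}:{pid} reported {n} time(s)")
```

A task that is reported again after a further 120 seconds, as lxd:4455 is here, has not made progress in between, which is consistent with a deadlock rather than a slow operation.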
Contents of "/proc/
13 1 0x01 96 4608 11300672527 1808059980062
name type data
hits 4 44186496
misses 4 1247761
demand_data_hits 4 3327097
demand_data_misses 4 17953
demand_
demand_
prefetch_data_hits 4 1357
prefetch_
prefetch_
prefetch_
mru_hits 4 18193851
mru_ghost_hits 4 0
mfu_hits 4 24976976
mfu_ghost_hits 4 0
deleted 4 10
mutex_miss 4 0
access_skip 4 68
evict_skip 4 1
evict_not_enough 4 0
evict_l2_cached 4 0
evict_l2_eligible 4 101376
evict_l2_ineligible 4 2048
evict_l2_skip 4 0
hash_elements 4 38683
hash_elements_max 4 57741
hash_collisions 4 1520
hash_chains 4 5
hash_chain_max 4 1
p 4 16815604736
c 4 33631209472
c_min 4 2101950592
c_max 4 33631209472
size 4 833199872
compressed_size 4 286609408
uncompressed_size 4 831992320
overhead_size 4 330568192
hdr_size 4 14660144
data_size 4 193579520
metadata_size 4 423598080
dbuf_size 4 46065424
dnode_size 4 112384064
bonus_size 4 42912640
anon_size 4 36502016
anon_evictable_data 4 0
anon_evictable_
mru_size 4 310932480
mru_evictable_data 4 27623424
mru_evictable_
mru_ghost_size 4 0
mru_ghost_
mru_ghost_
mfu_size 4 269743104
mfu_evictable_data 4 53612032
mfu_evictable_
mfu_ghost_size 4 0
mfu_ghost_
mfu_ghost_
l2_hits 4 0
l2_misses 4 0
l2_feeds 4 0
l2_rw_clash 4 0
l2_read_bytes 4 0
l2_write_bytes 4 0
l2_writes_sent 4 0
l2_writes_done 4 0
l2_writes_error 4 0
l2_writes_
l2_evict_lock_retry 4 0
l2_evict_reading 4 0
l2_evict_l1cached 4 0
l2_free_on_write 4 0
l2_abort_lowmem 4 0
l2_cksum_bad 4 0
l2_io_error 4 0
l2_size 4 0
l2_asize 4 0
l2_hdr_size 4 0
memory_
memory_direct_count 4 0
memory_
memory_all_bytes 4 67262418944
memory_free_bytes 4 57157578752
memory_
arc_no_grow 4 0
arc_tempreserve 4 0
arc_loaned_bytes 4 0
arc_prune 4 0
arc_meta_used 4 639620352
arc_meta_limit 4 25223407104
arc_dnode_limit 4 2522340710
arc_meta_max 4 965175896
arc_meta_min 4 16777216
sync_wait_for_async 4 168
demand_
arc_need_free 4 0
arc_sys_free 4 1050975296
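The counters above are in a "name type data" layout, so they can be turned into a dictionary and summarised. The sketch below is an illustration only: the function names are made up, the input is passed in as text rather than read from a fixed path, and rows that were cut off in this report are simply skipped.

```python
def parse_kstat(text: str) -> dict[str, int]:
    """Parse 'name type data' rows into a name -> value mapping."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three fields with a numeric value.
        # The two header rows and any truncated rows do not, so they are skipped.
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def hit_ratio(stats: dict[str, int]) -> float:
    """Fraction of ARC lookups that were hits."""
    total = stats["hits"] + stats["misses"]
    return stats["hits"] / total if total else 0.0


SAMPLE = """\
13 1 0x01 96 4608 11300672527 1808059980062
name type data
hits 4 44186496
misses 4 1247761
demand_
size 4 833199872
c_max 4 33631209472
"""

stats = parse_kstat(SAMPLE)
print(f"ARC hit ratio: {hit_ratio(stats):.1%}")
print(f"ARC size / c_max: {stats['size'] / stats['c_max']:.1%}")
```

With the values from this report the hit ratio is roughly 97% and the ARC is far below its maximum size, which suggests the ARC itself was not under pressure when the tasks blocked.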
Command "slabtop -o"
Active / Total Objects (% used) : 28354235 / 29140626 (97.3%)
Active / Total Slabs (% used) : 382017 / 382017 (100.0%)
Active / Total Caches (% used) : 95 / 128 (74.2%)
Active / Total Size (% used) : 4580408.56K / 4743868.00K (96.6%)
Minimum / Average / Maximum Object : 0.01K / 0.16K / 21.81K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
17206144 17204917 0% 0.03K 134423 128 537692K kmalloc-32
1141392 966790 0% 0.19K 27176 42 217408K dentry
1114496 1004537 0% 0.06K 17414 64 69656K kmalloc-64
1088192 1078186 0% 0.50K 17003 64 544096K kmalloc-512
1038300 714247 0% 0.13K 17305 60 138440K kernfs_node_cache
937536 931296 0% 0.25K 14649 64 234384K filp
684160 682244 0% 0.06K 10690 64 42760K pid
582099 569877 0% 0.59K 10983 53 351456K inode_cache
520104 518529 0% 0.20K 13336 39 106688K vm_area_struct
404334 388371 0% 0.09K 9627 42 38508K kmalloc-96
342286 341596 0% 0.09K 7441 46 29764K anon_vma
342016 338948 0% 0.25K 5344 64 85504K kmalloc-256
277032 276822 0% 0.19K 6596 42 52768K cred_jar
248352 241634 0% 0.66K 5174 48 165568K proc_inode_cache
248320 233052 0% 0.01K 485 512 1940K kmalloc-8
214984 143177 0% 0.57K 3839 56 122848K radix_tree_node
Colin Ian King (colin-king) wrote : | #1 |
Changed in zfs-linux (Ubuntu): | |
status: | New → In Progress |
Changed in linux (Ubuntu): | |
status: | New → In Progress |
importance: | Undecided → High |
Changed in zfs-linux (Ubuntu): | |
importance: | Undecided → High |
Changed in linux (Ubuntu): | |
assignee: | nobody → Colin Ian King (colin-king) |
Changed in zfs-linux (Ubuntu): | |
assignee: | nobody → Colin Ian King (colin-king) |
Changed in linux: | |
status: | Unknown → Fix Released |
description: | updated |
Launchpad Janitor (janitor) wrote : | #2 |
This bug was fixed in the package zfs-linux - 0.7.9-3ubuntu4
---------------
zfs-linux (0.7.9-3ubuntu4) cosmic; urgency=medium
* Fix zpl_mount() deadlock (LP: #1781364)
- Upstream ZFS fix ac09630d8b0b ("Fix zpl_mount() deadlock")
fixes deadlock on multiple parallelized mount/umounts
-- Colin Ian King <email address hidden> Thu, 12 Jul 2018 09:18:24 +0100
Changed in zfs-linux (Ubuntu Cosmic): | |
status: | In Progress → Fix Released |
Changed in linux (Ubuntu Cosmic): | |
status: | In Progress → Fix Committed |
description: | updated |
Launchpad Janitor (janitor) wrote : | #3 |
Status changed to 'Confirmed' because the bug affects multiple users.
Changed in linux (Ubuntu Bionic): | |
status: | New → Confirmed |
Changed in linux (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in zfs-linux (Ubuntu Bionic): | |
status: | New → Confirmed |
Changed in zfs-linux (Ubuntu Xenial): | |
status: | New → Confirmed |
Hello Colin, or anyone else affected,
Accepted zfs-linux into bionic-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Changed in zfs-linux (Ubuntu Bionic): | |
status: | Confirmed → Fix Committed |
Colin Ian King (colin-king) wrote : | #8 |
This fix will only work once it lands in the updated kernel as well as the user space packages, so please test once the updated kernel is also in -proposed.
Vasiliy (vvershkov) wrote : | #9 |
My experiments were here: https:/
I installed a fresh 18.04 + zfs + lxd - zfs hangs (well, it would be awkward if it did not)
+zfsutils-
+kernel=4.15.0.28 (all packages) - still hangs :( Am I missing something?
+everything_
+upgrade+
So this bug isn't fixed for me... And it is strange, because I used to install zfs 0.7.9 from source with the commit that fixed the bug (https:/
Colin Ian King (colin-king) wrote : | #10 |
Vasiliy, please refer to comment #8
Colin Ian King (colin-king) wrote : | #11 |
The kernel driver fix will land in the next -proposed kernel as the Ubuntu ZFS driver comes bundled with the kernel.
If you build zfs from source, then that will build the kernel driver as a DKMS module with the fix in it and *that* will work.
One needs both the zfs userspace and the kernel for the entire bug fix.
Vasiliy (vvershkov) wrote : | #12 |
Ah, I thought that the kernel (module) in proposed was already patched. I understood comment #8 as "don't forget to upgrade the kernel from proposed!", while it means "wait until the fixed kernel is in proposed".
Now it makes sense, thank you.
Simos Xenitellis (simosx) wrote : | #13 |
zfsutils-linux (zfs-linux, zfs-linux_
https:/
Please report when linux-image gets into -proposed,
https:/
Vasiliy (vvershkov) wrote : | #14 |
It seems like 4.15.0-29 (uploaded right now to proposed) is not the kernel we are looking for.
Simos Xenitellis (simosx) wrote : | #15 |
@Vasiliy: Indeed.
The version in -proposed is "4.15.0-29.31" (source: https:/
The page for that version at https:/
does not have a reference to this bug number #1781364.
Colin Ian King (colin-king) wrote : | #16 |
The bug will be automatically updated when the -proposed kernel containing the fix is ready, please wait for that message.
Changed in linux (Ubuntu Bionic): | |
status: | Confirmed → Fix Committed |
status: | Fix Committed → Confirmed |
Changed in linux (Ubuntu Xenial): | |
status: | Confirmed → Fix Committed |
Changed in linux (Ubuntu Bionic): | |
status: | Confirmed → Fix Committed |
Changed in linux (Ubuntu Xenial): | |
assignee: | nobody → tenox (senseimyijaki) |
Changed in linux (Ubuntu Bionic): | |
assignee: | nobody → tenox (senseimyijaki) |
Launchpad Janitor (janitor) wrote : | #17 |
This bug was fixed in the package linux - 4.17.0-6.7
---------------
linux (4.17.0-6.7) cosmic; urgency=medium
* linux: 4.17.0-6.7 -proposed tracker (LP: #1783396)
* [Regression] EXT4-fs error (device sda2): ext4_validate_
comm stress-ng: bg 4705: bad block bitmap checksum (LP: #1781709)
- SAUCE: Revert "UBUNTU: SAUCE: ext4: fix ext4_validate_
stress-ng: Corrupt inode bitmap"
- SAUCE: ext4: check for allocation block validity with block group locked
* Cosmic update to 4.17.9 stable release (LP: #1783201)
- userfaultfd: hugetlbfs: fix userfaultfd_
- mm: hugetlb: yield when prepping struct pages
- mm: teach dump_page() to correctly output poisoned struct pages
- PCI / ACPI / PM: Resume bridges w/o drivers on suspend-to-RAM
- ACPICA: Drop leading newlines from error messages
- ACPI / battery: Safe unregistering of hooks
- drm/amdgpu: Make struct amdgpu_atif private to amdgpu_acpi.c
- tracing: Avoid string overflow
- tracing: Fix missing return symbol in function_graph output
- scsi: sg: mitigate read/write abuse
- scsi: aacraid: Fix PD performance regression over incorrect qd being set
- scsi: target: Fix truncated PR-in ReadKeys response
- s390: Correct register corruption in critical section cleanup
- drbd: fix access after free
- vfio: Use get_user_
- ARM: dts: imx51-zii-rdu1: fix touchscreen pinctrl
- ARM: dts: omap3: Fix am3517 mdio and emac clock references
- ARM: dts: dra7: Disable metastability workaround for USB2
- cifs: Fix use after free of a mid_q_entry
- cifs: Fix memory leak in smb2_set_ea()
- cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
- cifs: Fix infinite loop when using hard mount option
- drm: Use kvzalloc for allocating blob property memory
- drm/udl: fix display corruption of the last line
- drm/amdgpu: Add amdgpu_
- drm/amdgpu: Dynamically probe for ATIF handle (v2)
- jbd2: don't mark block as modified if the handle is out of credits
- ext4: add corruption check in ext4_xattr_
- ext4: always verify the magic number in xattr blocks
- ext4: make sure bitmaps and the inode table don't overlap with bg
descriptors
- ext4: always check block group bounds in ext4_init_
- ext4: only look at the bg_flags field if it is valid
- ext4: verify the depth of extent tree in ext4_find_extent()
- ext4: include the illegal physical block in the bad map ext4_error msg
- ext4: clear i_data in ext4_inode_info when removing inline data
- ext4: never move the system.data xattr out of the inode body
- ext4: avoid running out of journal credits when appending to an inline file
- ext4: add more inode number paranoia checks
- ext4: add more mount time checks of the superblock
- ext4: check superblock mapped prior to committing
- HID: i2c-hid: Fix "incomplete report" noise
- HID: hiddev: fix potential Spectre v1
- HID: debug: check length before copy_to_user()
- HID: core: allow concurrent registr...
Changed in linux (Ubuntu Cosmic): | |
status: | Fix Committed → Fix Released |
Brad Figg (brad-figg) wrote : | #18 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-xenial |
Brad Figg (brad-figg) wrote : | #19 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-bionic |
Vasiliy (vvershkov) wrote : | #20 |
I can confirm now that fix works for me.
Does anybody know when this kernel and the zfsutils package will move from proposed to updates?
Colin Ian King (colin-king) wrote : | #21 |
Verification passed for Ubuntu Bionic using the reproducer described in comment #1. Marking as verified.
tags: |
added: verification-done-bionic removed: verification-needed-bionic |
Colin Ian King (colin-king) wrote : | #22 |
Verification passed for Ubuntu Xenial using the reproducer described in comment #1. Marking as verified.
tags: |
added: verification-done-xenial removed: verification-needed-xenial |
Colin Ian King (colin-king) wrote : | #23 |
@Vasiliy, hopefully by early next week.
The verification of the Stable Release Update for zfs-linux has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #25 |
This bug was fixed in the package zfs-linux - 0.7.5-1ubuntu16.3
---------------
zfs-linux (0.7.5-1ubuntu16.3) bionic; urgency=medium
* Fix zpl_mount() deadlock (LP: #1781364)
- Upstream ZFS fix ac09630d8b0b ("Fix zpl_mount() deadlock")
fixes deadlock on multiple parallelized mount/umounts
-- Colin Ian King <email address hidden> Thu, 12 Jul 2018 09:18:24 +0100
Changed in zfs-linux (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Sam Van den Eynde (samvde) wrote : | #26 |
I don't see the confirmation for Bionic in this bug report. Any update when the 4.17 kernel lands in bionic-proposed? Or do I need another kernel version for Bionic? What do I need exactly for my Bionic server?
This bug prevents me from updating my lxd containers, it will hang the system consistently.
Colin Ian King (colin-king) wrote : | #27 |
The bug will be fixed once both the zfs package and the bionic kernel (that contains the zfs driver changes) have been released. So far, just the zfs package has been released and we are waiting for the kernel to complete the SRU update and verification phase - this takes a bit longer as the kernel contains many more changes and we have to do more exhaustive testing.
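Since the fix is only complete with a fixed kernel, a quick way to check a machine is to compare its running kernel release against the versions named in the Launchpad Janitor messages in this report (linux 4.17.0-6.7 for Cosmic, 4.15.0-33.36 for Bionic, 4.4.0-134.160 for Xenial). The sketch below is a rough illustration with hypothetical helper names. It assumes release strings of the form printed by uname -r, such as "4.15.0-20-generic", and it does not handle backported (HWE) kernels or ZFS modules built separately via DKMS.

```python
import re
from typing import Optional

# First kernel ABI per series that this report lists as containing the fix.
FIXED_KERNELS = {
    (4, 4): (4, 4, 0, 134),    # Xenial: linux 4.4.0-134.160
    (4, 15): (4, 15, 0, 33),   # Bionic: linux 4.15.0-33.36
    (4, 17): (4, 17, 0, 6),    # Cosmic: linux 4.17.0-6.7
}


def parse_release(release: str) -> tuple:
    """Turn '4.15.0-20-generic' into (4, 15, 0, 20)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(group) for group in match.groups())


def kernel_has_fix(release: str) -> Optional[bool]:
    """True/False for the kernel series named in this report, else None."""
    version = parse_release(release)
    minimum = FIXED_KERNELS.get(version[:2])
    if minimum is None:
        return None  # Series not covered by this bug report.
    return version >= minimum


# The kernel from the original dmesg output predates the fix.
print(kernel_has_fix("4.15.0-20-generic"))
```

On a live system the release string could be taken from platform.release(). A result of None means only that this report says nothing about that kernel series.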
Launchpad Janitor (janitor) wrote : | #28 |
This bug was fixed in the package linux - 4.15.0-33.36
---------------
linux (4.15.0-33.36) bionic; urgency=medium
* linux: 4.15.0-33.36 -proposed tracker (LP: #1787149)
* RTNL assertion failure on ipvlan (LP: #1776927)
- ipvlan: drop ipv6 dependency
- ipvlan: use per device spinlock to protect addrs list updates
- SAUCE: fix warning from "ipvlan: drop ipv6 dependency"
* ubuntu_bpf_jit test failed on Bionic s390x systems (LP: #1753941)
- test_bpf: flag tests that cannot be jited on s390
* HDMI/DP audio can't work on the laptop of Dell Latitude 5495 (LP: #1782689)
- drm/nouveau: fix nouveau_
- drm/radeon: fix radeon_
- drm/amdgpu: fix amdgpu_
- platform/x86: apple-gmux: fix gmux_get_
- ALSA: hda: use PCI_BASE_
- vga_switcheroo: set audio client id according to bound GPU id
* locking sockets broken due to missing AppArmor socket mediation patches
(LP: #1780227)
- UBUNTU SAUCE: apparmor: fix apparmor mediating locking non-fs, unix sockets
* Update2 for ocxl driver (LP: #1781436)
- ocxl: Fix page fault handler in case of fault on dying process
* netns: unable to follow an interface that moves to another netns
(LP: #1774225)
- net: core: Expose number of link up/down transitions
- dev: always advertise the new nsid when the netns iface changes
- dev: advertise the new ifindex when the netns iface changes
* [Bionic] Disk IO hangs when using BFQ as io scheduler (LP: #1780066)
- block, bfq: fix occurrences of request finish method's old name
- block, bfq: remove batches of confusing ifdefs
- block, bfq: add requeue-request hook
* HP ProBook 455 G5 needs mute-led-gpio fixup (LP: #1781763)
- ALSA: hda: add mute led support for HP ProBook 455 G5
* [Bionic] bug fixes to improve stability of the ThunderX2 i2c driver
(LP: #1781476)
- i2c: xlp9xx: Fix issue seen when updating receive length
- i2c: xlp9xx: Make sure the transfer size is not more than
I2C_
* x86/kvm: fix LAPIC timer drift when guest uses periodic mode (LP: #1778486)
- x86/kvm: fix LAPIC timer drift when guest uses periodic mode
* Please include ax88179_178a and r8152 modules in d-i udeb (LP: #1771823)
- [Config:] d-i: Add ax88179_178a and r8152 to nic-modules
* Nvidia fails after switching its mode (LP: #1778658)
- PCI: Restore config space on runtime resume despite being unbound
* Kernel error "task zfs:pid blocked for more than 120 seconds" (LP: #1781364)
- SAUCE: (noup) zfs to 0.7.5-1ubuntu16.3
* CVE-2018-12232
- PATCH 1/1] socket: close race condition between sock_close() and
sockfs_
* CVE-2018-10323
- xfs: set format back to extents if xfs_bmap_
* change front mic location for more lenovo m7/8/9xx machines (LP: #1781316)
- ALSA: hda/realtek - Fix the problem of two front mics on more machines
- ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
* Cephfs + fscache: unab...
Changed in linux (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #29 |
This bug was fixed in the package linux - 4.4.0-134.160
---------------
linux (4.4.0-134.160) xenial; urgency=medium
* linux: 4.4.0-134.160 -proposed tracker (LP: #1787177)
* locking sockets broken due to missing AppArmor socket mediation patches
(LP: #1780227)
- UBUNTU SAUCE: apparmor: fix apparmor mediating locking non-fs, unix sockets
* Backport namespaced fscaps to xenial 4.4 (LP: #1778286)
- Introduce v3 namespaced file capabilities
- commoncap: move assignment of fs_ns to avoid null pointer dereference
- capabilities: fix buffer overread on very short xattr
- commoncap: Handle memory allocation failure.
* Xenial update to 4.4.140 stable release (LP: #1784409)
- usb: cdc_acm: Add quirk for Uniden UBC125 scanner
- USB: serial: cp210x: add CESINEL device ids
- USB: serial: cp210x: add Silicon Labs IDs for Windows Update
- n_tty: Fix stall at n_tty_receive_
- staging: android: ion: Return an ERR_PTR in ion_map_kernel
- n_tty: Access echo_* variables carefully.
- x86/boot: Fix early command-line parsing when matching at end
- ath10k: fix rfc1042 header retrieval in QCA4019 with eth decap mode
- i2c: rcar: fix resume by always initializing registers before transfer
- ipv4: Fix error return value in fib_convert_
- kprobes/x86: Do not modify singlestep buffer while resuming
- nvme-pci: initialize queue memory before interrupts
- netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain()
- ARM: dts: imx6q: Use correct SDMA script for SPI5 core
- ubi: fastmap: Correctly handle interrupted erasures in EBA
- mm: hugetlb: yield when prepping struct pages
- tracing: Fix missing return symbol in function_graph output
- scsi: sg: mitigate read/write abuse
- s390: Correct register corruption in critical section cleanup
- drbd: fix access after free
- cifs: Fix infinite loop when using hard mount option
- jbd2: don't mark block as modified if the handle is out of credits
- ext4: make sure bitmaps and the inode table don't overlap with bg
descriptors
- ext4: always check block group bounds in ext4_init_
- ext4: only look at the bg_flags field if it is valid
- ext4: verify the depth of extent tree in ext4_find_extent()
- ext4: include the illegal physical block in the bad map ext4_error msg
- ext4: clear i_data in ext4_inode_info when removing inline data
- ext4: add more inode number paranoia checks
- ext4: add more mount time checks of the superblock
- ext4: check superblock mapped prior to committing
- HID: i2c-hid: Fix "incomplete report" noise
- HID: hiddev: fix potential Spectre v1
- HID: debug: check length before copy_to_user()
- x86/mce: Detect local MCEs properly
- x86/mce: Fix incorrect "Machine check from unknown source" message
- media: cx25840: Use subdev host data for PLL override
- mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
- dm bufio: avoid sleeping while holding the dm_bufio lock
- dm bufio: drop the lock when doing GFP_NOIO allocation
- mtd: rawnand: mxc: set spa...
Changed in linux (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
tags: |
added: verification-required-xenial removed: verification-done-xenial |
Colin Ian King (colin-king) wrote : | #30 |
Colin Ian King (colin-king) wrote : | #31 |
Wrong URL, ignore that.
Colin Ian King (colin-king) wrote : | #32 |
Hello Colin, or anyone else affected,
Accepted zfs-linux into xenial-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in zfs-linux (Ubuntu Xenial): | |
status: | Confirmed → Fix Committed |
Colin Ian King (colin-king) wrote : | #34 |
Tested 0.6.5.6-0ubuntu25 and it works without any issues, so marking this as verified.
tags: |
added: verification-done-xenial removed: verification-required-xenial |
Changed in linux (Ubuntu Xenial): | |
assignee: | tenox (senseimyijaki) → Colin Ian King (colin-king) |
Changed in linux (Ubuntu Bionic): | |
assignee: | tenox (senseimyijaki) → Colin Ian King (colin-king) |
Changed in zfs-linux (Ubuntu Bionic): | |
assignee: | nobody → Colin Ian King (colin-king) |
Changed in zfs-linux (Ubuntu Xenial): | |
assignee: | nobody → Colin Ian King (colin-king) |
Changed in zfs-linux (Ubuntu Bionic): | |
importance: | Undecided → High |
Changed in zfs-linux (Ubuntu Xenial): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Xenial): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Bionic): | |
importance: | Undecided → High |
Launchpad Janitor (janitor) wrote : | #35 |
This bug was fixed in the package zfs-linux - 0.6.5.6-0ubuntu25
---------------
zfs-linux (0.6.5.6-0ubuntu25) xenial; urgency=medium
* Fix zpl_mount() deadlock (LP: #1781364)
- Upstream ZFS fix ac09630d8b0b ("Fix zpl_mount() deadlock")
fixes deadlock on multiple parallelized mount/umounts
-- Colin Ian King <email address hidden> Thu, 12 Jul 2018 09:18:24 +0100
Changed in zfs-linux (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
tags: | added: cscc |
Upstream ZFS fix:
commit ac09630d8b0bf6c92084a30fdaefd03fd0adbdc1
Author: Brian Behlendorf <email address hidden>
Date: Wed Jul 11 15:49:10 2018 -0700
Fix zpl_mount() deadlock
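
When verifying the fix, it can help to check a kernel log for the hung-task messages quoted in this bug. The following is a minimal sketch, not part of the original report: it assumes the message format shown in the bug description ("INFO: task NAME:PID blocked for more than N seconds"), and the function name find_hung_tasks is hypothetical.

```python
import re

# Matches hung-task lines in the format quoted in the bug description, e.g.
# [  725.970508] INFO: task lxd:4455 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task report in log_text.

    find_hung_tasks is a hypothetical helper for illustration only.
    """
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]
```

Passing the captured text of the kernel log (for example, saved dmesg output) to find_hung_tasks returns an empty list when no task was reported as blocked, which is the expected result after running the lxd-benchmark reproducer on a fixed package.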