Comment 4 for bug 1931660

Revision history for this message
mhosken (martin-hosken) wrote :

Downgrading these libraries did not fix it for me.

One canary is the state of the updatedb.mlocate process, which goes into a live hang when it hits the currently faulty file. For example:

   2728 ? DNs 0:01 /usr/bin/updatedb.mlocate

Using lsof one can find out which file is faulty, but if you try to ls the directory containing that file, your terminal live hangs. And if you try to delete the directory containing the faulty file, the deletion silently fails, leaving the directory in place and the live hang waiting for you should you try to read it again.
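For anyone wanting to repeat the diagnosis, this is roughly the sequence I used. A sketch only: the PID 2728 is from my machine, so substitute whatever ps reports on yours.

```shell
# Find processes stuck in uninterruptible sleep (state D) -- a hung
# updatedb.mlocate showing up here is the canary described above.
ps -eo pid,stat,comm | awk '$2 ~ /^D/'

# Then inspect the open files of the stuck process to see which file
# it was touching when it hung (2728 is the PID from my machine):
#   sudo lsof -p 2728
```

Note that lsof itself stays usable because it reads /proc rather than touching the faulty file, whereas ls on the directory issues the lstat() that hangs.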

It's said that this bug relates to bug 1906476 and, via that, to https://github.com/openzfs/zfs/issues/11474, but I think I have that one fixed with zfs_recover=1. Still, the syslog message is very similar:

Aug 11 10:13:45 silmh9 kernel: [ 242.531205] INFO: task updatedb.mlocat:2728 blocked for more than 120 seconds.
Aug 11 10:13:45 silmh9 kernel: [ 242.531211] Tainted: P U O 5.13.0-13-generic #13-Ubuntu
Aug 11 10:13:45 silmh9 kernel: [ 242.531212] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 11 10:13:45 silmh9 kernel: [ 242.531213] task:updatedb.mlocat state:D stack: 0 pid: 2728 ppid: 1 flags:0x00004220
Aug 11 10:13:45 silmh9 kernel: [ 242.531216] Call Trace:
Aug 11 10:13:45 silmh9 kernel: [ 242.531221] __schedule+0x268/0x680
Aug 11 10:13:45 silmh9 kernel: [ 242.531225] ? arch_local_irq_enable+0xb/0xd
Aug 11 10:13:45 silmh9 kernel: [ 242.531228] schedule+0x4f/0xc0
Aug 11 10:13:45 silmh9 kernel: [ 242.531231] spl_panic+0xfa/0xfc [spl]
Aug 11 10:13:45 silmh9 kernel: [ 242.531238] ? queued_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531287] ? do_raw_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531313] ? __raw_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531339] ? dmu_buf_replace_user+0x65/0x80 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531367] ? dmu_buf_set_user+0x13/0x20 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531393] ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531420] zfs_znode_sa_init+0xd9/0xe0 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531479] zfs_znode_alloc+0x101/0x560 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531536] ? dmu_buf_unlock_parent+0x5d/0x90 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531564] ? do_raw_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531590] ? dbuf_read_impl.constprop.0+0x316/0x3e0 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531616] ? __cond_resched+0x1a/0x50
Aug 11 10:13:45 silmh9 kernel: [ 242.531618] ? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
Aug 11 10:13:45 silmh9 kernel: [ 242.531620] ? queued_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531642] ? __cond_resched+0x1a/0x50
Aug 11 10:13:45 silmh9 kernel: [ 242.531644] ? down_read+0x13/0x90
Aug 11 10:13:45 silmh9 kernel: [ 242.531645] ? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
Aug 11 10:13:45 silmh9 kernel: [ 242.531646] ? queued_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531680] ? do_raw_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531712] ? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
Aug 11 10:13:45 silmh9 kernel: [ 242.531713] ? dmu_object_info_from_dnode+0x8e/0xa0 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531742] zfs_zget+0x237/0x280 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531796] zfs_dirent_lock+0x42a/0x570 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531850] zfs_dirlook+0x91/0x2a0 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531902] zfs_lookup+0x1fb/0x3f0 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.531954] zpl_lookup+0xcb/0x230 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.532007] __lookup_slow+0x84/0x150
Aug 11 10:13:45 silmh9 kernel: [ 242.532009] walk_component+0x141/0x1b0
Aug 11 10:13:45 silmh9 kernel: [ 242.532010] ? dput+0x5c/0x180
Aug 11 10:13:45 silmh9 kernel: [ 242.532012] path_lookupat+0x6e/0x1c0
Aug 11 10:13:45 silmh9 kernel: [ 242.532013] filename_lookup+0xbb/0x1c0
Aug 11 10:13:45 silmh9 kernel: [ 242.532015] ? __check_object_size.part.0+0x128/0x150
Aug 11 10:13:45 silmh9 kernel: [ 242.532017] ? __check_object_size+0x1c/0x20
Aug 11 10:13:45 silmh9 kernel: [ 242.532018] ? strncpy_from_user+0x44/0x150
Aug 11 10:13:45 silmh9 kernel: [ 242.532020] ? __raw_spin_unlock+0x9/0x10 [zfs]
Aug 11 10:13:45 silmh9 kernel: [ 242.532066] user_path_at_empty+0x59/0x90
Aug 11 10:13:45 silmh9 kernel: [ 242.532068] vfs_statx+0x7a/0x120
Aug 11 10:13:45 silmh9 kernel: [ 242.532070] __do_sys_newlstat+0x3e/0x70
Aug 11 10:13:45 silmh9 kernel: [ 242.532072] ? __secure_computing+0x42/0xe0
Aug 11 10:13:45 silmh9 kernel: [ 242.532073] ? syscall_trace_enter.constprop.0+0xa0/0x1c0
Aug 11 10:13:45 silmh9 kernel: [ 242.532075] __x64_sys_newlstat+0x16/0x20
Aug 11 10:13:45 silmh9 kernel: [ 242.532076] do_syscall_64+0x61/0xb0
Aug 11 10:13:45 silmh9 kernel: [ 242.532078] ? exit_to_user_mode_prepare+0x37/0xb0
Aug 11 10:13:45 silmh9 kernel: [ 242.532079] ? syscall_exit_to_user_mode+0x27/0x50
Aug 11 10:13:45 silmh9 kernel: [ 242.532080] ? do_syscall_64+0x6e/0xb0
Aug 11 10:13:45 silmh9 kernel: [ 242.532081] ? do_syscall_64+0x6e/0xb0
Aug 11 10:13:45 silmh9 kernel: [ 242.532086] ? do_syscall_64+0x6e/0xb0
Aug 11 10:13:45 silmh9 kernel: [ 242.532087] entry_SYSCALL_64_after_hwframe+0x44/0xae
Aug 11 10:13:45 silmh9 kernel: [ 242.532088] RIP: 0033:0x7fbfb3cd97ba
Aug 11 10:13:45 silmh9 kernel: [ 242.532090] RSP: 002b:00007ffc377eead8 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
Aug 11 10:13:45 silmh9 kernel: [ 242.532091] RAX: ffffffffffffffda RBX: 000055f3abe4af70 RCX: 00007fbfb3cd97ba
Aug 11 10:13:45 silmh9 kernel: [ 242.532092] RDX: 00007ffc377eeb50 RSI: 00007ffc377eeb50 RDI: 000055f3abe68499
Aug 11 10:13:45 silmh9 kernel: [ 242.532092] RBP: 000055f3abe68499 R08: 0000000000000001 R09: 000055f3ab979740
Aug 11 10:13:45 silmh9 kernel: [ 242.532093] R10: 000055f3ab979580 R11: 0000000000000246 R12: 00007ffc377eed20
Aug 11 10:13:45 silmh9 kernel: [ 242.532093] R13: 0000000000000000 R14: 00007ffc377eed30 R15: 0000000000000000
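For reference, this is how I have zfs_recover set. A sketch of the two usual ways to set the module parameter; paths are the standard ones for the in-kernel zfs module, check yours before applying.

```shell
# Runtime (takes effect immediately, lost on reboot):
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_recover

# Persistent, applied when the module loads:
echo "options zfs zfs_recover=1" | sudo tee /etc/modprobe.d/zfs.conf
```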