Using the latest zfs 0.8.3-1ubuntu12.4 on the latest Ubuntu 20.04.1, I observe rare ZFS panics that appear to be workload-specific and that render the server mostly unresponsive, although ssh keeps working. Attempting to reboot the server in this state makes the shutdown hang forever.
You may want to consider backporting the fix released in zfs 0.8.4 into 20.04: https://github.com/openzfs/zfs/pull/10148
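For anyone checking whether their host runs an affected build, a quick sketch (assuming the standard Ubuntu package name `zfsutils-linux` and the usual OpenZFS sysfs path; adjust for your setup):

```shell
# Print the installed zfs userland package version on Ubuntu, if present.
dpkg-query -W -f='${Version}\n' zfsutils-linux 2>/dev/null \
  || echo "zfsutils-linux not installed"

# Print the version of the currently loaded zfs kernel module, if present.
cat /sys/module/zfs/version 2>/dev/null \
  || echo "zfs module not loaded"
```

Versions below 0.8.4 (e.g. 0.8.3-1ubuntu12.4 above) would predate the upstream fix.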
Log sample of panic:

```
Nov 17 16:06:15 hostname kernel: [3385134.716024] PANIC: zfs: accessing past end of object c1c/2db52f (size=17408 access=7492+16428)
Nov 17 16:06:15 hostname kernel: [3385134.716072] Showing stack for process 3166846
Nov 17 16:06:15 hostname kernel: [3385134.716074] CPU: 25 PID: 3166846 Comm: node Tainted: P O 5.4.0-48-generic #52-Ubuntu
Nov 17 16:06:15 hostname kernel: [3385134.716075] Hardware name: <hardware>
Nov 17 16:06:15 hostname kernel: [3385134.716076] Call Trace:
Nov 17 16:06:15 hostname kernel: [3385134.716085]  dump_stack+0x6d/0x9a
Nov 17 16:06:15 hostname kernel: [3385134.716097]  spl_dumpstack+0x29/0x2b [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716102]  vcmn_err.cold+0x60/0x99 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716106]  ? _cond_resched+0x19/0x30
Nov 17 16:06:15 hostname kernel: [3385134.716108]  ? __kmalloc_node+0x20e/0x330
Nov 17 16:06:15 hostname kernel: [3385134.716113]  ? spl_kmem_alloc_impl+0xa8/0x100 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716190]  ? __list_add+0x17/0x40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716235]  zfs_panic_recover+0x6f/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716272]  ? dsl_dir_tempreserve_impl.isra.0.constprop.0+0xed/0x330 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716305]  dmu_buf_hold_array_by_dnode+0x3a0/0x490 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716338]  dmu_write_uio_dnode+0x4c/0x140 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716370]  dmu_write_uio_dbuf+0x4f/0x70 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716416]  zfs_write+0xa1f/0xd40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716419]  ? d_absolute_path+0x74/0xb0
Nov 17 16:06:15 hostname kernel: [3385134.716421]  ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716423]  ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716424]  ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716425]  ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716427]  ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716474]  zpl_write_common_iovec+0xad/0x120 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716567]  zpl_iter_write+0x56/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716570]  do_iter_write+0x84/0x1a0
Nov 17 16:06:15 hostname kernel: [3385134.716574]  ? futex_wake+0x8b/0x180
Nov 17 16:06:15 hostname kernel: [3385134.716577]  do_writev+0x71/0x120
Nov 17 16:06:15 hostname kernel: [3385134.716581]  do_syscall_64+0x57/0x190
Nov 17 16:06:15 hostname kernel: [3385134.716584] RIP: 0033:0x7fa366eee0cd
Nov 17 16:06:15 hostname kernel: [3385134.716587] RSP: 002b:00007fa35e7fbde0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
Nov 17 16:06:15 hostname kernel: [3385134.716590] RDX: 000000000000000c RSI: 000000000651c7b0 RDI: 000000000000001d
```