2020-11-17 15:45:34 |
silverwind |
bug |
|
|
added bug |
2020-11-17 15:53:40 |
silverwind |
summary |
zfs: accessing past end of object in 0.8.3-1ubuntu12.4 |
zfs PANIC: accessing past end of object in 0.8.3-1ubuntu12.4 |
|
2020-11-18 00:25:15 |
Colin Ian King |
zfs-linux (Ubuntu): status |
New |
In Progress |
|
2020-11-18 00:25:17 |
Colin Ian King |
zfs-linux (Ubuntu): importance |
Undecided |
Medium |
|
2020-11-18 00:25:19 |
Colin Ian King |
zfs-linux (Ubuntu): assignee |
|
Colin Ian King (colin-king) |
|
2020-11-27 10:48:06 |
Colin Ian King |
nominated for series |
|
Ubuntu Groovy |
|
2020-11-27 10:48:06 |
Colin Ian King |
bug task added |
|
zfs-linux (Ubuntu Groovy) |
|
2020-11-27 10:48:06 |
Colin Ian King |
nominated for series |
|
Ubuntu Hirsute |
|
2020-11-27 10:48:06 |
Colin Ian King |
bug task added |
|
zfs-linux (Ubuntu Hirsute) |
|
2020-11-27 10:48:06 |
Colin Ian King |
nominated for series |
|
Ubuntu Focal |
|
2020-11-27 10:48:06 |
Colin Ian King |
bug task added |
|
zfs-linux (Ubuntu Focal) |
|
2020-11-27 10:48:17 |
Colin Ian King |
zfs-linux (Ubuntu Hirsute): status |
In Progress |
Fix Released |
|
2020-11-27 10:48:42 |
Colin Ian King |
zfs-linux (Ubuntu Groovy): status |
New |
Fix Released |
|
2020-11-27 10:48:44 |
Colin Ian King |
zfs-linux (Ubuntu Groovy): importance |
Undecided |
Medium |
|
2020-11-27 10:48:46 |
Colin Ian King |
zfs-linux (Ubuntu Groovy): assignee |
|
Colin Ian King (colin-king) |
|
2020-11-27 10:48:49 |
Colin Ian King |
zfs-linux (Ubuntu Focal): status |
New |
In Progress |
|
2020-11-27 10:48:52 |
Colin Ian King |
zfs-linux (Ubuntu Focal): importance |
Undecided |
High |
|
2020-11-27 10:48:55 |
Colin Ian King |
zfs-linux (Ubuntu Focal): assignee |
|
Colin Ian King (colin-king) |
|
2020-11-27 11:41:41 |
Colin Ian King |
attachment added |
|
upstream patch https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1904589/+attachment/5438715/+files/0001-Bugfix-fix-uio-partial-copies.patch |
|
2020-11-27 15:31:21 |
Andrea Righi |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2020-11-27 17:09:57 |
Andrea Righi |
description |
Using the latest zfs 0.8.3-1ubuntu12.4 on the latest Ubuntu 20.04.1, I observe rare zfs panics that seem to be workload-specific and that render the server mostly unresponsive, although ssh still works. Attempting to reboot the server in this state makes the shutdown hang forever.
You may want to consider backporting the fix released in zfs 0.8.4 into 20.04: https://github.com/openzfs/zfs/pull/10148
Log sample of panic:
```
Nov 17 16:06:15 hostname kernel: [3385134.716024] PANIC: zfs: accessing past end of object c1c/2db52f (size=17408 access=7492+16428)
Nov 17 16:06:15 hostname kernel: [3385134.716072] Showing stack for process 3166846
Nov 17 16:06:15 hostname kernel: [3385134.716074] CPU: 25 PID: 3166846 Comm: node Tainted: P O 5.4.0-48-generic #52-Ubuntu
Nov 17 16:06:15 hostname kernel: [3385134.716075] Hardware name: <hardware>
Nov 17 16:06:15 hostname kernel: [3385134.716076] Call Trace:
Nov 17 16:06:15 hostname kernel: [3385134.716085] dump_stack+0x6d/0x9a
Nov 17 16:06:15 hostname kernel: [3385134.716097] spl_dumpstack+0x29/0x2b [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716102] vcmn_err.cold+0x60/0x99 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716106] ? _cond_resched+0x19/0x30
Nov 17 16:06:15 hostname kernel: [3385134.716108] ? __kmalloc_node+0x20e/0x330
Nov 17 16:06:15 hostname kernel: [3385134.716113] ? spl_kmem_alloc_impl+0xa8/0x100 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716190] ? __list_add+0x17/0x40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716235] zfs_panic_recover+0x6f/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716272] ? dsl_dir_tempreserve_impl.isra.0.constprop.0+0xed/0x330 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716305] dmu_buf_hold_array_by_dnode+0x3a0/0x490 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716338] dmu_write_uio_dnode+0x4c/0x140 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716370] dmu_write_uio_dbuf+0x4f/0x70 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716416] zfs_write+0xa1f/0xd40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716419] ? d_absolute_path+0x74/0xb0
Nov 17 16:06:15 hostname kernel: [3385134.716421] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716423] ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716424] ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716425] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716427] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716474] zpl_write_common_iovec+0xad/0x120 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716567] zpl_iter_write+0x56/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716570] do_iter_write+0x84/0x1a0
Nov 17 16:06:15 hostname kernel: [3385134.716574] ? futex_wake+0x8b/0x180
Nov 17 16:06:15 hostname kernel: [3385134.716577] do_writev+0x71/0x120
Nov 17 16:06:15 hostname kernel: [3385134.716581] do_syscall_64+0x57/0x190
Nov 17 16:06:15 hostname kernel: [3385134.716584] RIP: 0033:0x7fa366eee0cd
Nov 17 16:06:15 hostname kernel: [3385134.716587] RSP: 002b:00007fa35e7fbde0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
Nov 17 16:06:15 hostname kernel: [3385134.716590] RDX: 000000000000000c RSI: 000000000651c7b0 RDI: 000000000000001d
``` |
[Impact]
zfs_write() doesn't properly account for partial copies done by copy_from_user(), causing accesses past the end of objects and triggering kernel panics.
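As a rough illustration of the accounting problem described above, here is a toy model in plain Python. It is not the ZFS code: the `Uio` class, the `uiomove()` and `past_end()` helpers, and the `readable` and `account_partial` parameters are hypothetical names made up for this sketch. It assumes the `access=` field in the panic line reads as offset+length.

```python
from dataclasses import dataclass

EFAULT = 14  # errno value for "bad address" on Linux


@dataclass
class Uio:
    offset: int  # file offset where the next byte will be written
    resid: int   # bytes still to be transferred


def uiomove(uio: Uio, n: int, readable: int, account_partial: bool) -> int:
    """Toy copy step: only `readable` bytes of the user buffer can be read.

    Returns 0 on a full copy and EFAULT on a partial one. `readable` and
    `account_partial` are hypothetical knobs for this sketch.
    """
    copied = min(n, readable)
    if copied < n:
        if account_partial:
            # Record the bytes that did get copied before reporting failure.
            uio.offset += copied
            uio.resid -= copied
        return EFAULT
    uio.offset += n
    uio.resid -= n
    return 0


def past_end(size: int, offset: int, length: int) -> bool:
    """Bounds check in the spirit of the panic message (size, offset+length)."""
    return offset + length > size


stale = Uio(offset=0, resid=100)
fixed = Uio(offset=0, resid=100)
uiomove(stale, 100, readable=40, account_partial=False)
uiomove(fixed, 100, readable=40, account_partial=True)
print(stale)  # state still claims nothing was copied
print(fixed)  # state reflects the 40 bytes that were copied

# Values taken from the panic line in this report.
print(past_end(17408, 7492, 16428))
```

Reading the logged values as offset 7492 and length 16428, the access would end at byte 23920, which is beyond the 17408-byte object size reported in the same line.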
[Test case]
The problem seems to be workload-specific. There is no specific test case to reproduce it, but the bug appears to be well identified by the upstream commit referenced below.
[Fix]
Apply upstream commit c9e3efdb3a6111b9795becc6594b3c52ba004522 ("Bugfix/fix uio partial copies").
[Regression potential]
The upstream commit fixes potential out-of-bounds accesses by properly checking partial copies done by copy_from_user(), which prevents the kernel panics. Regression potential is minimal: applying this change seems unlikely to break anything else.
[Original bug report]
Using the latest zfs 0.8.3-1ubuntu12.4 on the latest Ubuntu 20.04.1, I observe rare zfs panics that seem to be workload-specific and that render the server mostly unresponsive, although ssh still works. Attempting to reboot the server in this state makes the shutdown hang forever.
You may want to consider backporting the fix released in zfs 0.8.4 into 20.04: https://github.com/openzfs/zfs/pull/10148
Log sample of panic:
```
Nov 17 16:06:15 hostname kernel: [3385134.716024] PANIC: zfs: accessing past end of object c1c/2db52f (size=17408 access=7492+16428)
Nov 17 16:06:15 hostname kernel: [3385134.716072] Showing stack for process 3166846
Nov 17 16:06:15 hostname kernel: [3385134.716074] CPU: 25 PID: 3166846 Comm: node Tainted: P O 5.4.0-48-generic #52-Ubuntu
Nov 17 16:06:15 hostname kernel: [3385134.716075] Hardware name: <hardware>
Nov 17 16:06:15 hostname kernel: [3385134.716076] Call Trace:
Nov 17 16:06:15 hostname kernel: [3385134.716085] dump_stack+0x6d/0x9a
Nov 17 16:06:15 hostname kernel: [3385134.716097] spl_dumpstack+0x29/0x2b [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716102] vcmn_err.cold+0x60/0x99 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716106] ? _cond_resched+0x19/0x30
Nov 17 16:06:15 hostname kernel: [3385134.716108] ? __kmalloc_node+0x20e/0x330
Nov 17 16:06:15 hostname kernel: [3385134.716113] ? spl_kmem_alloc_impl+0xa8/0x100 [spl]
Nov 17 16:06:15 hostname kernel: [3385134.716190] ? __list_add+0x17/0x40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716235] zfs_panic_recover+0x6f/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716272] ? dsl_dir_tempreserve_impl.isra.0.constprop.0+0xed/0x330 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716305] dmu_buf_hold_array_by_dnode+0x3a0/0x490 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716338] dmu_write_uio_dnode+0x4c/0x140 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716370] dmu_write_uio_dbuf+0x4f/0x70 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716416] zfs_write+0xa1f/0xd40 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716419] ? d_absolute_path+0x74/0xb0
Nov 17 16:06:15 hostname kernel: [3385134.716421] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716423] ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716424] ? __switch_to_asm+0x40/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716425] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716427] ? __switch_to_asm+0x34/0x70
Nov 17 16:06:15 hostname kernel: [3385134.716474] zpl_write_common_iovec+0xad/0x120 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716567] zpl_iter_write+0x56/0x90 [zfs]
Nov 17 16:06:15 hostname kernel: [3385134.716570] do_iter_write+0x84/0x1a0
Nov 17 16:06:15 hostname kernel: [3385134.716574] ? futex_wake+0x8b/0x180
Nov 17 16:06:15 hostname kernel: [3385134.716577] do_writev+0x71/0x120
Nov 17 16:06:15 hostname kernel: [3385134.716581] do_syscall_64+0x57/0x190
Nov 17 16:06:15 hostname kernel: [3385134.716584] RIP: 0033:0x7fa366eee0cd
Nov 17 16:06:15 hostname kernel: [3385134.716587] RSP: 002b:00007fa35e7fbde0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
Nov 17 16:06:15 hostname kernel: [3385134.716590] RDX: 000000000000000c RSI: 000000000651c7b0 RDI: 000000000000001d
``` |
|
2020-11-30 09:44:25 |
Andrea Righi |
attachment added |
|
zfs-fix-uio-partial-copies.debdiff https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1904589/+attachment/5439545/+files/zfs-fix-uio-partial-copies.debdiff |
|
2021-06-10 12:52:54 |
Snorre Selmer |
bug |
|
|
added subscriber Snorre Selmer |
2021-06-11 11:01:48 |
Timo Aaltonen |
zfs-linux (Ubuntu Focal): status |
In Progress |
Fix Committed |
|
2021-06-11 11:01:52 |
Timo Aaltonen |
bug |
|
|
added subscriber SRU Verification |
2021-06-29 21:08:03 |
Colin Ian King |
tags |
|
verification-done-focal |
|
2021-06-29 21:08:12 |
Colin Ian King |
tags |
verification-done-focal |
verification-done verification-done-focal |
|
2021-07-01 10:18:14 |
Łukasz Zemczak |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|
2021-07-01 10:18:12 |
Launchpad Janitor |
zfs-linux (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|