Linux md raid-10 freezes during resync

Bug #1767992 reported by Sergey Kirillov on 2018-04-30
42
This bug affects 8 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Undecided
Unassigned
Bionic
Medium
Connor Kuehl

Bug Description

[Impact]

* If regular and resync IO happen at the same time during a regular IO
  split, the split bio will wait until resync IO finishes while at the
  same time the resync IO is waiting for regular IO to finish. This
  results in deadlock.

* I believe this only impacts Bionic as Disco+ already contains this
  commit. Xenial doesn't contain the commit that this one fixes.

[Test Case]

The test kernel containing this commit received positive feedback in the
launchpad bug.

From the launchpad bug comment #10:

"For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd,
unpack anaconda archives, after minute or two deadlocked"

[Regression Potential]

* This fix has been in mainline since December 2018 and I don't see any
  fixup commits upstream referencing this one. The small number of
  changes in this commit seem reasonable for managing the `nr_pending`
  adjustments which preclude either regular or resync IO.

Original bug description follows:
---------------------------------

I'm trying to setup a few nodes with software raid-10.

When array is created and resync is running i'm trying to install a few packages and frequently system stops responding, resync process stops, and I'm getting following errors in the kernel log.

This looks like a deadlock for me.

I had this problem in both 18.04 and 16.04. Reboot is the only way to fix the node.

[ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds.
[ 2659.317313] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.317350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.317401] kworker/u24:13 D 0 343 2 0x80000000
[ 2659.317414] Workqueue: writeback wb_workfn (flush-9:1)
[ 2659.317417] Call Trace:
[ 2659.317430] __schedule+0x297/0x8b0
[ 2659.317435] schedule+0x2c/0x80
[ 2659.317443] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.317449] ? wait_woken+0x80/0x80
[ 2659.317454] raid10_write_request+0x77/0x950 [raid10]
[ 2659.317459] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.317465] ? mempool_alloc+0x71/0x190
[ 2659.317469] ? ___slab_alloc+0x20a/0x4b0
[ 2659.317475] ? md_write_start+0xc8/0x200
[ 2659.317480] ? mempool_alloc_slab+0x15/0x20
[ 2659.317484] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.317489] md_handle_request+0x126/0x1a0
[ 2659.317494] md_make_request+0x6b/0x150
[ 2659.317501] generic_make_request+0x124/0x300
[ 2659.317506] submit_bio+0x73/0x150
[ 2659.317510] ? submit_bio+0x73/0x150
[ 2659.317579] xfs_submit_ioend+0x87/0x1c0 [xfs]
[ 2659.317626] xfs_do_writepage+0x377/0x6a0 [xfs]
[ 2659.317632] write_cache_pages+0x20c/0x4e0
[ 2659.317674] ? xfs_vm_writepages+0xf0/0xf0 [xfs]
[ 2659.317682] ? intel_pstate_update_pstate+0x40/0x40
[ 2659.317687] ? update_load_avg+0x5c5/0x6e0
[ 2659.317727] xfs_vm_writepages+0xbe/0xf0 [xfs]
[ 2659.317732] do_writepages+0x4b/0xe0
[ 2659.317738] ? check_preempt_curr+0x83/0x90
[ 2659.317742] ? ttwu_do_wakeup+0x1e/0x150
[ 2659.317746] __writeback_single_inode+0x45/0x340
[ 2659.317749] ? __writeback_single_inode+0x45/0x340
[ 2659.317752] writeback_sb_inodes+0x1e1/0x510
[ 2659.317756] __writeback_inodes_wb+0x67/0xb0
[ 2659.317759] wb_writeback+0x271/0x300
[ 2659.317764] wb_workfn+0x180/0x410
[ 2659.317766] ? wb_workfn+0x180/0x410
[ 2659.317773] process_one_work+0x1de/0x410
[ 2659.317776] worker_thread+0x32/0x410
[ 2659.317781] kthread+0x121/0x140
[ 2659.317784] ? process_one_work+0x410/0x410
[ 2659.317788] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.317793] ? do_syscall_64+0x73/0x130
[ 2659.317797] ? SyS_exit_group+0x14/0x20
[ 2659.317801] ret_from_fork+0x35/0x40
[ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds.
[ 2659.317853] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.317889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.317940] md1_resync D 0 429 2 0x80000000
[ 2659.317943] Call Trace:
[ 2659.317949] __schedule+0x297/0x8b0
[ 2659.317954] schedule+0x2c/0x80
[ 2659.317959] raise_barrier+0xa1/0x1a0 [raid10]
[ 2659.317963] ? wait_woken+0x80/0x80
[ 2659.317968] raid10_sync_request+0x205/0x1f10 [raid10]
[ 2659.317975] ? find_next_bit+0xb/0x10
[ 2659.317980] ? cpumask_next+0x1b/0x20
[ 2659.317985] ? is_mddev_idle+0x92/0xf4
[ 2659.317990] md_do_sync+0x8ca/0xf10
[ 2659.317994] ? wait_woken+0x80/0x80
[ 2659.318000] md_thread+0x129/0x170
[ 2659.318004] ? mddev_put+0x140/0x140
[ 2659.318007] ? md_thread+0x129/0x170
[ 2659.318012] kthread+0x121/0x140
[ 2659.318015] ? find_pers+0x70/0x70
[ 2659.318019] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.318023] ret_from_fork+0x35/0x40
[ 2659.318031] INFO: task xfsaild/md1p1:701 blocked for more than 120 seconds.
[ 2659.318077] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.318113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.318164] xfsaild/md1p1 D 0 701 2 0x80000000
[ 2659.318167] Call Trace:
[ 2659.318172] __schedule+0x297/0x8b0
[ 2659.318177] ? mempool_alloc_slab+0x15/0x20
[ 2659.318181] schedule+0x2c/0x80
[ 2659.318186] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.318189] ? wait_woken+0x80/0x80
[ 2659.318194] raid10_write_request+0x77/0x950 [raid10]
[ 2659.318198] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.318202] ? mempool_alloc+0x71/0x190
[ 2659.318206] ? md_write_start+0xc8/0x200
[ 2659.318211] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.318215] md_handle_request+0x126/0x1a0
[ 2659.318220] md_make_request+0x6b/0x150
[ 2659.318225] generic_make_request+0x124/0x300
[ 2659.318230] submit_bio+0x73/0x150
[ 2659.318234] ? submit_bio+0x73/0x150
[ 2659.318283] _xfs_buf_ioapply+0x31e/0x4e0 [xfs]
[ 2659.318330] ? xfs_buf_delwri_submit_buffers+0xfa/0x290 [xfs]
[ 2659.318372] xfs_buf_submit+0x65/0x200 [xfs]
[ 2659.318412] ? xfs_buf_submit+0x65/0x200 [xfs]
[ 2659.318451] xfs_buf_delwri_submit_buffers+0xfa/0x290 [xfs]
[ 2659.318492] ? xfs_buf_delwri_submit_nowait+0x10/0x20 [xfs]
[ 2659.318534] xfs_buf_delwri_submit_nowait+0x10/0x20 [xfs]
[ 2659.318573] ? xfs_buf_delwri_submit_nowait+0x10/0x20 [xfs]
[ 2659.318628] xfsaild+0x378/0x7b0 [xfs]
[ 2659.318633] ? __schedule+0x29f/0x8b0
[ 2659.318638] kthread+0x121/0x140
[ 2659.318691] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 2659.318695] ? kthread+0x121/0x140
[ 2659.318744] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 2659.318749] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.318753] ret_from_fork+0x35/0x40
[ 2659.318758] INFO: task journal-offline:17690 blocked for more than 120 seconds.
[ 2659.318806] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.318843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.318893] journal-offline D 0 17690 1 0x00000100
[ 2659.318897] Call Trace:
[ 2659.318902] __schedule+0x297/0x8b0
[ 2659.318907] schedule+0x2c/0x80
[ 2659.318911] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.318915] ? wait_woken+0x80/0x80
[ 2659.318920] raid10_write_request+0x77/0x950 [raid10]
[ 2659.318924] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.318928] ? mempool_alloc+0x71/0x190
[ 2659.318933] ? md_write_start+0xc8/0x200
[ 2659.318938] ? flush_tlb_func_common.constprop.11+0x220/0x220
[ 2659.318942] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.318947] md_handle_request+0x126/0x1a0
[ 2659.318951] md_make_request+0x6b/0x150
[ 2659.318957] generic_make_request+0x124/0x300
[ 2659.318962] submit_bio+0x73/0x150
[ 2659.318966] ? submit_bio+0x73/0x150
[ 2659.319015] xfs_submit_ioend+0x87/0x1c0 [xfs]
[ 2659.319057] xfs_do_writepage+0x377/0x6a0 [xfs]
[ 2659.319063] write_cache_pages+0x20c/0x4e0
[ 2659.319104] ? xfs_vm_writepages+0xf0/0xf0 [xfs]
[ 2659.319144] xfs_vm_writepages+0xbe/0xf0 [xfs]
[ 2659.319149] do_writepages+0x4b/0xe0
[ 2659.319154] __filemap_fdatawrite_range+0xc1/0x100
[ 2659.319158] ? __filemap_fdatawrite_range+0xc1/0x100
[ 2659.319163] file_write_and_wait_range+0x5a/0xb0
[ 2659.319210] xfs_file_fsync+0x5f/0x230 [xfs]
[ 2659.319215] vfs_fsync_range+0x51/0xb0
[ 2659.319218] do_fsync+0x3d/0x70
[ 2659.319222] SyS_fsync+0x10/0x20
[ 2659.319226] do_syscall_64+0x73/0x130
[ 2659.319230] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2659.319234] RIP: 0033:0x7f3efab64237
[ 2659.319236] RSP: 002b:00007f3ef6267cb0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[ 2659.319240] RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 00007f3efab64237
[ 2659.319241] RDX: 0000000000000000 RSI: 00007f3efa771e9a RDI: 0000000000000017
[ 2659.319243] RBP: 00007f3efa773660 R08: 00007f3ef6268700 R09: 00007f3ef6268700
[ 2659.319245] R10: 0000000000000e9a R11: 0000000000000293 R12: 0000000000000002
[ 2659.319247] R13: 0000000000000000 R14: 00005556663a10b0 R15: 00007ffdce2bd200
[ 2659.319256] INFO: task kworker/1:0:2163 blocked for more than 120 seconds.
[ 2659.319302] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.319339] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.319389] kworker/1:0 D 0 2163 2 0x80000000
[ 2659.319397] Workqueue: md md_submit_flush_data
[ 2659.319399] Call Trace:
[ 2659.319405] __schedule+0x297/0x8b0
[ 2659.319409] ? enqueue_entity+0x10e/0x6b0
[ 2659.319414] schedule+0x2c/0x80
[ 2659.319418] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.319422] ? wait_woken+0x80/0x80
[ 2659.319427] raid10_write_request+0x77/0x950 [raid10]
[ 2659.319431] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.319435] ? mempool_alloc+0x71/0x190
[ 2659.319439] ? md_write_start+0xc8/0x200
[ 2659.319444] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.319448] md_handle_request+0x126/0x1a0
[ 2659.319453] md_make_request+0x6b/0x150
[ 2659.319458] generic_make_request+0x124/0x300
[ 2659.319463] raid10_write_request+0x6bb/0x950 [raid10]
[ 2659.319467] ? raid10_write_request+0x6bb/0x950 [raid10]
[ 2659.319471] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.319474] ? mempool_alloc+0x71/0x190
[ 2659.319478] ? md_write_start+0xc8/0x200
[ 2659.319483] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.319487] md_handle_request+0x126/0x1a0
[ 2659.319492] md_submit_flush_data+0x54/0x70
[ 2659.319495] process_one_work+0x1de/0x410
[ 2659.319498] worker_thread+0x32/0x410
[ 2659.319503] kthread+0x121/0x140
[ 2659.319506] ? process_one_work+0x410/0x410
[ 2659.319510] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.319513] ret_from_fork+0x35/0x40
[ 2659.319522] INFO: task kworker/2:4:9775 blocked for more than 120 seconds.
[ 2659.319568] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.319604] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.319655] kworker/2:4 D 0 9775 2 0x80000000
[ 2659.319661] Workqueue: md md_submit_flush_data
[ 2659.319663] Call Trace:
[ 2659.319668] __schedule+0x297/0x8b0
[ 2659.319673] schedule+0x2c/0x80
[ 2659.319677] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.319681] ? wait_woken+0x80/0x80
[ 2659.319685] raid10_write_request+0x77/0x950 [raid10]
[ 2659.319689] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.319693] ? mempool_alloc+0x71/0x190
[ 2659.319697] ? md_write_start+0xc8/0x200
[ 2659.319702] raid10_make_request+0xcc/0x140 [raid10]
[ 2659.319706] md_handle_request+0x126/0x1a0
[ 2659.319711] md_submit_flush_data+0x54/0x70
[ 2659.319714] process_one_work+0x1de/0x410
[ 2659.319717] worker_thread+0x32/0x410
[ 2659.319721] kthread+0x121/0x140
[ 2659.319724] ? process_one_work+0x410/0x410
[ 2659.319728] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.319732] ? do_syscall_64+0x73/0x130
[ 2659.319735] ? SyS_exit_group+0x14/0x20
[ 2659.319738] ret_from_fork+0x35/0x40
[ 2659.319750] INFO: task dpkg:17453 blocked for more than 120 seconds.
[ 2659.319792] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.319829] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.319879] dpkg D 0 17453 17180 0x00000000
[ 2659.319882] Call Trace:
[ 2659.319887] __schedule+0x297/0x8b0
[ 2659.319944] ? xlog_sync+0x2c1/0x3b0 [xfs]
[ 2659.319949] schedule+0x2c/0x80
[ 2659.319999] _xfs_log_force_lsn+0x1fa/0x330 [xfs]
[ 2659.320003] ? file_check_and_advance_wb_err+0x35/0xe0
[ 2659.320008] ? wake_up_q+0x80/0x80
[ 2659.320056] xfs_file_fsync+0xfd/0x230 [xfs]
[ 2659.320061] vfs_fsync_range+0x51/0xb0
[ 2659.320064] do_fsync+0x3d/0x70
[ 2659.320067] SyS_fsync+0x10/0x20
[ 2659.320071] do_syscall_64+0x73/0x130
[ 2659.320076] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2659.320078] RIP: 0033:0x7f97758f9214
[ 2659.320080] RSP: 002b:00007ffe2cd910e8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[ 2659.320083] RAX: ffffffffffffffda RBX: 00005622a49c6300 RCX: 00007f97758f9214
[ 2659.320085] RDX: 0000000000000001 RSI: 00005622a49ce340 RDI: 000000000000000b
[ 2659.320086] RBP: 00005622a3a35174 R08: 00005622a49c6373 R09: 0000000000000000
[ 2659.320088] R10: 0000000000000000 R11: 0000000000000246 R12: 00005622a49c6210
[ 2659.320090] R13: 00005622a3a35173 R14: 000000000000000b R15: 00005622a49ce340
[ 2659.320094] INFO: task kworker/4:2:17583 blocked for more than 120 seconds.
[ 2659.320140] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.320176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.320226] kworker/4:2 D 0 17583 2 0x80000000
[ 2659.320281] Workqueue: xfs-sync/md1p1 xfs_log_worker [xfs]
[ 2659.320283] Call Trace:
[ 2659.320289] __schedule+0x297/0x8b0
[ 2659.320293] schedule+0x2c/0x80
[ 2659.320298] md_flush_request+0x6c/0x120
[ 2659.320302] ? wait_woken+0x80/0x80
[ 2659.320307] raid10_make_request+0x106/0x140 [raid10]
[ 2659.320311] md_handle_request+0x126/0x1a0
[ 2659.320316] md_make_request+0x6b/0x150
[ 2659.320322] generic_make_request+0x124/0x300
[ 2659.320326] submit_bio+0x73/0x150
[ 2659.320330] ? submit_bio+0x73/0x150
[ 2659.320378] _xfs_buf_ioapply+0x31e/0x4e0 [xfs]
[ 2659.320385] ? kernel_fpu_end+0xe/0x10
[ 2659.320433] ? xlog_bdstrat+0x2b/0x60 [xfs]
[ 2659.320479] xfs_buf_submit+0x65/0x200 [xfs]
[ 2659.320520] ? xfs_buf_submit+0x65/0x200 [xfs]
[ 2659.320567] xlog_bdstrat+0x2b/0x60 [xfs]
[ 2659.320611] xlog_sync+0x2c1/0x3b0 [xfs]
[ 2659.320654] xlog_state_release_iclog+0x6c/0xc0 [xfs]
[ 2659.320696] _xfs_log_force+0x21e/0x290 [xfs]
[ 2659.320702] ? __switch_to+0xad/0x500
[ 2659.320744] ? xfs_log_worker+0x34/0x100 [xfs]
[ 2659.320784] xfs_log_force+0x2c/0x80 [xfs]
[ 2659.320824] xfs_log_worker+0x34/0x100 [xfs]
[ 2659.320828] process_one_work+0x1de/0x410
[ 2659.320832] worker_thread+0x32/0x410
[ 2659.320836] kthread+0x121/0x140
[ 2659.320839] ? process_one_work+0x410/0x410
[ 2659.320843] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2659.320847] ? do_syscall_64+0x73/0x130
[ 2659.320850] ? SyS_exit_group+0x14/0x20
[ 2659.320854] ret_from_fork+0x35/0x40
[ 2659.320858] INFO: task bash:17617 blocked for more than 120 seconds.
[ 2659.320901] Not tainted 4.15.0-20-generic #21-Ubuntu
[ 2659.320938] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.320988] bash D 0 17617 5271 0x80000000
[ 2659.320991] Call Trace:
[ 2659.320997] __schedule+0x297/0x8b0
[ 2659.321002] schedule+0x2c/0x80
[ 2659.321006] wait_barrier+0x146/0x1a0 [raid10]
[ 2659.321010] ? wait_woken+0x80/0x80
[ 2659.321015] raid10_read_request+0xa4/0x4a0 [raid10]
[ 2659.321019] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[ 2659.321023] ? mempool_alloc+0x71/0x190
[ 2659.321027] ? mempool_alloc_slab+0x15/0x20
[ 2659.321030] ? kmem_cache_alloc+0xa2/0x1b0
[ 2659.321036] ? radix_tree_node_alloc.constprop.19+0x7f/0xc0
[ 2659.321040] raid10_make_request+0xff/0x140 [raid10]
[ 2659.321045] md_handle_request+0x126/0x1a0
[ 2659.321050] md_make_request+0x6b/0x150
[ 2659.321055] generic_make_request+0x124/0x300
[ 2659.321060] submit_bio+0x73/0x150
[ 2659.321064] ? submit_bio+0x73/0x150
[ 2659.321070] ? guard_bio_eod+0x2c/0xf0
[ 2659.321073] mpage_readpages+0x1ae/0x1e0
[ 2659.321123] ? xfs_map_at_offset+0x30/0x30 [xfs]
[ 2659.321168] xfs_vm_readpages+0x35/0x90 [xfs]
[ 2659.321173] __do_page_cache_readahead+0x1c9/0x2c0
[ 2659.321178] ondemand_readahead+0x11a/0x2a0
[ 2659.321204] ? find_get_entry+0x1e/0x110
[ 2659.321207] ? ondemand_readahead+0x11a/0x2a0
[ 2659.321212] page_cache_sync_readahead+0x2e/0x50
[ 2659.321215] generic_file_read_iter+0x7fb/0xbf0
[ 2659.321221] ? _cond_resched+0x19/0x40
[ 2659.321224] ? down_read+0x12/0x40
[ 2659.321273] xfs_file_buffered_aio_read+0x5a/0x100 [xfs]
[ 2659.321322] xfs_file_read_iter+0x68/0xc0 [xfs]
[ 2659.321334] new_sync_read+0xe4/0x130
[ 2659.321344] __vfs_read+0x29/0x40
[ 2659.321355] vfs_read+0x8e/0x130
[ 2659.321365] kernel_read+0x31/0x50
[ 2659.321372] prepare_binprm+0xef/0x1d0
[ 2659.321376] do_execveat_common.isra.34+0x594/0x810
[ 2659.321382] SyS_execve+0x31/0x40
[ 2659.321391] do_syscall_64+0x73/0x130
[ 2659.321401] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2659.321409] RIP: 0033:0x7fb415617e37
[ 2659.321414] RSP: 002b:00007fffde7393e8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 2659.321428] RAX: ffffffffffffffda RBX: 0000560263024ab0 RCX: 00007fb415617e37
[ 2659.321432] RDX: 0000560262ffb1c0 RSI: 0000560263024b10 RDI: 0000560263024ab0
[ 2659.321438] RBP: 0000560263024ab0 R08: 00007fffde7393b0 R09: 0000000000000000
[ 2659.321445] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001
[ 2659.321451] R13: 0000560263024b10 R14: 0000560262ffb1c0 R15: 0000000000000000

description: updated
Download full text (7.1 KiB)

Same here, md raid10 (not sure if its important but xfs, hung 3 times in last 2 days).
It seems that a combination of md raid10 check + I/O (maybe XFS-specific, I dunno, but both original poster and us seem to use XFS) frequently hangs on kernels that are newer than Ubuntu 4.10.0-42.46~16.04.1-generic 4.10.17 (yes, I know it's a wide range, but everything started happening after we rebooted this machine, which upgraded us from Ubuntu 4.10.0-42.46~16.04.1-generic 4.10.17 to Ubuntu 4.13.0-43.48~16.04.1-generic 4.13.16, the check was scheduled some time later, so we didn't catch it immediately).

The logs look very similar:
Jun 13 19:15:42 pisces kernel: [27430.370899] INFO: task md7_resync:12982 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.370940] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.370966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.370997] md7_resync D 0 12982 2 0x80000000
Jun 13 19:15:42 pisces kernel: [27430.371000] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371012] __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371014] schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371020] raise_barrier+0xd2/0x1a0 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371024] ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371027] raid10_sync_request+0x9bd/0x1b10 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371031] ? pick_next_task_fair+0x449/0x570
Jun 13 19:15:42 pisces kernel: [27430.371035] ? __switch_to+0xb2/0x540
Jun 13 19:15:42 pisces kernel: [27430.371041] ? find_next_bit+0xb/0x10
Jun 13 19:15:42 pisces kernel: [27430.371046] ? is_mddev_idle+0xa1/0x101
Jun 13 19:15:42 pisces kernel: [27430.371048] md_do_sync+0xb81/0xfb0
Jun 13 19:15:42 pisces kernel: [27430.371050] ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371054] md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371055] ? md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371060] kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.371062] ? state_show+0x320/0x320
Jun 13 19:15:42 pisces kernel: [27430.371064] ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371067] ret_from_fork+0x35/0x40
Jun 13 19:15:42 pisces kernel: [27430.371181] INFO: task kworker/20:1:27873 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.371210] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.371235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.371267] kworker/20:1 D 0 27873 2 0x80000000
Jun 13 19:15:42 pisces kernel: [27430.371333] Workqueue: xfs-sync/md7 xfs_log_worker [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371334] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371338] __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371340] schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371342] schedule_timeout+0x1f3/0x360
Jun 13 19:15:42 pisces kernel: [27430.371347] ? scsi_ini...

Read more...

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed

Same problem here reproducible on Ubuntu 18.04 LTS. SATA RAID 10 with XFS filesystem. On each reboot the RAID does a resync. Any operation on the RAID before the resync is finished causes the process to hang. Might be also associated with another issue that causes the resync every time on boot.

Sergey Kirillov (rushman) wrote :

This issue is gone after we migrated from XFS to EXT4, so it is definitely XFS-related.

na (utsaxc) wrote :

I can confirm this affected me as well. My RAID-10 is on a separate filesystem from the Operating System's SSD -- I have the RAID-10 mounted at '/raid10' on my server. My application is installed on the non-mdadm SSD at '/opt' but it reads/writes/executes files from the '/raid10' filesystem (made up of 6x8TB drives with 1 spare).

During a recovery from a failed drive, it would periodically run into the same issue mentioned here while performing write operations and was less likely to occur during read operations but it still did happen. I noticed I'd have to reboot about every 30-60min due to this hang-up that stopped the rebuild from continuing -- once the server was back online, it would continue.

However, I got fed up with this because the rebuild/resync is supposed to take approximately 9ish hours and it was only 29% complete so on the last reboot I stopped my application and unmounted the RAID-10 (umount /raid10). Once I did that, this continue rebuilding through the night without issue and completed the remaining 71%.

... So it seems like you shouldn't interact with the RAID in the latest kernel/xfs/mdadm on Ubuntu 18.04. Here's the current versions I'm running:

==========
root@server:~# uname -r
4.15.0-36-generic

root@server:~# dpkg -l | awk '/mdadm/ || /xfsprog/ {print $2,$3}' | column -t
mdadm 4.1~rc1-3~ubuntu18.04.1
xfsprogs 4.9.0+nmu1ubuntu2
==========

Previously with Ubuntu 16.04 with the 4.4 kernel and latest mdadm/xfsprogs for 16.04 I didn't have this issue.

johnny (jbassi) wrote :

We had the same issue after updating from Ubuntu 14.04 to 16.04 and then 18.04 (the two consecutive updates were applied on the same day).
The filesystem is also XFS. and the setup is with raid10.
The bug does not completely froze the system. Many process still run and were able to write to disk. But new processes seems to have trouble with that.
Also, some processes that did not write to disk also halted. Example: htop and ps aux
Interesting thing is that "ps aux" actually started printing the output and around the middle of the execution, it halted, and Ctrl+C was unable to exit the program.
htop, which hangs a few seconds after its execution, showed that the frozen processed were in D (uninterruptible sleep) state.

The fix was to downgrade from kernel 4.15.0-34-generic to 4.2.0-34-generic.
Although the fix was not perfect, since that fix it happened again once. (but once in 130 days is not as bad)

Attached logs: kernel.log with some stracktrace, but they are more or less the same as previously posted.. (the logs were OCR'ed from some screenshots. so beware of some characters. like the 0 which sometimes turns into a 6)

Damir Chanyshev (trinar) on 2019-06-25
affects: ubuntu → linux (Ubuntu)
Connor Kuehl (connork) on 2019-07-10
Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Connor Kuehl (connork)
Connor Kuehl (connork) wrote :

Hi everyone,

I was just curious if anyone's had a chance to test out Kai-Heng's test kernel from comment #8?

Damir Chanyshev (trinar) wrote :

Kernel from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767992/comments/8 solved my issues.

For reproduce on 4.15.0-50-generic
Make new raid-10, add some io fio/dd, unpack anaconda archives(lol), after minute or two deadlocked

[Mon Jul 15 19:31:11 2019] Workqueue: md md_submit_flush_data
[Mon Jul 15 19:31:11 2019] Call Trace:
[Mon Jul 15 19:31:11 2019] __schedule+0x291/0x8a0
[Mon Jul 15 19:31:11 2019] ? __switch_to_asm+0x34/0x70
[Mon Jul 15 19:31:11 2019] ? __switch_to_asm+0x40/0x70
[Mon Jul 15 19:31:11 2019] schedule+0x2c/0x80
[Mon Jul 15 19:31:11 2019] wait_barrier+0x146/0x1a0 [raid10]
[Mon Jul 15 19:31:11 2019] ? wait_woken+0x80/0x80
[Mon Jul 15 19:31:11 2019] raid10_write_request+0x77/0x950 [raid10]
[Mon Jul 15 19:31:11 2019] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[Mon Jul 15 19:31:11 2019] ? mempool_alloc+0x71/0x190
[Mon Jul 15 19:31:11 2019] ? md_write_start+0xf4/0x210
[Mon Jul 15 19:31:11 2019] ? default_wake_function+0x12/0x20
[Mon Jul 15 19:31:11 2019] ? autoremove_wake_function+0x12/0x40
[Mon Jul 15 19:31:11 2019] raid10_make_request+0xcc/0x140 [raid10]
[Mon Jul 15 19:31:11 2019] md_handle_request+0x126/0x1a0
[Mon Jul 15 19:31:11 2019] md_submit_flush_data+0x54/0x70
[Mon Jul 15 19:31:11 2019] process_one_work+0x1de/0x410
[Mon Jul 15 19:31:11 2019] worker_thread+0x32/0x410
[Mon Jul 15 19:31:11 2019] kthread+0x121/0x140
[Mon Jul 15 19:31:11 2019] ? process_one_work+0x410/0x410
[Mon Jul 15 19:31:11 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 15 19:31:11 2019] ret_from_fork+0x35/0x40

Connor Kuehl (connork) wrote :

Thank you for testing that and providing a test case, Damir!

I've gone ahead and submitted that patch to the mailing list: https://lists.ubuntu.com/archives/kernel-team/2019-July/102199.html

description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) on 2019-07-24
tags: added: cscc

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Connor Kuehl (connork) wrote :

Hi Damir (or anyone else affected by this that may be reading the bug),

If you get a chance, could you please test the kernel that is in -proposed to verify that it resolves the issue for you? There are instructions in comment #12 for verifying this for Bionic or if you're using Xenial, the instructions are in comment #13.

Thanks,

Connor

Launchpad Janitor (janitor) wrote :
Download full text (171.3 KiB)

This bug was fixed in the package linux - 4.15.0-58.64

---------------
linux (4.15.0-58.64) bionic; urgency=medium

  * unable to handle kernel NULL pointer dereference at 000000000000002c (IP:
    iget5_locked+0x9e/0x1f0) (LP: #1838982)
    - Revert "ovl: set I_CREATING on inode being created"
    - Revert "new primitive: discard_new_inode()"

linux (4.15.0-57.63) bionic; urgency=medium

  * CVE-2019-1125
    - x86/cpufeatures: Carve out CQM features retrieval
    - x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
    - x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
    - x86/speculation: Enable Spectre v1 swapgs mitigations
    - x86/entry/64: Use JMP instead of JMPQ
    - x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS

  * Packaging resync (LP: #1786013)
    - update dkms package versions

linux (4.15.0-56.62) bionic; urgency=medium

  * bionic/linux: 4.15.0-56.62 -proposed tracker (LP: #1837626)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync git-ubuntu-log
    - [Packaging] update helper scripts

  * CVE-2019-2101
    - media: uvcvideo: Fix 'type' check leading to overflow

  * hibmc-drm Causes Unreadable Display for Huawei amd64 Servers (LP: #1762940)
    - [Config] Set CONFIG_DRM_HISI_HIBMC to arm64 only
    - SAUCE: Make CONFIG_DRM_HISI_HIBMC depend on ARM64

  * Bionic: support for Solarflare X2542 network adapter (sfc driver)
    (LP: #1836635)
    - sfc: make mem_bar a function rather than a constant
    - sfc: support VI strides other than 8k
    - sfc: add Medford2 (SFC9250) PCI Device IDs
    - sfc: improve PTP error reporting
    - sfc: update EF10 register definitions
    - sfc: populate the timer reload field
    - sfc: update MCDI protocol headers
    - sfc: support variable number of MAC stats
    - sfc: expose FEC stats on Medford2
    - sfc: expose CTPIO stats on NICs that support them
    - sfc: basic MCDI mapping of 25/50/100G link speeds
    - sfc: support the ethtool ksettings API properly so that 25/50/100G works
    - sfc: add bits for 25/50/100G supported/advertised speeds
    - sfc: remove tx and MCDI handling from NAPI budget consideration
    - sfc: handle TX timestamps in the normal data path
    - sfc: add function to determine which TX timestamping method to use
    - sfc: use main datapath for HW timestamps if available
    - sfc: only enable TX timestamping if the adapter is licensed for it
    - sfc: MAC TX timestamp handling on the 8000 series
    - sfc: on 8000 series use TX queues for TX timestamps
    - sfc: only advertise TX timestamping if we have the license for it
    - sfc: simplify RX datapath timestamping
    - sfc: support separate PTP and general timestamping
    - sfc: support second + quarter ns time format for receive datapath
    - sfc: support Medford2 frequency adjustment format
    - sfc: add suffix to large constant in ptp
    - sfc: mark some unexported symbols as static
    - sfc: update MCDI protocol headers
    - sfc: support FEC configuration through ethtool
    - sfc: remove ctpio_dmabuf_start from stats
    - sfc: stop the TX queue before pushing new buffers

  * [18.04 FEAT] zKVM: Add hardwar...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments