Same here: md raid10 (not sure if it's relevant, but with XFS on top; it has hung 3 times in the last 2 days).
It seems that the combination of an md raid10 check and concurrent I/O (maybe XFS-specific, I don't know, but both the original poster and we use XFS) frequently hangs on kernels newer than Ubuntu 4.10.0-42.46~16.04.1-generic (4.10.17). Yes, I know that's a wide range, but everything started after we rebooted this machine, which upgraded it from Ubuntu 4.10.0-42.46~16.04.1-generic (4.10.17) to Ubuntu 4.13.0-43.48~16.04.1-generic (4.13.16); the check was scheduled some time later, so we didn't catch it immediately.
The logs look very similar:
Jun 13 19:15:42 pisces kernel: [27430.370899] INFO: task md7_resync:12982 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.370940] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.370966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.370997] md7_resync D 0 12982 2 0x80000000
Jun 13 19:15:42 pisces kernel: [27430.371000] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371012] __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371014] schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371020] raise_barrier+0xd2/0x1a0 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371024] ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371027] raid10_sync_request+0x9bd/0x1b10 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371031] ? pick_next_task_fair+0x449/0x570
Jun 13 19:15:42 pisces kernel: [27430.371035] ? __switch_to+0xb2/0x540
Jun 13 19:15:42 pisces kernel: [27430.371041] ? find_next_bit+0xb/0x10
Jun 13 19:15:42 pisces kernel: [27430.371046] ? is_mddev_idle+0xa1/0x101
Jun 13 19:15:42 pisces kernel: [27430.371048] md_do_sync+0xb81/0xfb0
Jun 13 19:15:42 pisces kernel: [27430.371050] ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371054] md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371055] ? md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371060] kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.371062] ? state_show+0x320/0x320
Jun 13 19:15:42 pisces kernel: [27430.371064] ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371067] ret_from_fork+0x35/0x40
Jun 13 19:15:42 pisces kernel: [27430.371181] INFO: task kworker/20:1:27873 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.371210] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.371235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.371267] kworker/20:1 D 0 27873 2 0x80000000
Jun 13 19:15:42 pisces kernel: [27430.371333] Workqueue: xfs-sync/md7 xfs_log_worker [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371334] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371338] __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371340] schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371342] schedule_timeout+0x1f3/0x360
Jun 13 19:15:42 pisces kernel: [27430.371347] ? scsi_init_rq+0x84/0x100
Jun 13 19:15:42 pisces kernel: [27430.371349] wait_for_completion+0xb4/0x140
Jun 13 19:15:42 pisces kernel: [27430.371351] ? wait_for_completion+0xb4/0x140
Jun 13 19:15:42 pisces kernel: [27430.371356] ? wake_up_q+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371360] flush_work+0x129/0x1e0
Jun 13 19:15:42 pisces kernel: [27430.371363] ? worker_detach_from_pool+0xb0/0xb0
Jun 13 19:15:42 pisces kernel: [27430.371397] xlog_cil_force_lsn+0x8b/0x220 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371400] ? update_curr+0x138/0x1d0
Jun 13 19:15:42 pisces kernel: [27430.371433] _xfs_log_force+0x85/0x290 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371436] ? pick_next_task_fair+0x131/0x570
Jun 13 19:15:42 pisces kernel: [27430.371438] ? __switch_to+0xb2/0x540
Jun 13 19:15:42 pisces kernel: [27430.371471] ? xfs_log_worker+0x36/0x100 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371505] xfs_log_force+0x2c/0x80 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371538] xfs_log_worker+0x36/0x100 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371541] process_one_work+0x15b/0x410
Jun 13 19:15:42 pisces kernel: [27430.371544] worker_thread+0x4b/0x460
Jun 13 19:15:42 pisces kernel: [27430.371546] kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.371548] ? process_one_work+0x410/0x410
Jun 13 19:15:42 pisces kernel: [27430.371550] ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371552] ret_from_fork+0x35/0x40
Jun 13 19:15:42 pisces kernel: [27430.371557] INFO: task kworker/20:0:4504 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.371587] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.371611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.371642] kworker/20:0 D 0 4504 2 0x80000000
Jun 13 19:15:42 pisces kernel: [27430.371680] Workqueue: xfs-cil/md7 xlog_cil_push_work [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371681] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371684] __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371687] schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371689] md_flush_request+0x6e/0x130
Jun 13 19:15:42 pisces kernel: [27430.371691] ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371694] raid10_make_request+0x12a/0x130 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371696] md_handle_request+0xb5/0x130
Jun 13 19:15:42 pisces kernel: [27430.371698] md_make_request+0x6c/0x170
Jun 13 19:15:42 pisces kernel: [27430.371702] generic_make_request+0x12a/0x300
Jun 13 19:15:42 pisces kernel: [27430.371704] submit_bio+0x73/0x150
Jun 13 19:15:42 pisces kernel: [27430.371706] ? submit_bio+0x73/0x150
Jun 13 19:15:42 pisces kernel: [27430.371737] _xfs_buf_ioapply+0x2e7/0x4a0 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371741] ? kernel_fpu_end+0xe/0x10
Jun 13 19:15:42 pisces kernel: [27430.371775] ? xlog_bdstrat+0x2b/0x60 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371807] xfs_buf_submit+0x63/0x210 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371838] ? xfs_buf_submit+0x63/0x210 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371872] xlog_bdstrat+0x2b/0x60 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371907] xlog_sync+0x2c1/0x3c0 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371949] xlog_state_release_iclog+0x76/0xc0 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371985] xlog_write+0x55c/0x720 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.372020] xlog_cil_push+0x22b/0x3f0 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.372054] xlog_cil_push_work+0x15/0x20 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.372057] process_one_work+0x15b/0x410
Jun 13 19:15:42 pisces kernel: [27430.372059] worker_thread+0x22b/0x460
Jun 13 19:15:42 pisces kernel: [27430.372061] kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.372063] ? process_one_work+0x410/0x410
Jun 13 19:15:42 pisces kernel: [27430.372065] ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.372067] ret_from_fork+0x35/0x40
...
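For what it's worth, until this is fixed we've been avoiding the trigger by stopping the scheduled check while the array is busy. This is just a sketch using the standard md sysfs interface (substitute your own array for md7); it only sidesteps the trigger, it does not fix the underlying raise_barrier/md_flush_request deadlock:

```shell
# See what the array is doing right now ("check", "resync", "idle", ...)
cat /sys/block/md7/md/sync_action

# Abort the running check
echo idle > /sys/block/md7/md/sync_action

# Kick the check off again later, during a quiet period
echo check > /sys/block/md7/md/sync_action
```

On our Ubuntu box the monthly check is scheduled from /etc/cron.d/mdadm (which runs checkarray), so temporarily disabling that entry is another option.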