Comment 0 for bug 1813881

overlord (lazamarius1) wrote :

2019-01-22T09:03:34.968028+00:00 localhost kernel: [80233.315906] INFO: task kworker/u30:3:5705 blocked for more than 120 seconds.
2019-01-22T09:03:34.968049+00:00 localhost kernel: [80233.321444] Tainted: P O 4.15.0-43-generic #46~16.04.1-Ubuntu
2019-01-22T09:03:34.980485+00:00 localhost kernel: [80233.327648] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2019-01-22T09:03:34.980519+00:00 localhost kernel: [80233.333902] kworker/u30:3 D 0 5705 2 0x80000000
2019-01-22T09:03:34.980521+00:00 localhost kernel: [80233.333909] Workqueue: events_unbound fsnotify_mark_destroy_workfn
2019-01-22T09:03:34.980522+00:00 localhost kernel: [80233.333910] Call Trace:
2019-01-22T09:03:34.980526+00:00 localhost kernel: [80233.333914] __schedule+0x3d6/0x8b0
2019-01-22T09:03:34.980527+00:00 localhost kernel: [80233.333918] schedule+0x36/0x80
2019-01-22T09:03:34.980528+00:00 localhost kernel: [80233.333920] schedule_timeout+0x1db/0x370
2019-01-22T09:03:34.980530+00:00 localhost kernel: [80233.333927] ? __enqueue_entity+0x5c/0x60
2019-01-22T09:03:34.980531+00:00 localhost kernel: [80233.333932] ? enqueue_entity+0x112/0x670
2019-01-22T09:03:34.980547+00:00 localhost kernel: [80233.333937] wait_for_completion+0xb4/0x140
2019-01-22T09:03:34.980554+00:00 localhost kernel: [80233.333939] ? wake_up_q+0x70/0x70
2019-01-22T09:03:34.980556+00:00 localhost kernel: [80233.333944] __synchronize_srcu.part.13+0x85/0xb0
2019-01-22T09:03:34.980557+00:00 localhost kernel: [80233.333947] ? trace_raw_output_rcu_utilization+0x50/0x50
2019-01-22T09:03:34.980558+00:00 localhost kernel: [80233.333950] synchronize_srcu+0xd3/0xe0
2019-01-22T09:03:34.980559+00:00 localhost kernel: [80233.333956] ? synchronize_srcu+0xd3/0xe0
2019-01-22T09:03:34.980560+00:00 localhost kernel: [80233.333962] fsnotify_mark_destroy_workfn+0x7c/0xe0
2019-01-22T09:03:34.980568+00:00 localhost kernel: [80233.333966] process_one_work+0x14d/0x410
2019-01-22T09:03:34.980570+00:00 localhost kernel: [80233.333968] worker_thread+0x22b/0x460
2019-01-22T09:03:34.980571+00:00 localhost kernel: [80233.333971] kthread+0x105/0x140
2019-01-22T09:03:34.980572+00:00 localhost kernel: [80233.333974] ? process_one_work+0x410/0x410
2019-01-22T09:03:34.980573+00:00 localhost kernel: [80233.333976] ? kthread_destroy_worker+0x50/0x50
2019-01-22T09:03:34.980574+00:00 localhost kernel: [80233.333979] ret_from_fork+0x35/0x40

The kernel taint comes from the zfs module.
Other processes, such as dockerd and systemd (init), also end up in the same uninterruptible (D) state:

2019-01-22T09:03:34.949861+00:00 localhost kernel: [80233.299475] INFO: task dockerd:2809 blocked for more than 120 seconds.
2019-01-22T09:03:34.949863+00:00 localhost kernel: [80233.303136] Tainted: P O 4.15.0-43-generic #46~16.04.1-Ubuntu
2019-01-22T09:03:34.962084+00:00 localhost kernel: [80233.309016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2019-01-22T09:03:34.962114+00:00 localhost kernel: [80233.315513] dockerd D 0 2809 1 0x00000000
2019-01-22T09:03:34.962118+00:00 localhost kernel: [80233.315516] Call Trace:
2019-01-22T09:03:34.962120+00:00 localhost kernel: [80233.315521] __schedule+0x3d6/0x8b0
2019-01-22T09:03:34.962122+00:00 localhost kernel: [80233.315528] ? xen_smp_send_reschedule+0x10/0x20
2019-01-22T09:03:34.962137+00:00 localhost kernel: [80233.315532] schedule+0x36/0x80
2019-01-22T09:03:34.962139+00:00 localhost kernel: [80233.315535] schedule_timeout+0x1db/0x370
2019-01-22T09:03:34.962140+00:00 localhost kernel: [80233.315537] ? try_to_wake_up+0x59/0x4a0
2019-01-22T09:03:34.962164+00:00 localhost kernel: [80233.315539] wait_for_completion+0xb4/0x140
2019-01-22T09:03:34.962168+00:00 localhost kernel: [80233.315541] ? wake_up_q+0x70/0x70
2019-01-22T09:03:34.962170+00:00 localhost kernel: [80233.315547] flush_work+0x129/0x1e0
2019-01-22T09:03:34.962171+00:00 localhost kernel: [80233.315552] ? worker_detach_from_pool+0xb0/0xb0
2019-01-22T09:03:34.962186+00:00 localhost kernel: [80233.315555] flush_delayed_work+0x3f/0x50
2019-01-22T09:03:34.962194+00:00 localhost kernel: [80233.315559] fsnotify_wait_marks_destroyed+0x15/0x20
2019-01-22T09:03:34.962195+00:00 localhost kernel: [80233.315561] fsnotify_destroy_group+0x48/0xd0
2019-01-22T09:03:34.962196+00:00 localhost kernel: [80233.315563] inotify_release+0x1e/0x50
2019-01-22T09:03:34.962197+00:00 localhost kernel: [80233.315565] __fput+0xea/0x220
2019-01-22T09:03:34.962198+00:00 localhost kernel: [80233.315567] ____fput+0xe/0x10
2019-01-22T09:03:34.962200+00:00 localhost kernel: [80233.315569] task_work_run+0x8a/0xb0
2019-01-22T09:03:34.962202+00:00 localhost kernel: [80233.315571] exit_to_usermode_loop+0xc4/0xd0
2019-01-22T09:03:34.962203+00:00 localhost kernel: [80233.315573] do_syscall_64+0xf4/0x130
2019-01-22T09:03:34.962205+00:00 localhost kernel: [80233.315575] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
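
To get a quick list of every task the hung-task detector reported in a log like the ones above, a small filter along these lines may help. This is only a rough sketch: the function name find_hung_tasks is hypothetical, and it assumes the message format is exactly the "INFO: task NAME:PID blocked for more than N seconds." line shown in the traces.

```python
import re

# Matches the hung-task detector line as it appears in the traces above, e.g.
#   "INFO: task dockerd:2809 blocked for more than 120 seconds."
# The task name may itself contain ':' (e.g. "kworker/u30:3"), so the greedy
# name group is anchored by the final ":<pid> blocked" part of the message.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_lines):
    """Return a list of (task_name, pid, seconds) for each hung-task report.

    Hypothetical helper: it only recognizes the message format shown in
    this report and ignores every other log line (call trace frames, etc.).
    """
    hits = []
    for line in log_lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits
```

It can be fed the lines of a saved kernel log (for instance the output of open(path) on wherever the log was saved; the path depends on the system) and returns one tuple per blocked task.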

The issue seems to be related to this fix: https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1802021
The issue also reproduces on Ubuntu 16.04.5 LTS with kernel version 4.14.0-43-generic.

I am opening this bug for better traceability of the backported fix.