Just happened for the second time in 46 hours (and it never happened before):
kernel: [167836.884337] INFO: task kcompactd0:63 blocked for more than 120 seconds.
kernel: [167836.887341] Not tainted 4.15.0-128-generic #131-Ubuntu
kernel: [167836.889880] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [167836.893754] kcompactd0 D 0 63 2 0x80000000
kernel: [167836.893760] Call Trace:
kernel: [167836.894017] __schedule+0x24e/0x880
kernel: [167836.894031] schedule+0x2c/0x80
kernel: [167836.894034] io_schedule+0x16/0x40
kernel: [167836.894160] __lock_page+0xff/0x140
kernel: [167836.894197] ? page_cache_tree_insert+0xe0/0xe0
kernel: [167836.894246] migrate_pages+0x91f/0xb80
kernel: [167836.894269] ? __ClearPageMovable+0x10/0x10
kernel: [167836.894272] ? isolate_freepages_block+0x3b0/0x3b0
kernel: [167836.894276] compact_zone+0x681/0x950
kernel: [167836.894279] kcompactd_do_work+0xfe/0x2a0
kernel: [167836.894282] ? __switch_to_asm+0x35/0x70
kernel: [167836.894284] ? __switch_to_asm+0x41/0x70
kernel: [167836.894288] kcompactd+0x86/0x1c0
kernel: [167836.894292] ? kcompactd+0x86/0x1c0
kernel: [167836.894395] ? wait_woken+0x80/0x80
kernel: [167836.894445] kthread+0x121/0x140
kernel: [167836.894449] ? kcompactd_do_work+0x2a0/0x2a0
kernel: [167836.894452] ? kthread_create_worker_on_cpu+0x70/0x70
kernel: [167836.894454] ret_from_fork+0x35/0x40
The message appears every 2 minutes for 18 minutes (then I guess it stops reporting); after that the disks are blocked, then NFS fails, then the whole service I'm running fails.
It happened twice with kernel 4.15.0-128-generic.
I updated to 4.15.0-137-generic while the service was down; let's see how long it keeps working.
The HWE kernel is the next step.
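
For anyone who wants to check the "every 2 minutes for 18 minutes" pattern on their own machine, here is a minimal sketch that pulls the hung-task warnings out of kernel log lines and computes the gaps between them. The function names are made up for illustration, and the regex assumes the log lines look like the ones pasted above (a bracketed uptime timestamp followed by "INFO: task NAME:PID blocked for more than N seconds"). Ten warnings spaced 2 minutes apart span 18 minutes, which would be consistent with the kernel's hung_task_warnings limit (10 by default) being used up, though I have not confirmed that this is why the reporting stops here.

```python
import re

# Assumed line shape, taken from the log above:
# kernel: [167836.884337] INFO: task kcompactd0:63 blocked for more than 120 seconds.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_task_reports(lines):
    """Return a list of (uptime_seconds, task_name, pid), one per warning."""
    reports = []
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            reports.append(
                (float(match.group("ts")), match.group("comm"), int(match.group("pid")))
            )
    return reports


def intervals(reports):
    """Return the seconds elapsed between consecutive warnings."""
    times = [ts for ts, _, _ in reports]
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

To use it, pass any iterable of log lines, for example an open file handle on /var/log/kern.log (assuming that is where these messages end up on your system), then look at len(reports) and intervals(reports). Task names containing spaces are not handled by this regex.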