Comment 15 for bug 1891773

Bogdan Mart (winnie-ua) wrote :

I have the same issue:

[ 2538.464378] INFO: task kcompactd0:27 blocked for more than 120 seconds.
[ 2538.466729] Not tainted 4.15.0-122-generic #124-Ubuntu
[ 2538.468652] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2538.471451] kcompactd0 D 0 27 2 0x80000000
[ 2538.471453] Call Trace:
[ 2538.471462] __schedule+0x24e/0x880
[ 2538.471464] schedule+0x2c/0x80
[ 2538.471466] io_schedule+0x16/0x40
[ 2538.471468] __lock_page+0xff/0x140
[ 2538.471472] ? page_cache_tree_insert+0xe0/0xe0
[ 2538.471476] migrate_pages+0x91f/0xb80
[ 2538.471479] ? __ClearPageMovable+0x10/0x10
[ 2538.471482] ? isolate_freepages_block+0x3b0/0x3b0
[ 2538.471484] compact_zone+0x681/0x950
[ 2538.471486] kcompactd_do_work+0xfe/0x2a0
[ 2538.471489] ? __switch_to_asm+0x35/0x70
[ 2538.471491] ? __switch_to_asm+0x41/0x70
[ 2538.471494] kcompactd+0x86/0x1c0
[ 2538.471495] ? kcompactd+0x86/0x1c0
[ 2538.471499] ? wait_woken+0x80/0x80
[ 2538.471502] kthread+0x121/0x140
[ 2538.471504] ? kcompactd_do_work+0x2a0/0x2a0
[ 2538.471506] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2538.471508] ret_from_fork+0x35/0x40

Kernel: 4.15.0-122-generic (per the trace above)
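For reference, the 120-second threshold in the message above comes from the kernel's hung-task detector, which is tunable via sysctl. A sketch of the relevant knobs (the values shown are, to my knowledge, the upstream defaults — I have not changed them; silencing the message is of course not a fix):

```
# /etc/sysctl.d/90-hung-task.conf (sketch, default values)
kernel.hung_task_timeout_secs = 120   # seconds in D-state before a report; 0 disables the check
kernel.hung_task_warnings = 10        # number of reports emitted before the detector goes quiet
```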

A few days earlier it was hanging inside btrfs:
[12688.280246] INFO: task btrfs-transacti:597 blocked for more than 120 seconds.
[12688.285589] Not tainted 4.15.0-122-generic #124-Ubuntu
[12688.292217] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12688.305038] btrfs-transacti D 0 597 2 0x80000000
[12688.305042] Call Trace:
[12688.305054] __schedule+0x24e/0x880
[12688.305058] ? bit_wait+0x60/0x60
[12688.305059] schedule+0x2c/0x80
[12688.305061] io_schedule+0x16/0x40
[12688.305063] bit_wait_io+0x11/0x60
[12688.305065] __wait_on_bit+0x4c/0x90
[12688.305067] ? bit_wait+0x60/0x60
[12688.305068] out_of_line_wait_on_bit+0x90/0xb0
[12688.305075] ? bit_waitqueue+0x40/0x40
[12688.305116] lock_extent_buffer_for_io+0x100/0x2a0 [btrfs]
[12688.305136] btree_write_cache_pages+0x1b8/0x420 [btrfs]
[12688.305153] btree_writepages+0x5d/0x70 [btrfs]
[12688.305156] do_writepages+0x4b/0xe0
[12688.305174] ? free_extent_state+0x48/0xc0 [btrfs]
[12688.305188] ? merge_state.part.47+0x44/0x130 [btrfs]
[12688.305191] __filemap_fdatawrite_range+0xcf/0x100
[12688.305193] ? __filemap_fdatawrite_range+0xcf/0x100
[12688.305195] filemap_fdatawrite_range+0x13/0x20
[12688.305211] btrfs_write_marked_extents+0x68/0x140 [btrfs]
[12688.305226] btrfs_write_and_wait_marked_extents.constprop.20+0x4f/0x90 [btrfs]
[12688.305241] btrfs_commit_transaction+0x696/0x910 [btrfs]
[12688.305255] ? btrfs_commit_transaction+0x696/0x910 [btrfs]
[12688.305268] ? start_transaction+0x191/0x430 [btrfs]
[12688.305283] transaction_kthread+0x18d/0x1b0 [btrfs]
[12688.305286] kthread+0x121/0x140
[12688.305300] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
[12688.305302] ? kthread_create_worker_on_cpu+0x70/0x70
[12688.305305] ret_from_fork+0x35/0x40
[12688.305320] INFO: task postgres:1121 blocked for more than 120 seconds.
[12688.311393] Not tainted 4.15.0-122-generic #124-Ubuntu
[12688.314307] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12688.326485] postgres D 0 1121 1101 0x00000000
[12688.326490] Call Trace:
[12688.326502] __schedule+0x24e/0x880
[12688.326509] ? lookup_fast+0xcc/0x320
[12688.326512] schedule+0x2c/0x80
[12688.326516] schedule_preempt_disabled+0xe/0x10
[12688.326518] __mutex_lock.isra.5+0x276/0x4e0
[12688.326522] ? find_get_pages_range_tag+0x151/0x270
[12688.326524] __mutex_lock_slowpath+0x13/0x20
[12688.326526] ? __mutex_lock_slowpath+0x13/0x20
[12688.326528] mutex_lock+0x2f/0x40
[12688.326583] btrfs_log_inode_parent+0x17a/0xa80 [btrfs]
[12688.326585] ? __filemap_fdatawait_range+0x95/0x170
[12688.326587] ? __filemap_fdatawrite_range+0x50/0x100
[12688.326606] ? wait_current_trans+0x33/0x110 [btrfs]
[12688.326608] ? _cond_resched+0x19/0x40
[12688.326623] ? join_transaction+0x27/0x420 [btrfs]
[12688.326641] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[12688.326659] btrfs_sync_file+0x375/0x530 [btrfs]
[12688.326667] vfs_fsync_range+0x51/0xb0
[12688.326670] do_fsync+0x3d/0x70
[12688.326672] SyS_fsync+0x10/0x20
[12688.326677] do_syscall_64+0x73/0x130
[12688.326681] entry_SYSCALL_64_after_hwframe+0x41/0xa6
[12688.326683] RIP: 0033:0x7f2f790e5af1
[12688.326684] RSP: 002b:00007fffb2bbb6f8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[12688.326686] RAX: ffffffffffffffda RBX: 0000564356da18f0 RCX: 00007f2f790e5af1
[12688.326687] RDX: 00007f2f6e9d26a0 RSI: 000000000a00000e RDI: 0000000000000044
[12688.326688] RBP: 00007fffb2bbb710 R08: 00007fffb2bd2000 R09: 00000c80c0ae158d
[12688.326689] R10: 00007fffb2bbb700 R11: 0000000000000246 R12: 000000000a00000e
[12688.326690] R13: 0000000000000008 R14: 0000564356d93db0 R15: 0000564356d83bb8
[19487.948941] SGI XFS with ACLs, security attributes, realtime, no debug enabled

I can upload a kernel dump. The workload was postgres, and the hang repeats every day; the only time it did not reproduce was while postgres autovacuum was running. I'm executing a query that copies 130 GiB of data from one table to another.
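In case it helps triage before I upload the dump, this is roughly what I plan to capture the next time it wedges. A sketch only — the script and file names are my own, it must run as root, and SysRq 'w' must be allowed by kernel.sysrq:

```shell
# Write a small collection script; nothing is executed against the kernel yet.
cat > collect-hang-info.sh <<'EOF'
#!/bin/sh
# Dump stacks of all blocked (D-state) tasks into the kernel log
# (needs root; kernel.sysrq must permit 'w').
echo w > /proc/sysrq-trigger
# Save the resulting traces with a timestamped name.
dmesg > "hang-$(date +%s).dmesg"
# List tasks currently in uninterruptible sleep and their wait channel.
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'
EOF
chmod +x collect-hang-info.sh
```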