I hadn't used sshfs on Power machines before, so there is no known-good prior version.

Mainline v4.13 - jenkins-ppc64

Steps on host:

virsh start {guest}
sshfs ozlabs@jenkins-ppc64:/home/ozlabs jenkins-ppc64/ -o reconnect,idmap=user
cd jenkins-ppc64/linux
make -j 400 ; # something with a lot of read on sshfs

# on another terminal
virsh suspend jenkins-ppc64
killall -9 cc (or whatever process is reading)

# resume guest
root@p87:~# virsh resume jenkins-ppc64
Domain jenkins-ppc64 resumed

root@p87:~# virsh list
 Id    Name                           State
----------------------------------------------------
 1     jenkins                        running
 2     jenkins-ppc64                  running

root@p87:~# virsh list
 Id    Name                           State
----------------------------------------------------
 1     jenkins                        running
 2     jenkins-ppc64                  running

root@p87:~# virsh list
 Id    Name                           State
----------------------------------------------------
 1     jenkins                        running
 2     jenkins-ppc64                  running

root@p87:~# virsh list
 Id    Name                           State
----------------------------------------------------
 1     jenkins                        running
 2     jenkins-ppc64                  running

root@p87:~# virsh console jenkins-ppc64
Connected to domain jenkins-ppc64
Escape character is ^]
Shared connection to p87 closed.

No further network connections to p87 could be established.

I happened to have an FSP ipmi sol activate console open (large amounts of tg3 0005:09:00.0 enP5p9s0f0 output removed; apologies if I missed backtraces from specific CPUs):

root@p87:~# Watchdog CPU:160 detected Hard LOCKUP other CPUS:40
watchdog: BUG: soft lockup - CPU#64 stuck for 23s! [ksmd:1695]
watchdog: BUG: soft lockup - CPU#160 stuck for 22s! [qemu-system-ppc:13057]
INFO: rcu_sched self-detected stall on CPU
64-...: (2600 ticks this GP) idle=8b2/140000000000001/0 softirq=40596/40596 fqs=1148
(t=2601 jiffies g=18364 c=18363 q=1057)
INFO: rcu_sched detected stalls on CPUs/tasks:
Watchdog CPU:24 detected Hard LOCKUP other CPUS:160
Watchdog CPU:160 Hard LOCKUP
40-...: (1 GPs behind) idle=4a2/140000000000000/0 softirq=40283/40285 fqs=1149
64-...: (2601 ticks this GP) idle=8b2/140000000000002/0 softirq=40596/40596 fqs=1149
Watchdog CPU:8 Hard LOCKUP
Watchdog CPU:8 became unstuck
Watchdog CPU:56 detected Hard LOCKUP other CPUS:112
Watchdog CPU:24 Hard LOCKUP
Watchdog CPU:24 became unstuck
rcu_sched kthread starved for 1074 jiffies! g18364 c18363 f0x0 RCU_GP_DOING_FQS(4) ->state=0x0
Watchdog CPU:112 became unstuck
Watchdog CPU:8 detected Hard LOCKUP other CPUS:64
Watchdog CPU:64 Hard LOCKUP
watchdog: BUG: soft lockup - CPU#64 stuck for 22s! [ksmd:1695]
watchdog: BUG: soft lockup - CPU#144 stuck for 22s! [qemu-system-ppc:21857]
INFO: rcu_sched self-detected stall on CPU
64-...: (9303 ticks this GP) idle=8b2/140000000000001/0 softirq=40596/40596 fqs=3608
(t=10406 jiffies g=18364 c=18363 q=5303)
INFO: rcu_sched detected stalls on CPUs/tasks:
40-...: (1 GPs behind) idle=4a2/140000000000000/0 softirq=40283/40285 fqs=3608
64-...: (9303 ticks this GP) idle=8b2/140000000000001/0 softirq=40596/40596 fqs=3608
(detected by 8, t=11509 jiffies, g=18364, c=18363, q=5786)
Watchdog CPU:24 detected Hard LOCKUP other CPUS:8
rcu_sched kthread starved for 2206 jiffies! g18364 c18363 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
Watchdog CPU:8 became unstuck
sd 0:2:1:0: [sdb] tag#13 Resetting device
watchdog: BUG: soft lockup - CPU#64 stuck for 24s! [ksmd:1695]
ipr 0001:08:00.0: Timed out waiting for aborted commands
sd 0:2:3:0: [sdd] tag#5 Resetting device
INFO: rcu_sched self-detected stall on CPU
64-...: (16006 ticks this GP) idle=8b2/140000000000001/0 softirq=40596/40596 fqs=6187
(t=18211 jiffies g=18364 c=18363 q=9258)
INFO: rcu_sched detected stalls on CPUs/tasks:
ipr 0001:08:00.0: Timed out waiting for aborted commands
sd 0:2:5:0: [sdf] tag#11 Resetting device
tg3 0005:09:00.0 enP5p9s0f0: transmit timed out, resetting
tg3 0005:09:00.0 enP5p9s0f0: 0x00000000: 0x165714e4, 0x00100546, 0x02000001, 0x00800000
tg3 0005:09:00.0 enP5p9s0f0: 0x00000010: 0x0000000c, 0x00002501, 0x0001000c, 0x00002501
Sending NMI from CPU 152 to CPUs 64:
Watchdog CPU:32 Hard LOCKUP
Modules linked in: fuse vhost_net vhost tap iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack overlay bridge stp llc binfmt_misc kvm_hv kvm vmx_crypto powernv_op_panel powernv_rng rng_core leds_powernv led_class autofs4 xfs btrfs lzo_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath mlx4_en raid10 crc32c_vpmsum lpfc be2net crc_t10dif crct10dif_generic crct10dif_common mlx4_core
irq event stamp: 3244340
hardirqs last enabled at (3244339): [] _raw_spin_unlock_irqrestore+0x50/0xd0
hardirqs last disabled at (3244340): [] __schedule+0x128/0x1050
softirqs last enabled at (3148130): [] __do_softirq+0x4e8/0x710
softirqs last disabled at (3148101): [] irq_exit+0x108/0x150
CPU: 32 PID: 9 Comm: rcu_sched Tainted: G W L 4.13.0 #1
task: c000001ff2411700 task.stack: c000001ff2498000
NIP: c0000000001bdae0 LR: c000000000154aa0 CTR: c000000000924a40
REGS: c00000003fe7fd80 TRAP: 0900 Tainted: G W L (4.13.0)
MSR: 900000000280b033 CR: 28002242 XER: 20000000
CFAR: c000000000154a9c SOFTE: 0
GPR00: c000000000154ce8 c000001ff249b510 c000000001042100 c000001fffb26a70
GPR04: c000001fee4464a8 c000001fffb21d60 0000000292308e16 c000000001116680
GPR08: 0000000000000000 c000000001116680 c000001ff2498000 00000000000047bc
GPR12: c000000000924a40 c00000000fd8a000 00000000000047bc 0000000000000000
GPR16: 0000000000000000 0000000000000003 0000000000000000 0000000000000001
GPR20: c000001fffb26000 c000001ff2411700 000000000000ba7e c000000000f8eb20
GPR24: 000000000000ba7e 0000000000000001 0000000000000000 c000001fffb260a0
GPR28: c000001fffb26000 0000000000000001 c000001ff2411800 c000001fffb260a0
NIP [c0000000001bdae0] hrtimer_active+0x0/0x90
LR [c000000000154aa0] task_tick_fair+0x350/0x6d0
Call Trace:
[c000001ff249b510] [c000000000154ce8] task_tick_fair+0x598/0x6d0 (unreliable)
[c000001ff249b5f0] [c000000000140c4c] scheduler_tick+0xac/0x1b0
[c000001ff249b650] [c0000000001bd60c] update_process_times+0x5c/0x90
[c000001ff249b680] [c0000000001d732c] tick_sched_handle.isra.5+0x2c/0xc0
[c000001ff249b6b0] [c0000000001d7418] tick_sched_timer+0x58/0xd0
[c000001ff249b6f0] [c0000000001be674] __hrtimer_run_queues+0x154/0x790
[c000001ff249b780] [c0000000001bf9c0] hrtimer_interrupt+0xe0/0x330
[c000001ff249b850] [c000000000026e50] __timer_interrupt+0xb0/0x520
[c000001ff249b8b0] [c0000000000277b0] timer_interrupt+0x90/0xe0
[c000001ff249b8e0] [c000000000009280] decrementer_common+0x160/0x170
--- interrupt: 901 at .L142+0x0/0x4
LR = arch_local_irq_restore.part.5+0xa8/0xc0
[c000001ff249bbd0] [0000000000000001] 0x1 (unreliable)
[c000001ff249bbf0] [c000000000b2312c] _raw_spin_unlock_irqrestore+0x5c/0xd0
[c000001ff249bc20] [c0000000001afb8c] force_qs_rnp+0x21c/0x240
[c000001ff249bca0] [c0000000001b0488] rcu_gp_kthread+0x8d8/0x1bc0
[c000001ff249bdc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001ff249be30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
3920ffff 79290040 7d234b78 4e800020 3c4c00e8 38424640 3d22ff18 f8830040
3929cbb0 f9230028 4e800020 60420000 e92a0000 81090038 710a0001
Watchdog CPU:32 became unstuck
NMI backtrace for cpu 64
CPU: 64 PID: 1695 Comm: ksmd Tainted: G W L 4.13.0 #1
task: c000001feaf8ba00 task.stack: c000001fe9010000
NIP: c0000000001deb08 LR: c0000000001deac4 CTR: c00000000008ebd0
REGS: c000001fe90134b0 TRAP: 0501 Tainted: G W L (4.13.0)
MSR: 9000000000009033 CR: 44444424 XER: 00000000
CFAR: c0000000001deb10 SOFTE: 1
GPR00: c0000000001dea98 c000001fe9013730 c000000001042100 0000000000000028
GPR04: 0000000000000028 0000000000000028 0000000000000000 0000000000000004
GPR08: c000000001083bf8 0000000000000001 c000001fffd29d98 c000003fff727080
GPR12: c00000000008ebd0 c00000000fd94000
NIP [c0000000001deb08] smp_call_function_many+0x398/0x460
LR [c0000000001deac4] smp_call_function_many+0x354/0x460
Call Trace:
[c000001fe9013730] [c0000000001dea98] smp_call_function_many+0x328/0x460 (unreliable)
[c000001fe90137a0] [c0000000001dec1c] smp_call_function+0x4c/0x70
[c000001fe90137d0] [c0000000000711c4] pmdp_invalidate+0x74/0xb0
[c000001fe9013800] [c0000000003717f0] __split_huge_pmd+0x6f0/0xcc0
[c000001fe90138c0] [c000000000329b24] try_to_unmap_one+0x6d4/0x830
[c000001fe90139a0] [c0000000003282f4] rmap_walk_anon+0x164/0x3b0
[c000001fe9013a10] [c00000000032b444] try_to_unmap+0xa4/0x160
[c000001fe9013a70] [c000000000373a4c] split_huge_page_to_list+0x18c/0xbb0
[c000001fe9013b30] [c000000000352b2c] try_to_merge_one_page+0x2ac/0xa70
[c000001fe9013c40] [c00000000035335c] try_to_merge_with_ksm_page+0x6c/0xf0
[c000001fe9013c90] [c000000000354a70] ksm_scan_thread+0x9c0/0x1af0
[c000001fe9013dc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001fe9013e30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
3d020004 39081af8 78691f24 e95e0000 7d28482a 7d4a4a14 812a0018 71290001
4182001c 60420000 7c210b78 7c421378 <812a0018> 71290001 4082fff0 7c2004ac
rcu_sched kthread starved for 1093 jiffies! g18364 c18363 f0x0 RCU_GP_DOING_FQS(4) ->state=0x0
rcu_sched S 8960 9 2 0x00000800
Call Trace:
[c000001ff249b850] [c000001ff249b930] 0xc000001ff249b930 (unreliable)
[c000001ff249ba20] [c00000000001f258] __switch_to+0x278/0x4a0
[c000001ff249ba80] [c000000000b1a97c] __schedule+0x3dc/0x1050
[c000001ff249bb60] [c000000000b1b63c] schedule+0x4c/0xe0
[c000001ff249bb90] [c000000000b219a8] schedule_timeout+0xa8/0x5e0
[c000001ff249bca0] [c0000000001b06c8] rcu_gp_kthread+0xb18/0x1bc0
[c000001ff249bdc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001ff249be30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Watchdog CPU:152 became unstuck
...
tg3 0005:09:00.0 enP5p9s0f0: 3: NAPI info [0000003e:0000003e:(0000:0000:01ff):0ab0:(02b0:02b0:0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: 4: Host status block [00000001:000000b1:(0000:0000:0528):(0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: 4: NAPI info [000000a5:000000a5:(0000:0000:01ff):051c:(051c:051c:0000:0000)]
systemd[1]: systemd-journald.service: Processes still around after SIGKILL. Ignoring.
INFO: rcu_sched self-detected stall on CPU
64-...: (62926 ticks this GP) idle=8b2/140000000000001/0 softirq=40596/40596 fqs=22240
(t=72846 jiffies g=18364 c=18363 q=44508)
Sending NMI from CPU 64 to CPUs 40:
INFO: rcu_sched detected stalls on CPUs/tasks:
systemd[1]: systemd-udevd.service: Processes still around after SIGKILL. Ignoring.
Watchdog CPU:56 Hard LOCKUP
Modules linked in: fuse vhost_net vhost tap iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack overlay bridge stp llc binfmt_misc kvm_hv kvm vmx_crypto powernv_op_panel powernv_rng rng_core leds_powernv led_class autofs4 xfs btrfs lzo_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath mlx4_en raid10 crc32c_vpmsum lpfc be2net crc_t10dif crct10dif_generic crct10dif_common mlx4_core
irq event stamp: 24447578
hardirqs last enabled at (24447578): [] _raw_spin_unlock_irqrestore+0x50/0xd0
hardirqs last disabled at (24447577): [] _raw_spin_lock_irqsave+0x34/0xa0
softirqs last enabled at (24442107): [] __do_softirq+0x4e8/0x710
softirqs last disabled at (24442100): [] irq_exit+0x108/0x150
CPU: 56 PID: 13007 Comm: qemu-system-ppc Tainted: G W L 4.13.0 #1
task: c000003efc71a280 task.stack: c000003efc7a4000
NIP: c0000000001831f4 LR: c000000000b22ea4 CTR: c0000000007d25a0
REGS: c00000003fd5fd80 TRAP: 0900 Tainted: G W L (4.13.0)
MSR: 9000000000009033 CR: 28022224 XER: 20000000
CFAR: c00000000018320c SOFTE: 0
GPR00: c000000000b22e98 c000003efc7a70d0 c000000001042100 c000000000ef6b80
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
GPR08: 0000000000000000 0000000080000040 0000000080000038 9000000000001003
GPR12: 0000000000002200 c00000000fd91800 0000000000000000 c000000000d1b8c0
GPR16: c000000000d1b8f0 c000000000d1b928 c000000000ef6f80 c000000000ef5f80
GPR20: c000000000ced068 c000000000d1b890 c000000000ebee48 c000000000f35f80
GPR24: c000000001087d50 c000000001088210 c000000001088208 0000003ffe650000
GPR28: c000000000f38080 c000000000ed6b80 0000000000000000 c000000000ef6b80
NIP [c0000000001831f4] do_raw_spin_lock+0x94/0x210
LR [c000000000b22ea4] _raw_spin_lock_irqsave+0x74/0xa0
Call Trace:
[c000003efc7a70d0] [c000000000f38080] rcu_struct_flavors+0x0/0x10 (unreliable)
[c000003efc7a7100] [c000000000b22e98] _raw_spin_lock_irqsave+0x68/0xa0
[c000003efc7a7140] [c0000000001b2f38] rcu_check_callbacks+0xa58/0xd30
[c000003efc7a7280] [c0000000001bd5ec] update_process_times+0x3c/0x90
[c000003efc7a72b0] [c0000000001d732c] tick_sched_handle.isra.5+0x2c/0xc0
[c000003efc7a72e0] [c0000000001d7418] tick_sched_timer+0x58/0xd0
[c000003efc7a7320] [c0000000001be674] __hrtimer_run_queues+0x154/0x790
[c000003efc7a73b0] [c0000000001bf9c0] hrtimer_interrupt+0xe0/0x330
[c000003efc7a7480] [c000000000026e50] __timer_interrupt+0xb0/0x520
[c000003efc7a74e0] [c0000000000277b0] timer_interrupt+0x90/0xe0
[c000003efc7a7510] [c000000000009280] decrementer_common+0x160/0x170
--- interrupt: 901 at .L142+0x0/0x4
LR = arch_local_irq_restore.part.5+0xa8/0xc0
[c000003efc7a7800] [0000000000000038] 0x38 (unreliable)
[c000003efc7a7820] [d0000000182980e0] kvmppc_run_core+0x1028/0x2140 [kvm_hv]
[c000003efc7a79d0] [d00000001829a028] kvmppc_vcpu_run_hv+0x3a0/0x1ea0 [kvm_hv]
[c000003efc7a7b10] [d000000017fa62d4] kvmppc_vcpu_run+0x2c/0x48 [kvm]
[c000003efc7a7b30] [d000000017fa2a30] kvm_arch_vcpu_ioctl_run+0x108/0x320 [kvm]
[c000003efc7a7bd0] [d000000017f953bc] kvm_vcpu_ioctl+0x414/0x8f8 [kvm]
[c000003efc7a7d40] [c0000000003b922c] do_vfs_ioctl+0xcc/0xa80
[c000003efc7a7de0] [c0000000003b9c40] SyS_ioctl+0x60/0x100
[c000003efc7a7e30] [c00000000000b96c] system_call+0x58/0x6c
Instruction dump:
40c2fff0 7c2004ac 2fa90000 409e0020 a12d0008 913f0008 e92d0250 f93f0010
38210030 ebe1fff8 4e800020 7c210b78 89290009 71290002 408200f0
NMI backtrace for cpu 64
40-...: (1 GPs behind) idle=4a2/140000000000000/0 softirq=40283/40285 fqs=22241
Watchdog CPU:0 Hard LOCKUP
Modules linked in: fuse vhost_net vhost tap iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack overlay bridge stp llc binfmt_misc kvm_hv kvm vmx_crypto powernv_op_panel powernv_rng rng_core leds_powernv led_class autofs4 xfs btrfs lzo_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath mlx4_en raid10 crc32c_vpmsum lpfc be2net crc_t10dif crct10dif_generic crct10dif_common mlx4_core
irq event stamp: 3244340
hardirqs last enabled at (3244339): [] _raw_spin_unlock_irqrestore+0x50/0xd0
hardirqs last disabled at (3244340): [] __schedule+0x128/0x1050
softirqs last enabled at (3148130): [] __do_softirq+0x4e8/0x710
softirqs last disabled at (3148101): [] irq_exit+0x108/0x150
CPU: 0 PID: 9 Comm: rcu_sched Tainted: G W L 4.13.0 #1
task: c000001ff2411700 task.stack: c000001ff2498000
NIP: c00000000014f680 LR: c000000000140c58 CTR: c000000000924a40
REGS: c00000003ffffd80 TRAP: 0900 Tainted: G W L (4.13.0)
MSR: 9000000000009033 CR: 28000242 XER: 20000000
CFAR: c000000000155348 SOFTE: 0
GPR00: c000000000140c58 c000001ff249b5f0 c000000001042100 c000001fff326000
GPR04: 0000000000000400 0000000000000001 00000002a5d78cbc c000000001116680
GPR08: c000000001083bf8 000000000002d348 0000001ffe450000 00000000000047bc
GPR12: 0000000000000000 c00000000fd80000 00000000000047bc 0000000000000000
GPR16: 0000000000000000 0000000000000003 0000000000000004 0000000000000000
GPR20: c000000000d1b1a8 c000001fff311140 0000000000000001 00000848af56c7bb
GPR24: c000001fff311098 0000000000000000 c000001ff2411700 c000001fff326018
GPR28: c000000001083bf8 0000000000000000 c000000000ed6000 c000001fff326000
NIP [c00000000014f680] cpu_load_update+0x10/0x200
LR [c000000000140c58] scheduler_tick+0xb8/0x1b0
Call Trace:
[c000001ff249b5f0] [c000000000140c58] scheduler_tick+0xb8/0x1b0 (unreliable)
[c000001ff249b650] [c0000000001bd60c] update_process_times+0x5c/0x90
[c000001ff249b680] [c0000000001d732c] tick_sched_handle.isra.5+0x2c/0xc0
[c000001ff249b6b0] [c0000000001d7418] tick_sched_timer+0x58/0xd0
[c000001ff249b6f0] [c0000000001be674] __hrtimer_run_queues+0x154/0x790
[c000001ff249b780] [c0000000001bf9c0] hrtimer_interrupt+0xe0/0x330
[c000001ff249b850] [c000000000026e50] __timer_interrupt+0xb0/0x520
[c000001ff249b8b0] [c0000000000277b0] timer_interrupt+0x90/0xe0
[c000001ff249b8e0] [c000000000009280] decrementer_common+0x160/0x170
--- interrupt: 901 at .L142+0x0/0x4
LR = arch_local_irq_restore.part.5+0xa8/0xc0
[c000001ff249bbd0] [0000000000000001] 0x1 (unreliable)
[c000001ff249bbf0] [c000000000b2312c] _raw_spin_unlock_irqrestore+0x5c/0xd0
[c000001ff249bc20] [c0000000001afb8c] force_qs_rnp+0x21c/0x240
[c000001ff249bca0] [c0000000001b0488] rcu_gp_kthread+0x8d8/0x1bc0
[c000001ff249bdc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001ff249be30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
7ce041ad 40c2fff4 4e800020 60420000 3c4c00ef 38422aa0 4bff77b0 60420000
3c4c00ef 38422a90 7c0802a6 e9230090 <7c6c1b78> fb41ffd0 fb61ffd8 fb81ffe0
tg3 0005:09:00.0 enP5p9s0f0: transmit timed out, resetting
NMI backtrace for cpu 64
CPU: 64 PID: 1695 Comm: ksmd Tainted: G W L 4.13.0 #1
task: c000001feaf8ba00 task.stack: c000001fe9010000
NIP: c0000000001deb08 LR: c0000000001deac4 CTR: c00000000008ebd0
REGS: c000001fe90134b0 TRAP: 0501 Tainted: G W L (4.13.0)
MSR: 9000000000009033 CR: 44444424 XER: 00000000
CFAR: c0000000001deb10 SOFTE: 1
GPR00: c0000000001dea98 c000001fe9013730 c000000001042100 0000000000000028
GPR04: 0000000000000028 0000000000000028 0000000000000000 0000000000000004
GPR08: c000000001083bf8 0000000000000001 c000001fffd29d98 c000003fff727080
GPR12: c00000000008ebd0 c00000000fd94000
NIP [c0000000001deb08] smp_call_function_many+0x398/0x460
LR [c0000000001deac4] smp_call_function_many+0x354/0x460
Call Trace:
[c000001fe9013730] [c0000000001dea98] smp_call_function_many+0x328/0x460 (unreliable)
[c000001fe90137a0] [c0000000001dec1c] smp_call_function+0x4c/0x70
[c000001fe90137d0] [c0000000000711c4] pmdp_invalidate+0x74/0xb0
[c000001fe9013800] [c0000000003717f0] __split_huge_pmd+0x6f0/0xcc0
[c000001fe90138c0] [c000000000329b24] try_to_unmap_one+0x6d4/0x830
[c000001fe90139a0] [c0000000003282f4] rmap_walk_anon+0x164/0x3b0
[c000001fe9013a10] [c00000000032b444] try_to_unmap+0xa4/0x160
[c000001fe9013a70] [c000000000373a4c] split_huge_page_to_list+0x18c/0xbb0
[c000001fe9013b30] [c000000000352b2c] try_to_merge_one_page+0x2ac/0xa70
[c000001fe9013c40] [c00000000035335c] try_to_merge_with_ksm_page+0x6c/0xf0
[c000001fe9013c90] [c000000000354a70] ksm_scan_thread+0x9c0/0x1af0
[c000001fe9013dc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001fe9013e30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
3d020004 39081af8 78691f24 e95e0000 7d28482a 7d4a4a14 812a0018 71290001
4182001c 60420000 7c210b78 7c421378 <812a0018> 71290001 4082fff0 7c2004ac
rcu_sched kthread starved for 1095 jiffies! g18364 c18363 f0x0 RCU_GP_DOING_FQS(4) ->state=0x0
rcu_sched R running task 8960 9 2 0x00000804
Call Trace:
Watchdog CPU:16 became unstuck
systemd[1]: systemd-logind.service: Processes still around after SIGKILL. Ignoring.
tg3 0005:09:00.0 enP5p9s0f0: 0x00000000: 0x165714e4, 0x00100546, 0x02000001, 0x00800000
tg3 0005:09:00.0 enP5p9s0f0: 0x00000010: 0x0000000c, 0x00002501, 0x0001000c, 0x00002501
tg3 0005:09:00.0 enP5p9s0f0: 0x00000020: 0x0002000c, 0x00002501, 0x00000000, 0x04201014
tg3 0005:09:00.0 enP5p9s0f0: 0x00000030: 0x00000000, 0x00000048, 0x00000000, 0x00000100
tg3 0005:09:00.0 enP5p9s0f0: 0x00000040: 0x00000000, 0xc5000000, 0xc8035001, 0x64002008
.....
tg3 0005:09:00.0 enP5p9s0f0: 3: Host status block [00000001:00000050:(0000:0000:0000):(0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: 3: NAPI info [0000003e:0000003e:(0000:0000:01ff):0ab0:(02b0:02b0:0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: 4: Host status block [00000001:000000b1:(0000:0000:0528):(0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: 4: NAPI info [000000a5:000000a5:(0000:0000:01ff):051c:(051c:051c:0000:0000)]
tg3 0005:09:00.0 enP5p9s0f0: transmit timed out, resetting
watchdog: BUG: soft lockup - CPU#64 stuck for 23s! [ksmd:1695]
Modules linked in: fuse vhost_net vhost tap iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack overlay bridge stp llc binfmt_misc kvm_hv kvm vmx_crypto powernv_op_panel powernv_rng rng_core leds_powernv led_class autofs4 xfs btrfs lzo_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath mlx4_en raid10 crc32c_vpmsum lpfc be2net crc_t10dif crct10dif_generic crct10dif_common mlx4_core
irq event stamp: 822355
hardirqs last enabled at (822355): [] fast_exc_return_irq+0x28/0x34
hardirqs last disabled at (822354): [] decrementer_common+0x148/0x170
softirqs last enabled at (779544): [] __do_softirq+0x4e8/0x710
softirqs last disabled at (779529): [] irq_exit+0x108/0x150
CPU: 64 PID: 1695 Comm: ksmd Tainted: G W L 4.13.0 #1
task: c000001feaf8ba00 task.stack: c000001fe9010000
NIP: c0000000001deb08 LR: c0000000001deac4 CTR: c00000000008ebd0
REGS: c000001fe90134b0 TRAP: 0901 Tainted: G W L (4.13.0)
MSR: 9000000000009033 CR: 44444424 XER: 00000000
CFAR: c0000000001deb10 SOFTE: 1
GPR00: c0000000001dea98 c000001fe9013730 c000000001042100 0000000000000028
GPR04: 0000000000000028 0000000000000028 0000000000000000 0000000000000004
GPR08: c000000001083bf8 0000000000000001 c000001fffd29d98 c000003fff727080
GPR12: c00000000008ebd0 c00000000fd94000
NIP [c0000000001deb08] smp_call_function_many+0x398/0x460
LR [c0000000001deac4] smp_call_function_many+0x354/0x460
Call Trace:
[c000001fe9013730] [c0000000001dea98] smp_call_function_many+0x328/0x460 (unreliable)
[c000001fe90137a0] [c0000000001dec1c] smp_call_function+0x4c/0x70
[c000001fe90137d0] [c0000000000711c4] pmdp_invalidate+0x74/0xb0
[c000001fe9013800] [c0000000003717f0] __split_huge_pmd+0x6f0/0xcc0
[c000001fe90138c0] [c000000000329b24] try_to_unmap_one+0x6d4/0x830
[c000001fe90139a0] [c0000000003282f4] rmap_walk_anon+0x164/0x3b0
[c000001fe9013a10] [c00000000032b444] try_to_unmap+0xa4/0x160
[c000001fe9013a70] [c000000000373a4c] split_huge_page_to_list+0x18c/0xbb0
[c000001fe9013b30] [c000000000352b2c] try_to_merge_one_page+0x2ac/0xa70
[c000001fe9013c40] [c00000000035335c] try_to_merge_with_ksm_page+0x6c/0xf0
[c000001fe9013c90] [c000000000354a70] ksm_scan_thread+0x9c0/0x1af0
[c000001fe9013dc0] [c00000000012ca24] kthread+0x1b4/0x1c0
[c000001fe9013e30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
3d020004 39081af8 78691f24 e95e0000 7d28482a 7d4a4a14 812a0018 71290001
4182001c 60420000 7c210b78 7c421378 <812a0018> 71290001 4082fff0 7c2004ac
tg3 0005:09:00.0 enP5p9s0f0: 0x00000000: 0x165714e4, 0x00100546, 0x02000001, 0x00800000
tg3 0005:09:00.0 enP5p9s0f0: 0x00000010: 0x0000000c, 0x00002501, 0x0001000c, 0x00002501
tg3 0005:09:00.0 enP5p9s0f0: 0x00000020: 0x0002000c, 0x00002501, 0x00000000, 0x04201014