Ubuntu 18.04: pKVM - Host in hung state and out of network after a few hours of stress run on all guests

Bug #1757402 reported by bugproxy
Affects                            Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project   Fix Released  Critical    Canonical Kernel Team
linux (Ubuntu)                     Fix Released  Critical    Canonical Kernel Team
  Artful                           Won't Fix     High        Canonical Kernel Team
  Bionic                           Fix Released  Critical    Canonical Kernel Team

Bug Description

== Comment: #0 - INDIRA P. JOGA <email address hidden> - 2018-02-11 12:37:25 ==
Problem Description:
===================
After a few hours of running, the system is in a hung state with "rcu_sched detected stalls on CPUs/tasks" messages on the host IPMI console, and the host is off the network.

Steps to re-create:
==================
> Installed Ubuntu 18.04 on the boslcp3 host.

root@boslcp3:~# uname -a
Linux boslcp3 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:15:55 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@boslcp3:~# uname -r
4.13.0-25-generic

> root@boslcp3:~# ppc64_cpu --smt
SMT is off

> Hugepage setup:

echo 8500 > /proc/sys/vm/nr_hugepages
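
For illustration (not part of the original report): a guest backed by hugepages only starts if the reserved pool is actually usable. The hypothetical check below reads the default hugepage size from /proc/meminfo and maps one explicit huge page, roughly what QEMU does for hugepage-backed guest RAM. The size is read at run time because it differs between hash-MMU (16 MB) and radix-MMU (2 MB) ppc64le hosts.

/* Minimal sketch: verify the hugepage pool set up above is usable by
 * mapping and touching one explicit huge page.  Illustration only. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Read the default hugepage size from /proc/meminfo, in bytes. */
static unsigned long hugepage_bytes(void)
{
    char line[128];
    unsigned long kb = 0;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return 0;
    while (fgets(line, sizeof(line), f))
        if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
            break;
    fclose(f);
    return kb * 1024;
}

int main(void)
{
    unsigned long sz = hugepage_bytes();
    void *p;

    if (sz == 0) {
        fprintf(stderr, "no Hugepagesize found in /proc/meminfo\n");
        return 1;
    }
    p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* pool empty or not configured */
        return 1;
    }
    memset(p, 0, sz);                  /* fault the page in for real */
    printf("mapped one %lu kB huge page\n", sz / 1024);
    munmap(p, sz);
    return 0;
}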

> Defined the guests on the host machine:

Id Name State
----------------------------------------------------
 2 boslcp3g2 shut off
 3 boslcp3g3 shut off
 4 boslcp3g4 shut off
 6 boslcp3g1 shut off
 7 boslcp3g5 shut off

> Started all the guests and installed the Ubuntu 18.04 daily build on each.

root@boslcp3:~# virsh list --all
 Id Name State
----------------------------------------------------
 2 boslcp3g2 running
 3 boslcp3g3 running
 4 boslcp3g4 running
 6 boslcp3g1 running
 7 boslcp3g5 running

> Started the regression run (IO_BASE_TCP_NFS tests) on all 5 guests.
NOTE: Removed the madvise test case from the BASE focus areas.

> The run went fine for a few hours on all guests.

> After a few hours of running, the host system is in a hung state and the host console dumps CPU stall messages as below:

[SOL Session operational. Use ~? for help]
[250867.133429] INFO: rcu_sched detected stalls on CPUs/tasks:
[250867.133499] (detected by 86, t=62711832 jiffies, g=497, c=496, q=31987857)
[250867.133554] All QSes seen, last rcu_sched kthread activity 62711828 (4357609080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
[250867.133690] rcu_sched kthread starved for 62711828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
[250931.133433] INFO: rcu_sched detected stalls on CPUs/tasks:
[250931.133494] (detected by 3, t=62727832 jiffies, g=497, c=496, q=31995625)
[250931.133572] All QSes seen, last rcu_sched kthread activity 62727828 (4357625080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
[250931.133741] rcu_sched kthread starved for 62727828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
[250995.133432] INFO: rcu_sched detected stalls on CPUs/tasks:
[250995.133480] (detected by 54, t=62743832 jiffies, g=497, c=496, q=32004479)
[250995.133526] All QSes seen, last rcu_sched kthread activity 62743828 (4357641080-4294897252), jiffies_till_next_fqs=1, root ->qsmask 0x0
[250995.133645] rcu_sched kthread starved for 62743828 jiffies! g497 c496 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100

> Not able to get a prompt.

> Ping/ssh to boslcp3 also fails:

[ipjoga@kte ~]$ ping boslcp3
PING boslcp3.isst.aus.stglabs.ibm.com (10.33.0.157) 56(84) bytes of data.
From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=1 Destination Host Unreachable
From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=2 Destination Host Unreachable
From kte.isst.aus.stglabs.ibm.com (10.33.11.31) icmp_seq=3 Destination Host Unreachable

[ipjoga@kte ~]$ ssh root@boslcp3
ssh: connect to host boslcp3 port 22: No route to host

> boslcp3 is not reachable

> Attached boslcp3 host console logs

== Comment: #1 - INDIRA P. JOGA <email address hidden> - 2018-02-11 12:39:29 ==
Added host console logs.

== Comment: #24 - VIPIN K. PARASHAR <email address hidden> - 2018-02-16 05:46:13 ==
From Linux logs
===========

[72072.290071] watchdog: BUG: soft lockup - CPU#132 stuck for 22s! [CPU 12/KVM:15579]
[72072.290218] CPU: 132 PID: 15579 Comm: CPU 12/KVM Tainted: G W L 4.13.0-32-generic #35-Ubuntu
[72072.290220] task: c000200debf82e00 task.stack: c000200e140f8000
[72072.290221] NIP: c000000000c779e0 LR: c0080000166893a0 CTR: c000000000c77980
[72072.290223] REGS: c000200e140fb790 TRAP: 0901 Tainted: G W L (4.13.0-32-generic)
[72072.290224] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
[72072.290235] CR: 28024224 XER: 00000000
[72072.290236] CFAR: c000000000c779fc SOFTE: 1
[72072.290256] NIP [c000000000c779e0] _raw_spin_lock+0x60/0xe0
[72072.290261] LR [c0080000166893a0] kvmppc_pseries_do_hcall+0x548/0x8d0 [kvm_hv]
[72072.290262] Call Trace:
[72072.290263] [c000200e140fba10] [00000000ffffffff] 0xffffffff (unreliable)
[72072.290270] [c000200e140fba40] [c008000016689334] kvmppc_pseries_do_hcall+0x4dc/0x8d0 [kvm_hv]
[72072.290276] [c000200e140fbaa0] [c00800001668ab0c] kvmppc_vcpu_run_hv+0x1d4/0x470 [kvm_hv]
[72072.290287] [c000200e140fbb10] [c0080000162559cc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[72072.290299] [c000200e140fbb30] [c008000016251a80] kvm_arch_vcpu_ioctl_run+0x108/0x320 [kvm]
[72072.290309] [c000200e140fbbd0] [c008000016245018] kvm_vcpu_ioctl+0x400/0x7c8 [kvm]
[72072.290313] [c000200e140fbd40] [c0000000003c1a64] do_vfs_ioctl+0xd4/0xa00
[72072.290316] [c000200e140fbde0] [c0000000003c2454] SyS_ioctl+0xc4/0x130
[72072.290320] [c000200e140fbe30] [c00000000000b184] system_call+0x58/0x6c
[72072.290321] Instruction dump:
[72072.290323] 40c20010 7d40192d 40c2fff0 7c2004ac 2fa90000 409e001c 38210030 ebe1fff8
[72072.290331] 4e800020 60000000 60000000 60420000 <7c210b78> e92d0000 89290009 71290002
[72084.110070] watchdog: BUG: soft lockup - CPU#80 stuck for 23s! [CPU 11/KVM:15578]
[72084.110223] CPU: 80 PID: 15578 Comm: CPU 11/KVM Tainted: G W L 4.13.0-32-generic #35-Ubuntu
[72084.110225] task: c000200debfa3f00 task.stack: c000200e14080000
[72084.110226] NIP: c000000000c779e4 LR: c0080000166893a0 CTR: c000000000c77980
[72084.110229] REGS: c000200e14083790 TRAP: 0901 Tainted: G W L (4.13.0-32-generic)
[72084.110230] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
[72084.110240] CR: 28024824 XER: 00000000
[72084.110241] CFAR: c000000000c779fc SOFTE: 1
[72084.110260] NIP [c000000000c779e4] _raw_spin_lock+0x64/0xe0
[72084.110266] LR [c0080000166893a0] kvmppc_pseries_do_hcall+0x548/0x8d0 [kvm_hv]
[72084.110267] Call Trace:
[72084.110269] [c000200e14083a10] [00000000ffffffff] 0xffffffff (unreliable)
[72084.110275] [c000200e14083a40] [c008000016689334] kvmppc_pseries_do_hcall+0x4dc/0x8d0 [kvm_hv]
[72084.110280] [c000200e14083aa0] [c00800001668ab0c] kvmppc_vcpu_run_hv+0x1d4/0x470 [kvm_hv]
[72084.110292] [c000200e14083b10] [c0080000162559cc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[72084.110303] [c000200e14083b30] [c008000016251a80] kvm_arch_vcpu_ioctl_run+0x108/0x320 [kvm]
[72084.110314] [c000200e14083bd0] [c008000016245018] kvm_vcpu_ioctl+0x400/0x7c8 [kvm]
[72084.110318] [c000200e14083d40] [c0000000003c1a64] do_vfs_ioctl+0xd4/0xa00
[72084.110321] [c000200e14083de0] [c0000000003c2454] SyS_ioctl+0xc4/0x130
[72084.110325] [c000200e14083e30] [c00000000000b184] system_call+0x58/0x6c
[72084.110326] Instruction dump:
[72084.110328] 7d40192d 40c2fff0 7c2004ac 2fa90000 409e001c 38210030 ebe1fff8 4e800020
[72084.110335] 60000000 60000000 60420000 7c210b78 <e92d0000> 89290009 71290002 40820050

From the kernel logs, I see that most of the CPUs reporting soft lockups are busy
executing kvmppc_pseries_do_hcall() and are waiting to acquire a spinlock.

From xmon, I also examined all CPUs, and most of them were busy executing either
kvmppc_pseries_do_hcall() or kvmppc_run_core(). I tried triggering a crash dump via the
sysrq keys, but it didn't complete. A soft lockup was logged upon reboot after the crash,
and the OS never made it back to the login prompt. Leaving the machine in the same state.

== Comment: #38 - Paul Mackerras <email address hidden> - 2018-03-01 00:11:04 ==
I looked at this system today. It was in a state where characters typed on the console would echo but it did not otherwise respond. Fortunately, ^O x was able to get me into xmon.

CPU 0x54 is in _raw_spin_lock called from kvmppc_pseries_do_hcall, spinning on a lock held by cpu 0x60. CPU 0x60 is also in _raw_spin_lock called from kvmppc_pseries_do_hcall, spinning on the same lock. In other words, CPU 0x60 is trying to acquire a lock it already holds. I will need to disassemble the kvmppc_pseries_do_hcall function to work out where it is in the source code and try to work out how it got to be trying to acquire a lock it already holds. I will need the exact kernel binary image for this. I will try to find it but if anyone could send it to me or point me to it that would be helpful.

I have left the system in xmon for now.
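
For illustration (not from the report): kernel spinlocks, like POSIX spinlocks, are not recursive, so the state described above cannot resolve; only the holder can release the lock, and the holder is the CPU spinning on it. Below is a minimal userspace C model of that self-deadlock, using pthread_spin_trylock so the demo reports the condition instead of spinning forever the way CPU 0x60 did. The name vcore_lock is a hypothetical stand-in for the lock involved.

/* Minimal userspace model (not the kernel code) of the self-deadlock
 * found above: a non-recursive spinlock re-acquired by its own holder
 * can never be taken, because only the holder could release it. */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_spinlock_t vcore_lock;   /* hypothetical stand-in name */

    pthread_spin_init(&vcore_lock, PTHREAD_PROCESS_PRIVATE);
    pthread_spin_lock(&vcore_lock);  /* first acquisition */

    /* Second acquisition by the same thread, via some nested call path
     * entered while the lock is still held.  A real spin_lock() would
     * spin here forever; trylock lets us observe the state instead. */
    if (pthread_spin_trylock(&vcore_lock) == EBUSY)
        printf("lock already held by this thread: spin_lock() would never return\n");

    pthread_spin_unlock(&vcore_lock);
    pthread_spin_destroy(&vcore_lock);
    return 0;
}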

== Comment: #40 - Paul Mackerras <email address hidden> - 2018-03-02 05:39:03 ==
CPU 0x60 was stuck in the spin_lock call inside kvm_arch_vcpu_yield_to(). Somehow it has previously taken the vcore lock and failed to release it. I could add some debugging code to try to find out where it was last taken. I don't need the machine kept in the failed state any more.

Did you have the indep_threads_mode parameter set to N on this machine when the problem occurred?

== Comment: #45 - INDIRA P. JOGA <email address hidden> - 2018-03-06 01:33:26 ==

From the host machine, while tests are running on the guests:

root@boslcp3:~# cat /sys/module/kvm_hv/parameters/indep_threads_mode
Y
root@boslcp3:

== Comment: #47 - Paul Mackerras <email address hidden> - 2018-03-07 01:10:42 ==
This patch fixes a bug in the assembly code in the KVM guest exit path which could cause the trap number in r12 to get overwritten. The effect of this would be that if the exit is due to a doorbell interrupt from another CPU, we would effectively discard the doorbell interrupt and not handle it after returning to host context. I think this is why we are seeing CPUs locked up in smp_call_function_many() - their IPI to a CPU got ignored if the CPU was in a KVM guest, hence that other CPU never ran the requested function, hence the CPU doing smp_call_function_many() would wait forever. In many cases that CPU was holding spinlocks and then other CPUs would wait forever in raw_spin_lock().
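
For illustration (not part of the report): a minimal userspace C model of this lockup mechanism, built on C11 atomics and pthreads. The variables doorbell and done are hypothetical stand-ins for the doorbell interrupt and for the completion flag that smp_call_function_many(..., wait=true) spins on; a timeout stands in for the "wait forever" that the real host could not escape.

/* Hypothetical model of the lost-doorbell lockup, not kernel code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static atomic_bool doorbell;   /* the interrupt that gets discarded */
static atomic_bool done;       /* "requested function has run" flag */

/* The target CPU: it only looks at its call queue when the doorbell
 * interrupt arrives. */
static void *remote_cpu(void *arg)
{
    (void)arg;
    while (!atomic_load(&doorbell))
        usleep(1000);              /* idle, or running a guest */
    atomic_store(&done, true);     /* run the requested function */
    return NULL;
}

int main(void)
{
    pthread_t t;
    int ms;

    pthread_create(&t, NULL, remote_cpu, NULL);

    /* The bug: the exit path clobbers the trap number, so the doorbell
     * is treated as already handled.  In this model, that means we
     * never do: atomic_store(&doorbell, true); */

    /* The smp_call_function_many(wait=true) caller: spin on completion,
     * typically while still holding spinlocks taken earlier. */
    for (ms = 0; ms < 2000; ms++) {
        if (atomic_load(&done)) {
            puts("remote function ran");
            return 0;
        }
        usleep(1000);
    }
    puts("doorbell lost: the real caller spins here forever, and every"
         " CPU wanting its locks piles up in _raw_spin_lock()");
    return 0;   /* process exit tears down the stuck remote thread */
}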

== Comment: #116 - INDIRA P. JOGA <email address hidden> - 2018-03-21 04:42:09 ==
(In reply to comment #115)
> (In reply to comment #113)
> > After 2.2 upgrade, run completed 62 hours with 2 guests running. It looks
> > the system is stable now.
>
> Hi Indira/Chethan,
>
> Can you please let know test run final status ?

Hi Vipin,

After the 2.2 upgrade, 2 guests completed 72 hours of run. The host system is doing fine along with the guests, without a crash/hang.

Regards,
Indira

== Comment: #119 - VIPIN K. PARASHAR <email address hidden> - 2018-03-21 05:33:19 ==
(In reply to comment #47)
> Created attachment 125084 [details]
> Patch to return trap number correctly from __kvmppc_vcore_entry
>
> This patch fixes a bug in the assembly code in the KVM guest exit path which
> could cause the trap number in r12 to get overwritten. The effect of this
> would be that if the exit is due to a doorbell interrupt from another CPU,
> we would effectively discard the doorbell interrupt and not handle it after
> returning to host context. I think this is why we are seeing CPUs locked up
> in smp_call_function_many() - their IPI to a CPU got ignored if the CPU was
> in a KVM guest, hence that other CPU never ran the requested function, hence
> the CPU doing smp_call_function_many() would wait forever. In many cases
> that CPU was holding spinlocks and then other CPUs would wait forever in
> raw_spin_lock().

Commit a8b48a4dcc contains the fix in the upstream kernel.
It reads:

KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry

break-fixes: fd7bacbca47a86a6f418440d8a5d7b7edbb2f8f9 a8b48a4dccea77e29462e59f1dbf0d5aa1ff167c

Revision history for this message
bugproxy (bugproxy) wrote : Kernel log - xmon

tags: added: architecture-ppc64le bugnameltc-164521 severity-critical targetmilestone-inin1804
Revision history for this message
bugproxy (bugproxy) wrote : Backtrace all CPUs - xmon

Revision history for this message
bugproxy (bugproxy) wrote : Console log - Post SysRq crash dump invocation

Revision history for this message
bugproxy (bugproxy) wrote : sosreport - host (boslcp3)

Revision history for this message
bugproxy (bugproxy) wrote : Attached boslcp3 host guest xmls

Revision history for this message
bugproxy (bugproxy) wrote : Patch to return trap number correctly from __kvmppc_vcore_entry

Revision history for this message
bugproxy (bugproxy) wrote : boslcp3 host console logs

Revision history for this message
bugproxy (bugproxy) wrote : guests console logs

Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → Critical
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: triage-g
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → Critical
Manoj Iyer (manjo)
Changed in linux (Ubuntu Bionic):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
description: updated
tags: added: kernel-bug-break-fix
Changed in linux (Ubuntu Bionic):
status: Triaged → Fix Committed
Revision history for this message
Thadeu Lima de Souza Cascardo (cascardo) wrote :

This has been committed to Bionic, as part of another bug.

However, it looks like this would affect Artful as well. Would you be able to confirm and test a patched kernel for Artful?

Thanks.
Cascardo.

Manoj Iyer (manjo)
Changed in linux (Ubuntu Artful):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
Changed in linux (Ubuntu Artful):
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-04-09 14:44 EDT-------
So... Can we declare success on this?

Revision history for this message
bugproxy (bugproxy) wrote : Attached boslcp3 host guest xmls

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-05-29 05:24 EDT-------
(In reply to comment #122)
> This has been committed to Bionic, as part of another bug.
>
> However, it looks like this would affect Artful as well. Would you be able
> to confirm and test a patched kernel for Artful?
>
> Thanks.
> Cascardo.

Hello Canonical,

IBM won't be able to test the fix for Artful (17.10), as it is no longer
in use; only the LTS release is being used.

This fix is already released and verified for Bionic (18.04),
so I think we can go ahead and close this bug now.

Manoj Iyer (manjo)
Changed in linux (Ubuntu Artful):
status: Incomplete → Won't Fix
Changed in ubuntu-power-systems:
status: Incomplete → Fix Released