2015-03-31 20:17:11 |
Chris J Arges |
description |
== Comment: #0 - Cyril Bur <cyrilbur@au1.ibm.com> - 2015-02-23 18:03:41 ==
+++ This bug was initially created as a clone of Bug #108455 +++
I was investigating the cause of some ppc64le KVM guest softlockup warnings. On a single CPU KVM guest, I ran something to keep the guest busy:
yes > /dev/null &
Followed by
(qemu) stop
Wait a while, then:
(qemu) cont
We get a softlockup error:
BUG: soft lockup - CPU#0 stuck for 9220s! [yes:2389]
.__getnstimeofday
.getnstimeofday
.ktime_get_real
.netif_receive_skb
.ibmveth_poll
.net_rx_action
.__do_softirq
.irq_exit
.__do_irq
.call_do_irq
.do_IRQ
I was going to file it away in the "don't do that" bin, but I noticed that x86 has something to detect a paused VM and avoid spewing the soft lockup error. Do we need something like this on ppc64?
commit 5d1c0f4a80a6df73395fb3fc2c302510f8f09d36
Author: Eric B Munson <emunson@mgebm.net>
Date: Sat Mar 10 14:37:28 2012 -0500
watchdog: add check for suspended vm in softlockup detector
A suspended VM can cause spurious soft lockup warnings. To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so. When the watchdog is reset successfully, clear the guest
paused flag.
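
To make the check described in that commit message concrete, here is a toy user-space model in Python. It is only a sketch, not the kernel code: the class name, the 20-second threshold and the `guest_paused` attribute are hypothetical stand-ins for whatever the kernel actually uses.

```python
class SoftLockupWatchdog:
    """Toy model of a soft lockup detector with a paused-guest check."""

    def __init__(self, threshold_s=20):
        self.threshold_s = threshold_s   # hypothetical warning threshold
        self.last_touch_s = 0            # last time the watchdog was reset
        self.guest_paused = False        # set when the host reports a pause

    def touch(self, now_s):
        """Watchdog reset succeeded: record the time, clear the paused flag."""
        self.last_touch_s = now_s
        self.guest_paused = False

    def check(self, now_s):
        """Return a warning string, or None if no warning should be printed."""
        stuck_for = now_s - self.last_touch_s
        if stuck_for <= self.threshold_s:
            return None
        if self.guest_paused:
            # The gap is explained by the host stopping the VM: skip the
            # warning and restart the measurement from now.
            self.touch(now_s)
            return None
        return f"soft lockup - CPU stuck for {stuck_for}s"
```

In this model, a 9220-second gap with `guest_paused` set produces no warning, while the same gap without the flag does.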
== Comment: #1 - Cyril Bur <cyrilbur@au1.ibm.com> - 2015-02-23 18:03:55 ==
Hi,
I have been working on a fix for guest kernels. This requires two patches:
1/2
commit 545a2bf742fb41f17d03486dd8a8c74ad511dec2
Author: Cyril Bur <cyrilbur@gmail.com>
Date: Thu Feb 12 15:01:24 2015 -0800
kernel/sched/clock.c: add another clock for use with the soft lockup watchdog
and 2/2
commit 4be1b29795d692d512bb67b770665d6f8ea5cb0b
Author: Cyril Bur <cyrilbur@gmail.com>
Date: Thu Feb 12 15:01:28 2015 -0800
powerpc: add running_clock for powerpc to prevent spurious softlockup warnings
Both are in upstream. |
[Impact]
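Stopping and resuming a guest via the QEMU monitor can easily trigger spurious soft lockup warnings in ppc64le KVM guests, because the time the guest spent paused is counted by the soft lockup watchdog.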
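
A rough illustration of why the choice of clock matters, as a Python sketch rather than kernel code. It assumes that the time the guest spent stopped is reported to it as stolen time; the function names, parameters and the 20-second threshold are hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000


def running_clock(local_clock_ns, stolen_ns):
    """Clock that only advances while the guest is actually running.

    Assumption: time the guest was not running is available as stolen_ns.
    """
    return local_clock_ns - stolen_ns


def is_soft_lockup(now_ns, last_touch_ns, threshold_ns=20 * NSEC_PER_SEC):
    """True if the watchdog has not been reset for longer than the threshold."""
    return now_ns - last_touch_ns > threshold_ns


# Watchdog last reset at t = 100 s; the guest is then stopped for 9220 s
# and the next check happens 1 s after it resumes.
last_touch_ns = 100 * NSEC_PER_SEC
stolen_ns = 9220 * NSEC_PER_SEC
local_now_ns = last_touch_ns + stolen_ns + 1 * NSEC_PER_SEC

# A clock that includes the pause reports a lockup.
naive_verdict = is_soft_lockup(local_now_ns, last_touch_ns)

# A clock that excludes the pause sees only 1 s of elapsed time.
running_verdict = is_soft_lockup(
    running_clock(local_now_ns, stolen_ns), last_touch_ns
)
```

Here `naive_verdict` is true and `running_verdict` is false: the watchdog only warns when the elapsed time it measures includes the pause.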
[Test Case]
On a single CPU KVM guest:
yes > /dev/null &
Followed by
(qemu) stop
Wait a while, then:
(qemu) cont
[Fix]
The following upstream commits:
commit 545a2bf742fb41f17d03486dd8a8c74ad511dec2
commit 4be1b29795d692d512bb67b770665d6f8ea5cb0b |
|