Comment 12 for bug 1826051

Christian Ehrhardt (paelzer) wrote:

"This is new to me, so hopefully it's correct."

We are walking this road together, gathering ideas on what to actually look for; your active help on this is great!

Ok, so we see that PREEMPTION_TIMER and PENDING_INTERRUPT are high.
This almost looks as if an IRQ is to be delivered but isn't properly cleared/handled by the guest.
So it would spin forever as:
1. IRQ is pending -> exit PENDING_INTERRUPT
2. Use preemption timer for immediate VMExit [1]
3. Guest receives and mis-handles the IRQ (MSR writes on that path)
4. Guest leaves the IRQ-disabled section, but the IRQ is still pending (or there is an endless supply of them), goto #1

I'd have hoped that the other counters would help a bit more, but they are all zero, which seems wrong; at least the exits should roughly match. I rechecked the command I gave you and yes (sorry), in the form I posted it, it would only probe the sleep command itself, which obviously doesn't do anything.
But since you cleared the affected host, you can just add -a to check the KVM counters globally:
 $ sudo perf stat -a -e 'kvm:*' sleep 30s
That should give you some numbers here as well.
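If the counters alone still aren't conclusive, the raw exit events can be broken down by reason. The recording commands below assume root on the host and a perf build with the kvm tracepoints; the small helper only parses the dumped text, counting the symbolic reason names that 'perf script' prints:

```shell
# Record the raw exits for 30s, dump them, then count which reasons dominate.
# (The two commented commands need root on the affected host.)
#   sudo perf record -a -e kvm:kvm_exit -- sleep 30
#   sudo perf script > kvm_exits.txt
count_exit_reasons() {
    # 'perf script' lines contain e.g. "reason PENDING_INTERRUPT"; tally them,
    # most frequent first.
    grep -oE 'reason [A-Z0-9_]+' "$1" | sort | uniq -c | sort -rn
}
```

A loop like the one suspected above should show PENDING_INTERRUPT and PREEMPTION_TIMER towering over everything else.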

I don't have a great idea yet what it is, but for the sake of "trying things" you might flip the IRQ delivery mechanism on the target system before migrating there. You can switch from the preemption_timer based mode to the older exit path with:
  kvm-intel.preemption_timer=0
On your kernel commandline or as option in /etc/modprobe.d/
After reboot check the value at:
  $ cat /sys/module/kvm_intel/parameters/preemption_timer
  Y
It should read Y by default and flip to N after the change.
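To make the switch persistent across reboots via /etc/modprobe.d/, a snippet like this works (the filename is just a suggestion):

```
# /etc/modprobe.d/kvm-preemption-timer.conf
options kvm_intel preemption_timer=0
```

If kvm_intel happens to be loaded from the initrd on your systems, an update-initramfs -u is needed as well for the option to take effect.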

Give this a try and report the exits back now that I fixed my missing -a.
If the numbers with preemption_timer disabled are massively different, feel free to report those as well.

The perf profile of the host looks pretty much as expected, matching the ongoing run/irq-delivery loop. Better numbers will help here in case there are outliers we would otherwise miss.

Unfortunately the guest is unresponsive (so we can't easily check its /proc/interrupts for the interrupt types that are looping here). Furthermore your guest samples have no symbols (likely due to the missing kallsyms), which keeps the guest kind of a black box.
And while attaching a debugger to the guest is possible [2], I don't know a way to do that after the fact, and since you need to migrate to another host to trigger your issue we can't use that :-/
Until we have an idea for that guest debugging that works better in your case we might be in the dark here.
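For a future occurrence there is one more option short of a debugger: if you can copy /proc/kallsyms and /proc/modules out of a guest while it is still healthy (i.e. before the migration), perf on the host can resolve guest-side symbols from them. The paths below are just examples:

```shell
# Run on the host while the looping guest is active; the two files were
# copied out of the guest beforehand (example paths).
sudo perf kvm --guestkallsyms=./guest-kallsyms --guestmodules=./guest-modules \
    record -a -- sleep 30
sudo perf kvm --guestkallsyms=./guest-kallsyms --guestmodules=./guest-modules report
```

That would at least turn the guest samples from hex addresses into function names.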

One approach we haven't started yet is checking for fixes in newer versions. For example bug 1791286 was somewhat similar, yet newer versions had it fixed and there were patches to backport.
Qemu 4.0 was released only very recently and isn't ready yet, but on your Bionic target you could install UCA-Stein [3]. That would lift the qemu 2.11 in Bionic to 3.1, which still is rather new. In the same fashion the target systems of the migrations could use e.g. Xenial+UCA.
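Enabling the Stein cloud archive on a Bionic target should be as simple as the following (assuming the usual software-properties tooling is installed):

```shell
# Add the UCA Stein pocket and upgrade qemu from it (run on the Bionic target).
sudo add-apt-repository cloud-archive:stein
sudo apt-get update
sudo apt-get dist-upgrade   # lifts qemu 2.11 -> 3.1
```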

This would allow us to probe a few qemu releases for your case:
Xenial / Xenial+Ocata / Bionic / Bionic-Stein
Which matches qemu versions:
2.5 / 2.8 / 2.11 / 3.1

At a minimum it would be great if you could try 3.1 (that would not even need a new base OS deploy); the tests for Xenial / Xenial+Ocata would need to install Xenial on a target machine, which is more work :-/
Maybe after all we end up bisecting qemu, either for the breaking commit since e.g. Xenial (if that worked) or for the fixing commit (if 3.1 works fine).
Let me know how much of that you can do, anything helps.

Note: I have asked around a bit but still have not heard of others affected by the same issue.
Therefore I wanted to ask one more thing to make your guests less special: could you migrate one without ceph storage and let me know whether that one fails as well?
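Assuming libvirt-managed guests, such a test could look like this (guest and host names are placeholders):

```shell
# Hypothetical: live-migrate a guest that uses only local (non-ceph) storage
# and check whether it arrives in the same hung state.
virsh migrate --live --verbose local-storage-guest qemu+ssh://target-host/system
```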

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d264ee0c2ed20c6a426663590d4fc7a36cb6abd7
[2]: https://wiki.whamcloud.com/display/LNet/Kernel+GDB+live+Debugging+with+KVM
[3]: https://wiki.ubuntu.com/OpenStack/CloudArchive#Stein