nested kvm fails with trusty and upstream kernels

Bug #1278531 reported by Serge Hallyn
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  Medium      Unassigned
Trusty          Fix Released  Medium      Unassigned

Bug Description

First: the 3.2 precise kernel handled nested qemu very well. As of saucy it has declined.

In a host with a saucy kernel (even on precise userspace), attempts to run nested kvm result in a hung kvm (inside the guest; the host proceeds OK) taking 100% CPU.

In a host with a trusty kernel (even on precise userspace), nested kvm fails to get past grub. I have two screenshots, one resulting from attempting to boot from a precise mini-iso, another from attempting to boot a cloud image at: http://cloud-images.ubuntu.com/quantal/current/quantal-server-cloudimg-amd64-disk1.img. (If you convert that image to raw, it fails the same way).

On the host, I see the following in /var/log/kern.log: kvm: zapping shadow pages for mmio generation wraparound
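
For anyone trying to reproduce this, a minimal sketch of the host-side checks follows. It assumes an Intel host where the kvm_intel module exposes its "nested" parameter under /sys/module, and that kernel messages land in /var/log/kern.log. The function names and the optional path arguments are illustrative, not part of any tool.

```shell
#!/bin/sh
# Sketch only: function names and default paths are illustrative.

# Print "enabled", "disabled" or "unknown" for the kvm_intel nested parameter.
nested_state() {
    param_file=${1:-/sys/module/kvm_intel/parameters/nested}
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 0
    fi
    # Older kernels report Y/N, newer ones 1/0.
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}

# Count occurrences of the wraparound message quoted above in a log file.
count_wraparound() {
    log_file=${1:-/var/log/kern.log}
    grep -c 'zapping shadow pages for mmio generation wraparound' "$log_file" || true
}
```

Calling nested_state and count_wraparound with no arguments uses the default paths. A result of "unknown" most likely means kvm_intel is not loaded on that machine.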

I've reproduced this on (a) an Intel-based Vostro laptop with separate installs of precise and saucy (with Ubuntu precise, saucy, trusty, and upstream kernels), (b) an Intel-based server with precise userspace and saucy and trusty kernels, and (c) an Intel laptop running fully up-to-date trusty.

As nested qemu worked well in the previous LTS, I think it is important to have it working in 14.04 LTS.

============= Original description ================
I have a precise host with saucy ubuntu kernel installed. I installed two VMs there, a saucy and a trusty guest.

In the saucy guest, non-accelerated qemu works fine, but accelerated kvm hangs the first-level saucy guest completely, and pins it at 200% cpu usage:

   qemu-system-x86 --enable-kvm -monitor stdio -vnc :1

On the trusty guest it works just fine.

Tags: saucy
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1278531

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: saucy
Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: nested kvm on saucy kernel hangs

apport-collect isn't working for me - I authenticate (on another box) using the given link to allow the box to upload to the bug, but it keeps trying to open a new link in w3m.

In any case, there are no messages at all in /var/log/kern.log, unfortunately - either on host or in guest. It simply hangs, hard.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: nested kvm fails with trust and upstream kernels

Screenshot trying to boot a mini-iso from http://archive.ubuntu.com/ubuntu/dists/${release}/main/installer-${arch}/current/images/netboot/mini.iso, with both trusty and precise amd64, using command line:

kvm -enable-kvm -drive file=x.img,if=virtio,cache=none -cdrom precise-mini-amd64.iso -boot d -vnc :1

Running the command inside a 12.04.4 guest on a precise host with trusty
kernel.

description: updated
summary: - nested kvm on saucy kernel hangs
+ nested kvm fails with trust and upstream kernels
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Screenshot trying to boot a cloud image from
http://cloud-images.ubuntu.com/quantal/current/quantal-server-cloudimg-amd64-disk1.img

The same thing happens using another cloud image release, as well
as converting the (qcow2) image to a raw image.

kvm -drive file=quantal-server-cloudimg-amd64-disk1.img,if=virtio,format=qcow2,cache=none -vnc :2 -m 512 -enable-kvm

Running the command inside a 12.04.4 guest on a precise host with trusty
kernel. However the exact same thing happens running the command in a
trusty guest on a trusty host.
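
To try both the qcow2 and the raw variants, something like the following sketch could be used. The helper names are made up for illustration. The only external command is qemu-img, with its usual convert -f/-O options.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not from the bug report.

# Build the value passed to -drive, using the same options as the command above.
drive_arg() {
    image=$1
    format=$2   # qcow2 or raw
    printf 'file=%s,if=virtio,format=%s,cache=none\n' "$image" "$format"
}

# Convert a qcow2 image to a raw image with qemu-img.
to_raw() {
    src=$1
    dst=$2
    qemu-img convert -f qcow2 -O raw "$src" "$dst"
}
```

For example, after to_raw quantal-server-cloudimg-amd64-disk1.img quantal.raw, the raw case would be run as kvm -drive "$(drive_arg quantal.raw raw)" -vnc :2 -m 512 -enable-kvm (quantal.raw is just a placeholder file name).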

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Note that all of these do work if you run

qemu-system-x86_64 -machine accel=tcg <rest-of-args> instead of kvm (and do not add -enable-kvm :)
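
When flipping between the two modes during testing, a small helper along these lines may save some typing. It is only a sketch: machine_opt is a made-up name, and it simply prints the -machine option for the accelerator you ask for.

```shell
#!/bin/sh
# Sketch only: machine_opt is an illustrative helper name.

# Print the -machine option for the requested accelerator (kvm or tcg).
machine_opt() {
    accel=${1:-kvm}
    case "$accel" in
        kvm|tcg)
            printf '%s\n' "-machine accel=$accel"
            ;;
        *)
            echo "unknown accelerator: $accel" >&2
            return 1
            ;;
    esac
}
```

Usage would be qemu-system-x86_64 $(machine_opt tcg) <rest-of-args> for the working case and $(machine_opt kvm) for the failing one, with the substitution left unquoted so it splits into two arguments.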

tags: removed: kernel-da-key
Revision history for this message
Stefan Bader (smb) wrote :

Nested kvm on Intel (vmx) unfortunately saw quite a bit of regression starting with kernel v3.10 by
  commit 5f3d5799974b89100268ba813cec8db7bd0693fb
  KVM: nVMX: Rework event injection and recovery
Then there were several changes to nested VMX until v3.12 where things seemed to work again. Sounds a bit like 3.13 again does something bad. Saucy problems would be bug #1208455 and there is another issue right now with 32bit kvm on Trusty hosts which is tracked as bug #1268906 (just for having references).

We need to see what we can do about Saucy, the problem is that v3.11 sits right in the middle of meddling around with nested VMX. So going back may require as much change as going forward. And either way is a risk (for other regressions).

The message about zapping shadow pages looks rather like some forgotten debug code. Some index is initialized in a way that causes that to happen quite early and is supposed to ensure that case is tested (maybe it still is not, who knows, but should be less likely).

From your description it sounds like some nested VMX (again) but just to make sure I got this right. The failing combination is:
- Host: P user-space, T kernel; Lvl1: P user-space, P kernel; Lvl2: T user-space, T kernel
- Host: T user-space, T kernel; Lvl1: T user-space, T kernel; Lvl2: T user-space, T kernel
Is that correct or did I get that wrong?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Note that "what we can do about saucy" - I don't really care, except in as much as debugging it will help to fix it in trusty.

You said:

> From your description it sounds like some nested VMX (again) but just to make
> sure I got this right. The failing combination is:
> - Host: P user-space, T kernel; Lvl1: P user-space, P kernel; Lvl2: T user-space, T kernel
> - Host: T user-space, T kernel; Lvl1: T user-space, T kernel; Lvl2: T user-space, T kernel

Right, except that Lvl2 can be anything: precise, quantal, trusty all fail, all in the same way, in lvl2.

So the point is that levels 1 and 2 don't matter, nor does the qemu userspace
on the host. Only the host kernel matters.

Revision history for this message
Stefan Bader (smb) wrote :

Ok, I can recreate this locally. Right now I checked that a v3.12 kernel on the host seems to work, but our current v3.13 and an upstream v3.14-rc4 both have the hang (only looking at booting the mini.iso). So this is a regression in v3.13 that is not yet resolved. Next I will try to bisect v3.12..v3.13.
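
For reference, setting up such a bisection might look roughly like the sketch below. It assumes a clone of the mainline kernel tree with the v3.12 and v3.13 tags present. The function names and the DRY_RUN switch are illustrative additions that print the git commands instead of running them.

```shell
#!/bin/sh
# Sketch only: run, start_bisect and DRY_RUN are illustrative, not from this report.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a bisection in the given kernel tree: v3.13 is bad, v3.12 is good.
start_bisect() {
    tree=$1
    run git -C "$tree" bisect start
    run git -C "$tree" bisect bad v3.13
    run git -C "$tree" bisect good v3.12
}
```

After start_bisect, each step means building and booting the checked-out kernel on the host, trying the nested mini.iso boot, and marking the result with git bisect good or git bisect bad until git names the first bad commit.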

Stefan Bader (smb)
summary: - nested kvm fails with trust and upstream kernels
+ nested kvm fails with trusty and upstream kernels
Revision history for this message
Stefan Bader (smb) wrote :

Bisection ended on:

commit e504c9098ed6acd9e1079c5e10e4910724ad429f
Author: Anthoine Bourgeois <email address hidden>
Date: Wed Nov 13 11:45:37 2013 +0100

    kvm, vmx: Fix lazy FPU on nested guest

which sounds reasonable, as this will cause an exit to L0 (the host) in some cases where it did not before. I am preparing Trusty kernels with that patch reverted to verify. Then we need to ask upstream whether this maybe misses a counterpart in L0 handling or has other issues.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1278531] Re: nested kvm fails with trusty and upstream kernels

Awesome, thanks Stefan.

Revision history for this message
Stefan Bader (smb) wrote :

Ok, so kernels to try would be at http://people.canonical.com/~smb/lp1278531/. I was testing only the mini.iso boot with a host=T, L1=P combination. But that at least brought up the boot menu. Serge, if you have time, maybe you can verify this on one of the other cases.

Revision history for this message
Stefan Bader (smb) wrote :

L1 with T user-space currently crashes on me due to some compiz problem.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Candidate kernels do indeed fix the problem. Nesting now works both with precise and trusty userspace on the host.

Thanks Stefan!

Revision history for this message
Seth Forshee (sforshee) wrote :

I saw https://lkml.org/lkml/2014/2/27/819 come across lkml and am assuming it's meant to be a fix for this issue. Since Stefan's out today I went ahead and made a test build so you can see if it fixes the issue.

http://people.canonical.com/~sforshee/lp1278531/linux-3.13.0-14.34+lp1278531v201402280817/

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi Seth,

Would this be any different from the kernel mentioned in comment #11?
(I haven't looked at the source for that one, but it did fix the issue
for me)

Revision history for this message
Seth Forshee (sforshee) wrote :

The way I understood smb's comments that one just reverted the patch identified by the bisect. I can't confirm that with him right now though, and it doesn't look like he included whatever patches he had added for us to look at. My build includes a patch which was submitted upstream within the past day, references the commit Stefan identified, and has a reported-by: Stefan tag, so it seems likely that it's targeted at this issue. But if you'd rather wait to hear from smb that's fine too.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I see, thanks. I'll try out the other ppa as soon as my test laptop
has power :)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

This kernel does indeed work fine as well - thanks!

Revision history for this message
Stefan Bader (smb) wrote :

Cool, so we need to get that pulled into stable. Sorry for being unclear before. Yes, my kernel just had the patch from bisection reverted. The kernel Seth made had the fix for that on top.

Revision history for this message
Stefan Bader (smb) wrote :

Oh, just saw that the patch actually was sent to upstream stable already.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Trusty):
status: Confirmed → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-15.35

---------------
linux (3.13.0-15.35) trusty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1287175

  [ Andy Whitcroft ]

  * [Config] tools -- enable cpupower on ppc64el
  * [Config] ppc64el -- enable perf tools
  * [Config] powerpc -- enable perf tools
  * [Config] ppc64el -- reduce MAX_ORDER with 64k pages
  * [Config] ppc64el -- switch to 64K system pages

  [ Benjamin Herrenschmidt ]

  * SAUCE: powerpc/powernv: Add iommu DMA bypass support for IODA2

  [ Paul Mackerras ]

  * SAUCE: powerpc: Increase stack redzone for 64-bit userspace to 512
    bytes

  [ Upstream Kernel Changes ]

  * perf trace: Fix ioctl 'request' beautifier build problems on !(i386 ||
    x86_64) arches
  * kvm, vmx: Really fix lazy FPU on nested guest
    - LP: #1278531
 -- Tim Gardner <email address hidden> Mon, 03 Mar 2014 13:22:56 +0000

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
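
To check whether a Trusty host already carries the fix, the installed linux package version can be compared against 3.13.0-15.35. The following is a sketch under assumptions: has_fix is a made-up name, it relies on the version sort (-V) option of GNU sort, and it is only meaningful for the Trusty 3.13 kernel series.

```shell
#!/bin/sh
# Sketch only: has_fix is an illustrative name. Relies on `sort -V`
# (version sort) as provided by GNU coreutils.

FIXED_VERSION=3.13.0-15.35

# Succeed if the given linux package version is at least FIXED_VERSION.
has_fix() {
    have=$1
    lowest=$(printf '%s\n%s\n' "$FIXED_VERSION" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$FIXED_VERSION" ]
}
```

The version to pass in could come from dpkg-query -W -f='${Version}' "linux-image-$(uname -r)", for example has_fix 3.13.0-14.34 fails while has_fix 3.13.0-15.35 succeeds.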