qemu guest hangs on nested kvm startup with host kernel oops
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | | Undecided | Unassigned | |
| Utopic | | Medium | Chris J Arges | |
Bug Description
[Impact]
Users of nested KVM may experience the L1 VM hanging when booting an L2 VM. The cause appears to be external interrupts not reaching L1 while L2 boots.
[Test Case]
Run a nested KVM instance:
https:/
[Fix]
commit 4fa7734c62cdd8c
commit f3380ca5d7edb5e
--
I'm creating a vivid qemu guest on a trusty host with 3.13.0-48-generic kernel. When I start a guest inside that guest, I get the oops below on the host while the first guest hangs and must be (virsh) destroyed.
Apr 24 20:40:08 sergeh2 kernel: [1575627.844208] ------------[ cut here ]------------
Apr 24 20:40:08 sergeh2 kernel: [1575627.844227] WARNING: CPU: 2 PID: 17176 at /build/
Apr 24 20:40:08 sergeh2 kernel: [1575627.844229] Modules linked in: vhost_net vhost macvtap macvlan xts gf128mul xt_conntrack ipt_REJECT ip6table_filter ip6_tables ebtable_nat ebtables veth xt_nat xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables dm_crypt gpio_ich coretemp kvm_intel kvm i7core_edac edac_core lpc_ich shpchp mac_hid serio_raw lp parport btrfs libcrc32c raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear dm_snapshot raid1 nouveau mxm_wmi video i2c_algo_bit ttm drm_kms_helper drm ahci r8169 libahci mii wmi
Apr 24 20:40:08 sergeh2 kernel: [1575627.844281] CPU: 2 PID: 17176 Comm: qemu-system-x86 Not tainted 3.13.0-48-generic #80-Ubuntu
Apr 24 20:40:08 sergeh2 kernel: [1575627.844283] Hardware name: MSI MS-7522/MSI X58 Pro (MS-7522) , BIOS V8.14B8 11/09/2012
Apr 24 20:40:08 sergeh2 kernel: [1575627.844286] 0000000000000009 ffff880907561c98 ffffffff81721506 0000000000000000
Apr 24 20:40:08 sergeh2 kernel: [1575627.844290] ffff880907561cd0 ffffffff810677dd ffff880bfa808000 0000000000000014
Apr 24 20:40:08 sergeh2 kernel: [1575627.844293] ffff8806da7a7000 ffff880bfca9c800 0000000000000000 ffff880907561ce0
Apr 24 20:40:08 sergeh2 kernel: [1575627.844297] Call Trace:
Apr 24 20:40:08 sergeh2 kernel: [1575627.844305] [<ffffffff81721
Apr 24 20:40:08 sergeh2 kernel: [1575627.844310] [<ffffffff81067
Apr 24 20:40:08 sergeh2 kernel: [1575627.844314] [<ffffffff81067
Apr 24 20:40:08 sergeh2 kernel: [1575627.844321] [<ffffffffa081f
Apr 24 20:40:08 sergeh2 kernel: [1575627.844327] [<ffffffffa081f
Apr 24 20:40:08 sergeh2 kernel: [1575627.844347] [<ffffffffa03b7
Apr 24 20:40:08 sergeh2 kernel: [1575627.844364] [<ffffffffa03bb
Apr 24 20:40:08 sergeh2 kernel: [1575627.844376] [<ffffffffa03a5
Apr 24 20:40:08 sergeh2 kernel: [1575627.844381] [<ffffffff810aa
Apr 24 20:40:08 sergeh2 kernel: [1575627.844387] [<ffffffff811ff
Apr 24 20:40:08 sergeh2 kernel: [1575627.844391] [<ffffffff811d1
Apr 24 20:40:08 sergeh2 kernel: [1575627.844406] [<ffffffffa03b0
Apr 24 20:40:08 sergeh2 kernel: [1575627.844409] [<ffffffff811d1
Apr 24 20:40:08 sergeh2 kernel: [1575627.844414] [<ffffffff81731
Apr 24 20:40:08 sergeh2 kernel: [1575627.844416] ---[ end trace 351396e62b6ef224 ]---
Apr 24 20:48:29 sergeh2 dnsmasq-dhcp[1409]: DHCPREQUEST(lxcbr0) 10.0.3.104 00:16:3e:72:73:32
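When a syslog contains unrelated lines around the kernel warning, as above, it can help to pull out only the block between the "cut here" and "end trace" markers. A minimal Python sketch follows; the function name `extract_warning_blocks` is hypothetical and the sample reuses a few complete lines from the log above.

```python
def extract_warning_blocks(lines):
    """Return lists of log lines between 'cut here' and 'end trace' markers."""
    blocks, current = [], None
    for line in lines:
        if "[ cut here ]" in line:
            # Start of a new kernel warning block.
            current = [line]
        elif current is not None:
            current.append(line)
            if "---[ end trace" in line:
                # End marker closes the block; later lines are ignored.
                blocks.append(current)
                current = None
    return blocks


log_sample = [
    "Apr 24 20:40:08 sergeh2 kernel: [1575627.844208] ------------[ cut here ]------------",
    "Apr 24 20:40:08 sergeh2 kernel: [1575627.844297] Call Trace:",
    "Apr 24 20:40:08 sergeh2 kernel: [1575627.844416] ---[ end trace 351396e62b6ef224 ]---",
    "Apr 24 20:48:29 sergeh2 dnsmasq-dhcp[1409]: DHCPREQUEST(lxcbr0) 10.0.3.104 00:16:3e:72:73:32",
]

for block in extract_warning_blocks(log_sample):
    print("\n".join(block))
```

The unrelated dnsmasq line falls outside the markers and is dropped.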
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-
ProcVersionSign
Uname: Linux 3.13.0-48-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Apr 10 14:22 seq
crw-rw---- 1 root audio 116, 33 Apr 10 14:22 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg: Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
Date: Fri Apr 24 20:59:31 2015
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: MSI MS-7522
PciMultimedia:
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:
dmi.bios.date: 11/09/2012
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V8.14B8
dmi.board.
dmi.board.name: MSI X58 Pro (MS-7522)
dmi.board.vendor: MSI
dmi.board.version: 3.0
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: MS-7522
dmi.product.
dmi.sys.vendor: MSI
| Serge Hallyn (serge-hallyn) wrote : | #1 |
| Changed in linux (Ubuntu): | |
| status: | New → Incomplete |
| Changed in linux (Ubuntu): | |
| importance: | Undecided → Medium |
| tags: | added: kernel-da-key |
| tags: | added: bot-stop-nagging |
| Changed in linux (Ubuntu): | |
| status: | Incomplete → Triaged |
| Chris J Arges (arges) wrote : | #3 |
1) Can you dump the domain XML (if using libvirt), or qemu command used to invoke the VM. I'm wondering if there is some cpu feature mismatch going on.
2) Can you do 'tail /sys/module/
Looks like WARN here:
/*
* Emulate an exit from nested guest (L2) to L1, i.e., prepare to run L1
* and modify vmcs12 to make it see what it would expect to see there if
* L2 was its real guest. Must only be called when in L2 (is_guest_mode())
*/
static void nested_
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);
	int cpu;
	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
	/* trying to cancel vmlaunch/vmresume is a bug */
| Changed in linux (Ubuntu): | |
| assignee: | nobody → Chris J Arges (arges) |
| Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1448269] Re: qemu guest hangs on nested kvm startup with host kernel oops | #4 |
Quoting Chris J Arges (<email address hidden>):
> 1) Can you dump the domain XML (if using libvirt), or qemu command used to invoke the VM. I'm wondering if there is some cpu feature mismatch going on.
<domain type='kvm'>
<name>p9</name>
<uuid>
<memory unit='KiB'
<currentMemory unit='KiB'
<vcpu placement=
<os>
<type arch='x86_64' machine=
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='utc'/>
<on_poweroff>
<on_reboot>
<on_crash>
<devices>
<emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='unsafe'/>
<source file='/
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='unsafe'/>
<source file='/
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
<controller type='usb' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<interface type='network'>
<mac address=
<source network='default'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
<listen type='address' address=
</graphics>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</memballoon>
</devices>
</domain>
> 2) Can you do 'tail /sys/module/
==> /sys/module/
Y
==> /sys/module/
N
==> /sys/module/
N
==> /sys/module/
Y
==> /sys/module/
N
==> /sys/module/
Y
==> /sys/module/
Y
==> /sys/module/
Y
==> /sys/module/
0
==> /sys/module/
4096
==> /sys/module/
N
==> /sys/module/
Y
==> /sys/module/
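For comparing module parameters between hosts, the multi-file output of `tail` can be turned into a dictionary. This is only a sketch: the headers quoted above are truncated, so it assumes the untruncated `==> path <==` header format that `tail` prints, and the paths in the sample are made-up placeholders rather than the ones from this report.

```python
import re

# Header line that `tail` prints before each file when given several files.
HEADER = re.compile(r"^==> (?P<path>.+) <==$")


def parse_tail_output(text):
    """Map each file path in multi-file `tail` output to its stripped content."""
    params = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group("path")
            params[current] = []
        elif current is not None and line.strip():
            params[current].append(line.strip())
    return {path: "\n".join(lines) for path, lines in params.items()}


# Hypothetical sample; these paths are placeholders, not taken from the report.
tail_sample = """==> /tmp/params/first <==
Y

==> /tmp/params/second <==
4096
"""
print(parse_tail_output(tail_sample))
```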
| Serge Hallyn (serge-hallyn) wrote : | #5 |
With 3.16.0-37-generic on the host, I no longer get an oops on the host, and the guest instantly reboots rather than hanging.
| Serge Hallyn (serge-hallyn) wrote : | #6 |
This happens with both (first-level) guest kernels 3.19.0-20-generic and 3.19.0-18-generic
| Changed in linux (Ubuntu): | |
| status: | Triaged → In Progress |
| Stefan Kooman (stefan-n1) wrote : | #7 |
I had this issue (kernel oops) when I explicitly set "ignore_msrs" (Ubuntu Trusty, 3.13.0-54-generic):
cat /etc/modprobe.
options kvm-intel nested=y ept=y
options kvm ignore_msrs=1
Removing the option "kvm ignore_msrs=1" let the guest (L2) run, but it gave a "KVM: entry failed, hardware error 0x7" when the L1 guest (guest hypervisor) was booted with these cpu flags (libvirt):
<cpu> <arch>x86_64</arch> <model>
The guest VM (L2) runs fine with these cpu flags (libvirt) for the L1 guest VM:
<cpu match='exact'> <cpu mode='host-
@Serge Hallyn: I wonder what cpu parameters you have defined for your L1 guest (guest hypervisor)
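To compare the module options different reporters are using, the `options` lines quoted in this comment can be parsed into a dictionary. A rough Python sketch, with `parse_modprobe_options` being a hypothetical helper name; it only handles `options` lines and ignores every other modprobe.d directive.

```python
def parse_modprobe_options(text):
    """Collect `options <module> key=value ...` lines into a nested dict."""
    result = {}
    for raw in text.splitlines():
        # modprobe.d ignores everything after a '#' on a line.
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2 or fields[0] != "options":
            continue
        # modprobe treats '-' and '_' in module names as equivalent.
        module = fields[1].replace("-", "_")
        opts = result.setdefault(module, {})
        for item in fields[2:]:
            key, _, value = item.partition("=")
            opts[key] = value
    return result


conf_sample = """options kvm-intel nested=y ept=y
options kvm ignore_msrs=1
"""
print(parse_modprobe_options(conf_sample))
```

Normalizing the module name means `kvm-intel` here and `kvm_intel` in other comments end up under the same key.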
| Serge Hallyn (serge-hallyn) wrote : | #8 |
I had made no changes, i.e.
options kvm_intel nested=1
I've changed that to 0 to get past crashes during automated testing,
but have not added ignore_msrs=1.
| Chris J Arges (arges) wrote : | #9 |
Ok I can repro on this end. I'll start debugging this.
| Changed in linux (Ubuntu Utopic): | |
| assignee: | nobody → Chris J Arges (arges) |
| Changed in linux (Ubuntu): | |
| assignee: | Chris J Arges (arges) → nobody |
| status: | In Progress → Fix Released |
| Changed in linux (Ubuntu Utopic): | |
| importance: | Undecided → Medium |
| Changed in linux (Ubuntu): | |
| importance: | Medium → Undecided |
| Changed in linux (Ubuntu Utopic): | |
| status: | New → In Progress |
| description: | updated |
| Chris J Arges (arges) wrote : | #10 |
A test build with the fix is available here:
http://
| Chris J Arges (arges) wrote : | #11 |
I've tested this on my own workstation by starting nested VMs and by testing the other types of VMs I run on my system.
I've also run kvm-unit-tests on this patchset, and the results are the same as before the patches.
| Serge Hallyn (serge-hallyn) wrote : | #12 |
@arges
that kernel allows me to run accelerated kvm nested - thanks!
| Changed in linux (Ubuntu Utopic): | |
| status: | In Progress → Fix Released |
| status: | Fix Released → Fix Committed |
| Brad Figg (brad-figg) wrote : | #13 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
| tags: | added: verification-needed-utopic |
| Chris J Arges (arges) wrote : | #14 |
Verified this on my desktop.
| tags: |
added: verification-done-utopic removed: verification-needed-utopic |
| Launchpad Janitor (janitor) wrote : | #15 |
This bug was fixed in the package linux - 3.16.0-44.59
---------------
linux (3.16.0-44.59) utopic; urgency=low
[ Brad Figg ]
* Release Tracking Bug
- LP: #1472030
[ Iyappan Subramanian ]
* SAUCE: (no-up) drivers: net: xgene: fix: Out of order descriptor bytes
read
- LP: #1425576
[ Upstream Kernel Changes ]
* Revert "tools/vm: fix page-flags build"
- LP: #1471170
* NVMe: Add shutdown timeout as module parameter.
- LP: #1465136
* Drivers: hv: vmbus: Add support for VMBus panic notifier handler
- LP: #1463584
* Drivers: hv: vmbus: Correcting truncation error for constant
HV_
- LP: #1463584
* KVM: nVMX: fix lifetime issues for vmcs02
- LP: #1448269
* KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
- LP: #1448269
* mm/slab_common: support the slub_debug boot option on specific object
size
- LP: #1456952
* kvm: x86: fix kvm_apic_has_events to check for NULL pointer
* cpuidle: powernv: Populate cpuidle state details by querying the
device-tree
- LP: #1470404
* cpuidle: powernv: Read target_residency value of idle states from DT if
available
- LP: #1470404
* cpuidle: powernv: Avoid endianness conversions while parsing DT
- LP: #1470404
* cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
- LP: #1470404
* iio: adis16400: Report pressure channel scale
- LP: #1471170
* iio: adis16400: Use != channel indices for the two voltage channels
- LP: #1471170
* iio: adis16400: Compute the scan mask from channel indices
- LP: #1471170
* iio: adis16400: Remove unused variable
- LP: #1471170
* iio: adis16400: Fix burst mode
- LP: #1471170
* iio: adis16400: Fix burst transfer for adis16448
- LP: #1471170
* USB: serial: ftdi_sio: Add support for a Motion Tracker Development
Board
- LP: #1471170
* iio: adc: twl6030-gpadc: Fix modalias
- LP: #1471170
* serial: imx: Fix DMA handling for IDLE condition aborts
- LP: #1471170
* usb: dwc3: gadget: Fix incorrect DEPCMD and DGCMD status macros
- LP: #1471170
* ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion
- LP: #1471170
* n_tty: Fix auditing support for cannonical mode
- LP: #1471170
* drm/i915/hsw: Fix workaround for server AUX channel clock divisor
- LP: #1471170
* x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
- LP: #1471170
* lib: Fix strnlen_user() to not touch memory after specified maximum
- LP: #1471170
* Input: elantech - fix detection of touchpads where the revision matches
a known rate
- LP: #1471170
* ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420
- LP: #1471170
* ALSA: usb-audio: add MAYA44 USB+ mixer control names
- LP: #1471170
* ALSA: usb-audio: fix missing input volume controls in MAYA44 USB(+)
- LP: #1471170
* USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle
- LP: #1471170
* Input: elantech - add new icbody type
- LP: #1471170
* MIPS: Fix enabling of DEBUG_STACKOVERFLOW
- LP: #1471170
* xfrm: fix a race in xfrm_state_
...
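To see which changelog entries belong to a given bug, the `- LP: #...` references can be matched back to their `* ` entries. A small Python sketch, assuming the layout shown above (entry title, optional wrapped continuation line, then the LP reference); `entries_for_bug` is a hypothetical name and the sample is a short excerpt of the changelog.

```python
import re

LP_REF = re.compile(r"LP: #(\d+)")


def entries_for_bug(changelog, bug_number):
    """Return titles of '* ' changelog entries that cite the given LP bug."""
    entries = []      # each item: {"title": str, "bugs": set of int}
    current = None
    for raw in changelog.splitlines():
        line = raw.strip()
        if line.startswith("* "):
            current = {"title": line[2:], "bugs": set()}
            entries.append(current)
        elif line.startswith("["):
            current = None  # author/section header closes the previous entry
        elif current is not None and line.startswith("- "):
            current["bugs"].update(int(n) for n in LP_REF.findall(line))
        elif current is not None and line and not current["bugs"]:
            current["title"] += " " + line  # wrapped title continuation
    return [e["title"] for e in entries if bug_number in e["bugs"]]


changelog_sample = """[ Upstream Kernel Changes ]
* Drivers: hv: vmbus: Add support for VMBus panic notifier handler
- LP: #1463584
* KVM: nVMX: fix lifetime issues for vmcs02
- LP: #1448269
* KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
- LP: #1448269
* mm/slab_common: support the slub_debug boot option on specific object
size
- LP: #1456952
"""
for title in entries_for_bug(changelog_sample, 1448269):
    print(title)
```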
| Changed in linux (Ubuntu Utopic): | |
| status: | Fix Committed → Fix Released |
| Phil Regnauld (regnauld-f) wrote : | #16 |
I'm seeing a similar issue with trying to install the latest 16.04.1 i386 edition (http://
qemu-kvm hangs, have to virsh destroy/kill the KVM process.
cat /etc/modprobe.
options kvm_intel nested=1
... and kvm-ok says acceleration is enabled.
Should I open a new bug report for this?
| Serge Hallyn (serge-hallyn) wrote : | #17 |
Please open a new bug - thanks.


This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1448269
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.