nested KVM fails on intel hardware - KVM: entry failed, hardware error 0x0

Bug #1329434 reported by Chris J Arges
This bug affects 3 people
Affects          Status         Importance   Assigned to     Milestone
linux (Ubuntu)   Fix Released   Medium       Unassigned
Trusty           Fix Released   Medium       Chris J Arges

Bug Description

[Impact]
Nested KVM doesn't work on some Intel hosts: the L2 guest fails to boot and QEMU logs "KVM: entry failed, hardware error 0x0".

[Test Case]
A script to make this easier is posted here:
https://gist.github.com/arges/9d21c6da03a8c10d3980

1) enable nested KVM:
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1
cat /sys/module/kvm_intel/parameters/nested
# should say Y
2) create an L1 guest, then create an L2 guest inside the L1 guest
- ensure L1 has enough memory to boot L2
- if using libvirt, you may need to edit the default bridge so that it uses a different subnet than the L1 guest
3) boot the L2 guest
4) L2 guest should boot
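
Step 1 can be wrapped in a small helper so the check is repeatable. This is only a sketch, separate from the gist linked above; the function names nested_enabled and reload_nested are made up for illustration, and the only path and module options used are the ones from step 1.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from the gist.

# Reload kvm_intel with nesting enabled (same commands as step 1).
# Needs root and fails if any guest is currently using the module.
reload_nested() {
    sudo modprobe -r kvm_intel &&
    sudo modprobe kvm_intel nested=1
}

# Return 0 if the given parameter file reports "Y", 1 if it reports
# anything else, and 2 if the file cannot be read (module not loaded).
nested_enabled() {
    [ -r "$1" ] || return 2
    case "$(cat "$1")" in
        Y) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the L0 host this would be called as: nested_enabled /sys/module/kvm_intel/parameters/nested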

[Fix]

These three upstream patches needed to be backported to 3.13:

* 533558bcb69ef28aff81b6ae9acda8943575319f
  - This provides the code changes needed to make backporting easier. However, the vmx_leave_nested function had not yet been added in 3.13, so the modification to that function was dropped.

* b6b8a1451fc40412c57d10c94b62e22acab28f94
  - This patch is necessary to ensure that the L1 guest doesn't crash with just 696dfd95 applied. I had to remove the MPX mentions from the cherry-pick, as that feature hasn't been added yet.

* 696dfd95ba9838327a7013e5988ff3ba60dcc8c8
  - This patch fixes the issue and was the result of the bisection. The APIC virtualization features need to be disabled as they cause L2 guests to not boot depending on the CPU.
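
For reference, the backport order above could be scripted roughly as follows. This is a sketch, not the exact procedure used for the SRU: the backport function name and the RUN switch are made up for illustration, and each cherry-pick still stops on conflicts that have to be resolved by hand as described in the notes.

```shell
#!/bin/sh
# Upstream commits from the [Fix] list, in the order they are applied.
COMMITS="533558bcb69ef28aff81b6ae9acda8943575319f
b6b8a1451fc40412c57d10c94b62e22acab28f94
696dfd95ba9838327a7013e5988ff3ba60dcc8c8"

# Hypothetical helper: prints the cherry-pick commands by default (dry run).
# With RUN=1 it executes them in the current git checkout and stops at the
# first pick that does not apply cleanly.
backport() {
    for c in $COMMITS; do
        if [ "${RUN:-0}" = 1 ]; then
            # -x records the upstream commit id in the new commit message
            git cherry-pick -x "$c" || return 1
        else
            echo "git cherry-pick -x $c"
        fi
    done
}
```

Running backport with no environment set only prints the three commands, which makes it easy to review the order before touching the tree.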

--

If the L2 guest doesn't boot, check its log inside the L1 guest:
sudo cat /var/log/libvirt/qemu/nested-L2.log
<snip>
KVM: entry failed, hardware error 0x0
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000663
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 00000000 0000ffff
IDT= 00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 66 89 d8 66 e8 02 f7 ff ff 66 83 c4 0c 66 5b 66 5e 66 c3 <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
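
To check a log for this signature without reading the whole register dump, something like the following should be enough. It is a sketch: the function name l2_entry_failed is made up for illustration, and the match string is simply the first line of the failure shown above.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given QEMU log contains the
# "KVM: entry failed, hardware error" line from this report, non-zero otherwise.
l2_entry_failed() {
    grep -q 'KVM: entry failed, hardware error' "$1"
}
```

The libvirt log is usually readable only by root, so in practice the function would be run from a root shell against /var/log/libvirt/qemu/nested-L2.log.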
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jun 13 18:26 seq
 crw-rw---- 1 root audio 116, 33 Jun 13 18:26 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: Intel Corporation S2600WTT
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=b5d11a1d-c48f-4abf-a622-d1efe52fe97c ro
ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127.2
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.13.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy libvirtd netdev plugdev sudo video
_MarkForUpload: True
dmi.bios.date: 05/06/2014
dmi.bios.vendor: Intel Corporation
dmi.bios.version: GRNDSDP1.86B.0030.R03.1405061547
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: S2600WTT
dmi.board.vendor: Intel Corporation
dmi.board.version: H30334-201
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: ..................
dmi.modalias: dmi:bvnIntelCorporation:bvrGRNDSDP1.86B.0030.R03.1405061547:bd05/06/2014:svnIntelCorporation:pnS2600WTT:pvr....................:rvnIntelCorporation:rnS2600WTT:rvrH30334-201:cvn...............................:ct23:cvr..................:
dmi.product.name: S2600WTT
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

Chris J Arges (arges)
affects: ubuntu → linux (Ubuntu)
Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
status: New → In Progress
importance: Undecided → Medium
Chris J Arges (arges) wrote :

Another report on different hardware:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1278531

I've been able to test with a mainline 3.16 kernel (dfb945473ae8528fd885607b6fa843c676745e0c)
and it worked fine. Time to bisect...

Chris J Arges (arges) wrote :

Works on v3.15, so this is fixed in Utopic.

Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
Chris J Arges (arges) wrote :

v3.15rc7 fails
v3.15rc8 works

Fix is somewhere in there...

Chris J Arges (arges) wrote :

This commit fixes the issue:
696dfd95ba9838327a7013e5988ff3ba60dcc8c8

Chris J Arges (arges) wrote : BootDmesg.txt

apport information

tags: added: apport-collected uec-images
description: updated
Chris J Arges (arges) wrote : CRDA.txt

apport information

Chris J Arges (arges) wrote : CurrentDmesg.txt

apport information

Chris J Arges (arges) wrote : Lspci.txt

apport information

Chris J Arges (arges) wrote : Lsusb.txt

apport information

Chris J Arges (arges) wrote : ProcCpuinfo.txt

apport information

Chris J Arges (arges) wrote : ProcInterrupts.txt

apport information

Chris J Arges (arges) wrote : ProcModules.txt

apport information

Chris J Arges (arges) wrote : UdevDb.txt

apport information

Chris J Arges (arges) wrote : UdevLog.txt

apport information

Chris J Arges (arges) wrote :

Attached info from an affected machine.

Chris J Arges (arges) wrote :

So I've been able to get this working with the following patches (with notes on how I resolved the conflicts):
533558bcb69ef28aff81b6ae9acda8943575319f (remove vmx_leave_nested)
b6b8a1451fc40412c57d10c94b62e22acab28f94 (remove vmx_mpx_supported)
f4124500c2c13eb1208c6143b3f6d469709dea10 (remove VMX_MISC_ACTIVITY_HLT)
696dfd95ba9838327a7013e5988ff3ba60dcc8c8

madbiologist (me-again) wrote :

Glad to hear that a fix is in the pipeline.

I don't know anything about KVM, but I saw this the other day:

http://lkml.iu.edu/hypermail/linux/kernel/1408.0/00981.html

Chris J Arges (arges) wrote :

OK, here is a simplified patchset. It has worked for me with limited testing:
http://people.canonical.com/~arges/lp1329434v2/

Chris J Arges (arges)
description: updated
Chris J Arges (arges)
description: updated
mage2 (t-w-otto) wrote :

I have been testing the initial kernel, and so far it is looking good.
I'm going to install the new version shortly and will circle back.

mage2 (t-w-otto) wrote :

Running the newer patch. Looks good so far.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Kiran Koushik Agrahara (kkoushik) wrote :

Works for me; tested on devstack.

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
mage2 (t-w-otto) wrote :

I tested and I am currently using the latest patch. It works for me.

Chris J Arges (arges)
tags: added: verification-done-trusty
removed: verification-needed-trusty
Dave Chiluk (chiluk)
tags: added: ua
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-36.63

---------------
linux (3.13.0-36.63) trusty; urgency=low

  [ Joseph Salisbury ]

  * Release Tracking Bug
    - LP: #1365052

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
    modify write.
    - LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
    assembly code
    - LP: #1358949

  [ Ming Lei ]

  * SAUCE: (no-up) Drop APM X-Gene SoC Ethernet driver
    - LP: #1360140
  * [Config] Drop XGENE entries
    - LP: #1360140
  * [Config] CONFIG_NET_XGENE=m for arm64
    - LP: #1360140

  [ Stefan Bader ]

  * SAUCE: Add compat macro for skb_get_hash
    - LP: #1358162
  * SAUCE: bcache: prevent crash on changing writeback_running
    - LP: #1357295

  [ Suman Tripathi ]

  * SAUCE: (no-up) arm64: Fix the csr-mask for APM X-Gene SoC AHCI SATA PHY
    clock DTS node.
    - LP: #1359489
  * SAUCE: (no-up) ahci_xgene: Skip the PHY and clock initialization if
    already configured by the firmware.
    - LP: #1359501
  * SAUCE: (no-up) ahci_xgene: Fix the link down in first attempt for the
    APM X-Gene SoC AHCI SATA host controller driver.
    - LP: #1359507

  [ Tuan Phan ]

  * SAUCE: (no-up) pci-xgene-msi: fixed deadlock in irq_set_affinity
    - LP: #1359514

  [ Upstream Kernel Changes ]

  * iwlwifi: mvm: Add a missed beacons threshold
    - LP: #1349572
  * mac80211: reset probe_send_count also in HW_CONNECTION_MONITOR case
    - LP: #1349572
  * genirq: Add an accessor for IRQ_PER_CPU flag
    - LP: #1357527
  * arm64: perf: add support for percpu pmu interrupt
    - LP: #1357527
  * cifs: sanity check length of data to send before sending
    - LP: #1283101
  * KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
    - LP: #1329434
  * KVM: nVMX: Rework interception of IRQs and NMIs
    - LP: #1329434
  * KVM: vmx: disable APIC virtualization in nested guests
    - LP: #1329434
  * HID: Add transport-driver functions to the USB HID interface.
    - LP: #1353021
  * ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host
    Controller driver.
    - LP: #1358498
  * fold d_kill() and d_free()
    - LP: #1354234
  * fold try_prune_one_dentry()
    - LP: #1354234
  * new helper: dentry_free()
    - LP: #1354234
  * expand the call of dentry_lru_del() in dentry_kill()
    - LP: #1354234
  * dentry_kill(): don't try to remove from shrink list
    - LP: #1354234
  * don't remove from shrink list in select_collect()
    - LP: #1354234
  * more graceful recovery in umount_collect()
    - LP: #1354234
  * dcache: don't need rcu in shrink_dentry_list()
    - LP: #1354234
  * lift the "already marked killed" case into shrink_dentry_list()
  * split dentry_kill()
    - LP: #1354234
  * expand dentry_kill(dentry, 0) in shrink_dentry_list()
    - LP: #1354234
  * shrink_dentry_list(): take parent's ->d_lock earlier
    - LP: #1354234
  * dealing with the rest of shrink_dentry_list() livelock
    - LP: #1354234
  * dentry_kill() doesn't need the second argument now
    - LP: #1354234
  * dcache: add missing lockdep annotation
    - LP: #1354234
  * fs: convert use of typedef ctl_table to struct ctl_table
 ...


Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released