Nested kvm guest fails to start on a emulated Westmere CPU guest under a Broadwell CPU host

Bug #1665389 reported by Nadav Goldin
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

Using latest master(5dae13), qemu fails to start any nested guest in a Westmere emulated guest(layer 1), under a Broadwell host(layer 0), with the error:

qemu-custom: /root/qemu/target/i386/kvm.c:1849: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

The qemu command used(though other CPUs didn't work either):
/usr/bin/qemu-custom -name guest=12ed9230-vm-el73,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-12ed9230-vm-el73/master-key.aes -machine pc-i440fx-2.9,accel=kvm,usb=off -cpu Westmere,+vmx -m 512 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -object iothread,id=iothread1 -uuid f4ce4eba-985f-42a3-94c4-6e4a8a530347 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-5-12ed9230-vm-el73/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=off,strict=on -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/root/lago/.lago/default/images/vm-el73_root.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,serial=1,discard=unmap -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:c0:a7:c8:02,bus=pci.0,addr=0x2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-5-12ed9230-vm-el73/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -msg timestamp=on
2017-02-16T15:14:45.840412Z qemu-custom: -chardev pty,id=charserial0: char device redirected to /dev/pts/2 (label charserial0)
qemu-custom: /root/qemu/target/i386/kvm.c:1849: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

The CPU flags in the Westmere guest:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc rep_good nopl pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm arat tpr_shadow vnmi flexpriority ept vpid

The guest kernel is 3.10.0-514.2.2.el7.x86_64.

The CPU flags of the host(Broadwell):
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp

qemu command on the host - Broadwell(which works):
/usr/bin/qemu-kvm -name 4ffcd448-vm-el73,debug-threads=on -S -machine pc-i440fx-2.6,accel=kvm,usb=off -cpu Westmere,+x2apic,+vmx,+vme -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -object iothread,id=iothread1 -uuid 8cc0a2cf-d25a-4014-acdb-f159c376a532 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-4-4ffcd448-vm-el73/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=off,strict=on -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/home/ngoldin/src/nvgoldin.github.com/lago-init-files/.lago/flags-tests/default/images/vm-el73_root.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,serial=1,discard=unmap -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/home/ngoldin/src/nvgoldin.github.com/lago-init-files/.lago/flags-tests/default/images/vm-el73_additonal.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,serial=2,discard=unmap -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:c0:a8:c9:02,bus=pci.0,addr=0x2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-4-4ffcd448-vm-el73/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -msg timestamp=on

On the Broadwell host I'm using a distribution package if it matters(qemu-kvm-2.6.2-5.fc24.x86_64 and 4.8.15-200.fc24.x86_64)

As the error indicates, I think this assertion was put in:
commit 48e1a45c3166d659f781171a47dabf4a187ed7a5
Author: Paolo Bonzini <email address hidden>
Date: Wed Mar 30 22:55:29 2016 +0200

    target-i386: assert that KVM_GET/SET_MSRS can set all requested MSRs

    This would have caught the bug in the previous patch.

    Signed-off-by: Paolo Bonzini <email address hidden>

I tried going back one commit before to 273c515, and then the error is gone and the nested guest comes up as expected. If I try to run with head at the above commit(48e145c) the error output is slightly different, though it looks the same:
/root/qemu/target-i386/kvm.c:1713: kvm_put_msrs: Assertion `ret == n' failed.

Tags: kvm
Revision history for this message
Dr. David Alan Gilbert (dgilbert-h) wrote :

Hi Nadav,
  Can you clarify what the host and L1 kernels are please?

This error means that qemu tried to write some msrs but one of the msr writes failed; we need to figure out which one to understand what's going on.

1) Edit kvm_msr_entry_add in target/i386/kvm.c to something like:

    assert((void *)(entry + 1) <= limit);
    fprintf(stderr,"kvm_msr_entry_add: @%d index=%x value=%lx\n", msrs->nmsrs, index, value);
    entry->index = index;

2) edit kvm_put_msrs near the bottom:

    fprintf(stderr,"kvm_put_msrs: ret=%d expected=%d\n", ret, cpu->kvm_msr_buf->nmsrs);
    assert(ret == cpu->kvm_msr_buf->nmsrs);

Now with any luck the 'ret' value will tell you the entry which is bad, and you can match
that to the @%d value (or maybe it's the entry before that one which failed?) then we get the index, look it up in the intel docs and figure out which MSR it's complaining about.

Also, does the problem go away if you remove the +x2apic on the top level qemu?

Dave

Revision history for this message
Nadav Goldin (ngoldin) wrote :
Download full text (4.5 KiB)

Hi Dave, thanks for looking into this.

> Can you clarify what the host and L1 kernels are please?

Host - 4.8.15-200.fc24.x86_64
Guest - 3.10.0-514.2.2.el7.x86_64

Results of adding the debug messages and running a simpler command(with master again - 5dae13):

[root@vm-el73 ~]# /usr/bin/qemu-custom -s -machine pc,accel=kvm,usb=off -cpu Westmere
kvm_msr_entry_add: @0 index=174 value=0
kvm_msr_entry_add: @1 index=175 value=0
kvm_msr_entry_add: @2 index=176 value=0
kvm_msr_entry_add: @3 index=277 value=7040600070406
kvm_msr_entry_add: @4 index=c0000081 value=0
kvm_msr_entry_add: @5 index=c0010117 value=0
kvm_msr_entry_add: @6 index=3b value=0
kvm_msr_entry_add: @7 index=1a0 value=1
kvm_msr_entry_add: @8 index=9e value=30000
kvm_msr_entry_add: @9 index=d90 value=0
kvm_msr_entry_add: @10 index=c0000083 value=0
kvm_msr_entry_add: @11 index=c0000102 value=0
kvm_msr_entry_add: @12 index=c0000084 value=0
kvm_msr_entry_add: @13 index=c0000082 value=0
kvm_msr_entry_add: @14 index=10 value=0
kvm_msr_entry_add: @15 index=12 value=0
kvm_msr_entry_add: @16 index=11 value=0
kvm_msr_entry_add: @17 index=4b564d02 value=0
kvm_msr_entry_add: @18 index=4b564d04 value=0
kvm_msr_entry_add: @19 index=4b564d03 value=0
kvm_msr_entry_add: @20 index=2ff value=0
kvm_msr_entry_add: @21 index=250 value=0
kvm_msr_entry_add: @22 index=258 value=0
kvm_msr_entry_add: @23 index=259 value=0
kvm_msr_entry_add: @24 index=268 value=0
kvm_msr_entry_add: @25 index=269 value=0
kvm_msr_entry_add: @26 index=26a value=0
kvm_msr_entry_add: @27 index=26b value=0
kvm_msr_entry_add: @28 index=26c value=0
kvm_msr_entry_add: @29 index=26d value=0
kvm_msr_entry_add: @30 index=26e value=0
kvm_msr_entry_add: @31 index=26f value=0
kvm_msr_entry_add: @32 index=200 value=0
kvm_msr_entry_add: @33 index=201 value=0
kvm_msr_entry_add: @34 index=202 value=0
kvm_msr_entry_add: @35 index=203 value=0
kvm_msr_entry_add: @36 index=204 value=0
kvm_msr_entry_add: @37 index=205 value=0
kvm_msr_entry_add: @38 index=206 value=0
kvm_msr_entry_add: @39 index=207 value=0
kvm_msr_entry_add: @40 index=208 value=0
kvm_msr_entry_add: @41 index=209 value=0
kvm_msr_entry_add: @42 index=20a value=0
kvm_msr_entry_add: @43 index=20b value=0
kvm_msr_entry_add: @44 index=20c value=0
kvm_msr_entry_add: @45 index=20d value=0
kvm_msr_entry_add: @46 index=20e value=0
kvm_msr_entry_add: @47 index=20f value=0
kvm_msr_entry_add: @48 index=17a value=0
kvm_msr_entry_add: @49 index=17b value=ffffffffffffffff
kvm_msr_entry_add: @50 index=400 value=ffffffffffffffff
kvm_msr_entry_add: @51 index=401 value=0
kvm_msr_entry_add: @52 index=402 value=0
kvm_msr_entry_add: @53 index=403 value=0
kvm_msr_entry_add: @54 index=404 value=ffffffffffffffff
kvm_msr_entry_add: @55 index=405 value=0
kvm_msr_entry_add: @56 index=406 value=0
kvm_msr_entry_add: @57 index=407 value=0
kvm_msr_entry_add: @58 index=408 value=ffffffffffffffff
kvm_msr_entry_add: @59 index=409 value=0
kvm_msr_entry_add: @60 index=40a value=0
kvm_msr_entry_add: @61 index=40b value=0
kvm_msr_entry_add: @62 index=40c value=ffffffffffffffff
kvm_msr_entry_add: @63 index=40d value=0
kvm_msr_entry_add: @64 index=40e value=0
kvm_msr_entry_add: @65 index=40f value=0
kvm_msr_entry_add: @66 in...

Read more...

Revision history for this message
Dr. David Alan Gilbert (dgilbert-h) wrote :

   kvm_msr_entry_add: @8 index=9e value=30000
   kvm_msr_entry_add: @9 index=d90 value=0
   kvm_msr_entry_add: @10 index=c0000083 value=0

   kvm_put_msrs: ret=9 expected=90

OK, 9... so that's probably the d90;
  $ ag d90:
      418:#define MSR_IA32_BNDCFGS 0x00000d90

The Intel book says 'IA32_BNDCFGS Supervisor State of MPX Configuration' which I think is pretty new, so I don't know what it's doing trying to do it on a Westmere; ccing in Eduardo and Paolo.

  > Also, does the problem go away if you remove the +x2apic on the top level qemu?

    Don't think so, I actually added it because I thought it would solve something
    I can re-try with out it if needed(with the debug messages).

Worth a go, but if it's that MPX stuff I doubt it.

Dave

Revision history for this message
Paolo Bonzini (bonzini) wrote :

Nested is currently supported only with -cpu host. Kernel 4.9 has the necessary support but QEMU doesn't.

Revision history for this message
Nadav Goldin (ngoldin) wrote :

> Nested is currently supported only with -cpu host. Kernel 4.9 has the necessary support but QEMU doesn't.

1. Just to be sure, you mean '-cpu host' in the bare metal host right?
2. Until it is officially supported in qemu, is there any easy way to work around this? as this setup seemed to have work in earlier versions(prior to v2.6.0 I think).

Revision history for this message
Paolo Bonzini (bonzini) wrote :

Yes, I mean '-cpu host' in the bare metal. There is no workaround; it worked by chance and it triggered other hard to find bugs.

Revision history for this message
Nadav Goldin (ngoldin) wrote :

> Yes, I mean '-cpu host' in the bare metal. There is no workaround; it worked by chance and it triggered other hard to find bugs.

alright, thanks for clarifying that.

Revision history for this message
Thomas Huth (th-huth) wrote :

Can you still reproduce this issue with the latest version of QEMU (v4.2)?

Changed in qemu:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.