libvirt CPU model selection missing

Bug #1861643 reported by Enoch Leung
Affects                 Status        Importance  Assigned to  Milestone
libvirt (Ubuntu)        Fix Released  Undecided   Unassigned
  Eoan                  Won't Fix     Undecided   Unassigned
qemu (Ubuntu)           Fix Released  Undecided   Unassigned
  Eoan                  Won't Fix     Undecided   Unassigned
virt-manager (Ubuntu)   Invalid       Undecided   Unassigned
  Eoan                  Invalid       Undecided   Unassigned

Bug Description

As of 5.4.0-0ubuntu5 on Ubuntu 19.10 x86_64, the list of CPU models that can be selected in virt-manager is still missing some models. Here is some info from running "virsh capabilities":

Host: Lenovo L470 (UEFI 1.71), i3-7100U ==> I cannot select any Skylake model in virt-manager because the CPU is detected as Broadwell
    <cpu>
      <arch>x86_64</arch>
      <model>Broadwell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      <microcode version='202'/>
      <topology sockets='1' cores='2' threads='2'/>
      <feature name='vme'/>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
      <feature name='pbe'/>
      <feature name='dtes64'/>
      <feature name='monitor'/>
      <feature name='ds_cpl'/>
      <feature name='vmx'/>
      <feature name='est'/>
      <feature name='tm2'/>
      <feature name='xtpr'/>
      <feature name='pdcm'/>
      <feature name='osxsave'/>
      <feature name='f16c'/>
      <feature name='rdrand'/>
      <feature name='arat'/>
      <feature name='tsc_adjust'/>
      <feature name='mpx'/>
      <feature name='clflushopt'/>
      <feature name='intel-pt'/>
      <feature name='md-clear'/>
      <feature name='stibp'/>
      <feature name='ssbd'/>
      <feature name='xsaveopt'/>
      <feature name='xsavec'/>
      <feature name='xgetbv1'/>
      <feature name='xsaves'/>
      <feature name='pdpe1gb'/>
      <feature name='abm'/>
      <feature name='invtsc'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='1048576'/>
    </cpu>

Host: ASRock B450M Pro (UEFI 3.90), Ryzen 3600 ==> no CPU model is available for selection in virt-manager, so I have to use <cpu mode='host-model' check='partial'> at the moment, which I would rather avoid.
    <cpu>
      <arch>x86_64</arch>
      <model>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <microcode version='141561875'/>
      <topology sockets='1' cores='6' threads='2'/>
      <feature name='ht'/>
      <feature name='osxsave'/>
      <feature name='cmt'/>
      <feature name='clwb'/>
      <feature name='umip'/>
      <feature name='xsaves'/>
      <feature name='mbm_total'/>
      <feature name='mbm_local'/>
      <feature name='cmp_legacy'/>
      <feature name='extapic'/>
      <feature name='ibs'/>
      <feature name='skinit'/>
      <feature name='wdt'/>
      <feature name='tce'/>
      <feature name='topoext'/>
      <feature name='perfctr_core'/>
      <feature name='perfctr_nb'/>
      <feature name='invtsc'/>
      <feature name='wbnoinvd'/>
      <feature name='amd-ssbd'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='1048576'/>
    </cpu>

Expected: any CPU model up to and including the host CPU model should be selectable. For the i3-7100U that means Skylake-Client should be available; for the Ryzen 3600, at least EPYC should be available.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Issue confirmed on Eoan; it does not occur on Bionic or Focal.
On Focal I get Broadwell, Haswell, Ivy Bridge and Nehalem types in addition to the old set (my real chip is an i7-8550U, so that seems OK for my case).

The Eoan list seems to contain only chips up to core2duo, which means it most likely filters on some feature that it considers required for all the more modern types.

I don't see the mistake in libvirt - the detection is correct and mostly based on cpu features and cpuids.

I tried to force a newer chip, which you could try as well.
To do so, either use "virsh edit" or, in virt-manager, "preferences"->"enable XML editing", and then set a new type such as the "Skylake-Client" you are looking for.
If you then try to start the guest, it will list the features that are missing.
Please report those back so we can compare them with the model definition.

Also could you attach a full:
$ virsh capabilities
$ virsh domcapabilities
$ cat /proc/cpuinfo

Changed in libvirt (Ubuntu):
status: New → Fix Released
Changed in libvirt (Ubuntu Eoan):
status: New → Incomplete
affects: libvirt (Ubuntu) → virt-manager (Ubuntu)
Changed in libvirt (Ubuntu):
status: New → Invalid
Changed in libvirt (Ubuntu Eoan):
status: New → Invalid
Enoch Leung (leun0036) wrote :

Here's the one with my L470 (xml + txt). I will provide the ones with my Ryzen later.

Andreas Hasenack (ahasenack) wrote :

Thank you for the update. Please be sure to switch the bug status back to "new" once you have attached the rest of the information.

Christian Ehrhardt  (paelzer) wrote :

Hrm,
virt-manager between Eoan and Focal doesn't really differ enough to be the reason for this.
Maybe it is a libvirt issue after all.

I need to debug what exactly virt-manager probes to make up this list ...

Christian Ehrhardt  (paelzer) wrote :

Yeah so it might come down to:
 $ virsh domcapabilities | grep "model usable='yes'"

This list matches what I can select on my system in virt-manager.
And it is tremendously shorter on Eoan than on Focal (on the same system).

Applying the same filter to your XML also confirms that Skylake is missing there, which was your initial bug report (unable to select it). Your list at least has Broadwell, Haswell, Sandy Bridge, Westmere ...
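For anyone who wants to apply the same filter to a saved XML file rather than with grep, here is a minimal Python sketch. It assumes the usual `virsh domcapabilities` layout with the models under `<cpu><mode name='custom'>`; the `SAMPLE` document and the `usable_models` helper are made up for illustration and are not part of libvirt.

```python
import xml.etree.ElementTree as ET


def usable_models(domcaps_xml):
    """Return the names of all custom-mode CPU models marked usable='yes'."""
    root = ET.fromstring(domcaps_xml)
    models = root.findall("./cpu/mode[@name='custom']/model")
    return [m.text for m in models if m.get("usable") == "yes"]


# Hand-written stand-in for real `virsh domcapabilities` output.
SAMPLE = """<domainCapabilities>
  <cpu>
    <mode name='custom' supported='yes'>
      <model usable='no'>Skylake-Client</model>
      <model usable='yes'>Broadwell-noTSX</model>
      <model usable='yes'>kvm64</model>
    </mode>
  </cpu>
</domainCapabilities>"""

print(usable_models(SAMPLE))
```

Restricting the search to the custom mode avoids picking up the `<model>` element of the host-model mode, which carries no `usable` attribute.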

Changed in libvirt (Ubuntu Eoan):
status: Invalid → Confirmed
Changed in virt-manager (Ubuntu Eoan):
status: Incomplete → Invalid
Changed in libvirt (Ubuntu):
status: Invalid → Fix Released
Changed in virt-manager (Ubuntu):
status: Fix Released → Invalid
Christian Ehrhardt  (paelzer) wrote :

Using the backports of libvirt we have in the Ubuntu Cloud Archive I was upgrading through different versions and comparing the behavior:

Note: Clearing /var/cache/libvirt/qemu/capabilities/ manually to be sure between every test.

Interestingly I used LXD containers to do my test and that seems to have an effect as well.
Whatever container protection/isolation takes place here might be related to this.

Also libvirt probes things from qemu, so the qemu version can be important as well.

#1 Bionic - as-is (libvirt 4.0 qemu 2.11) - Container -> Small List
#2 Bionic + Stein (libvirt 5.0 qemu 2.11) - Container -> Small List
#3 Bionic + Train (libvirt 5.4 qemu 2.11) - Container -> Small List
#4 Bionic + Focal-Binaries (libvirt 6.0 qemu 4.2) - Container -> Small List
#5 Bionic - as-is (libvirt 4.0 qemu 2.11) - on Host -> Big List
#6 Focal (libvirt 6.0 qemu 4.2) - Container -> Big List

What .... "§$%&/(!
Two mysteries to solve:
- What might make #4 differ from #6?
- What might make #1 differ from #5?

Small List:
      <model usable='yes'>qemu32</model>
      <model usable='yes'>pentium3</model>
      <model usable='yes'>pentium2</model>
      <model usable='yes'>pentium</model>
      <model usable='yes'>n270</model>
      <model usable='yes'>kvm64</model>
      <model usable='yes'>kvm32</model>
      <model usable='yes'>coreduo</model>
      <model usable='yes'>core2duo</model>
      <model usable='yes'>athlon</model>
      <model usable='yes'>Westmere</model>
      <model usable='yes'>Penryn</model>
      <model usable='yes'>Opteron_G2</model>
      <model usable='yes'>Opteron_G1</model>
      <model usable='yes'>Nehalem</model>
      <model usable='yes'>Conroe</model>
      <model usable='yes'>486</model>

Big List:
      <model usable='yes'>qemu32</model>
      <model usable='yes'>pentium3</model>
      <model usable='yes'>pentium2</model>
      <model usable='yes'>pentium</model>
      <model usable='yes'>n270</model>
      <model usable='yes'>kvm64</model>
      <model usable='yes'>kvm32</model>
      <model usable='yes'>coreduo</model>
      <model usable='yes'>core2duo</model>
      <model usable='yes'>Westmere</model>
      <model usable='yes'>Westmere-IBRS</model>
      <model usable='yes'>SandyBridge</model>
      <model usable='yes'>SandyBridge-IBRS</model>
      <model usable='yes'>Penryn</model>
      <model usable='yes'>Opteron_G1</model>
      <model usable='yes'>Nehalem</model>
      <model usable='yes'>Nehalem-IBRS</model>
      <model usable='yes'>IvyBridge</model>
      <model usable='yes'>IvyBridge-IBRS</model>
      <model usable='yes'>Haswell-noTSX</model>
      <model usable='yes'>Haswell-noTSX-IBRS</model>
      <model usable='yes'>Conroe</model>
      <model usable='yes'>Broadwell-noTSX</model>
      <model usable='yes'>Broadwell-noTSX-IBRS</model>
      <model usable='yes'>486</model>
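To see exactly where the two lists differ, a small set comparison helps. This is only a sketch: the model names are copied by hand from the Small List and Big List above, and the variable names are arbitrary.

```python
# Model names copied from the two lists in this comment.
SMALL = """qemu32 pentium3 pentium2 pentium n270 kvm64 kvm32 coreduo core2duo
athlon Westmere Penryn Opteron_G2 Opteron_G1 Nehalem Conroe 486""".split()

BIG = """qemu32 pentium3 pentium2 pentium n270 kvm64 kvm32 coreduo core2duo
Westmere Westmere-IBRS SandyBridge SandyBridge-IBRS Penryn Opteron_G1
Nehalem Nehalem-IBRS IvyBridge IvyBridge-IBRS Haswell-noTSX
Haswell-noTSX-IBRS Conroe Broadwell-noTSX Broadwell-noTSX-IBRS 486""".split()

# Set differences in both directions.
only_big = sorted(set(BIG) - set(SMALL))
only_small = sorted(set(SMALL) - set(BIG))

print("only in Big List:  ", only_big)
print("only in Small List:", only_small)
```

On these two lists the Big List adds the -IBRS variants and the SandyBridge, IvyBridge, Haswell-noTSX and Broadwell-noTSX families, while athlon and Opteron_G2 appear only in the Small List.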

Christian Ehrhardt  (paelzer) wrote :

Back on this for a bit.
After some abstraction levels the probing is in libvirt around:

virQEMUDriverGetDomainCapabilities
-> virQEMUCapsFillDomainCaps
 ...
 -> virQEMUCapsGetCPUModels
 ...
   -> virQEMUCapsGetAccel(qemuCaps, type)->cpuModels
That is the cpuModels member of the kvm accel struct, i.e. the one that represents qemu-kvm.

The struct contains mostly what we see in domcaps:
struct _qemuMonitorCPUDefInfo {
    virDomainCapsCPUUsable usable;
    char *name;
    char *type;
    char **blockers; /* NULL-terminated string list */
};

This is filled earlier by:
virQEMUCapsFetchCPUModels
-> virQEMUCapsFetchCPUDefinitions
  -> qemuMonitorGetCPUDefinitions - pokes qemu monitor to get info
-> virQEMUCapsCPUDefsToModels - adds probed models to structure

And that uses the common QMP query-cpu-definitions
So libvirt reports what qemu tells it ... oh I get a feeling I actually know this case ...

Christian Ehrhardt  (paelzer) wrote :

OK, it is different: when spawning my Focal guest I used my usual LXD profile (which can run KVM), but the Bionic guest for this test was created from scratch and didn't have that. So I actually checked a Bionic system that was not KVM-capable.

I have to retest the list in comment #6

When doing so we can almost right away go for qemu probing:
$ sudo qemu-system-x86_64 --enable-kvm --nographic --nodefaults -S -qmp-pretty stdio
{"execute":"qmp_capabilities"}
{"execute":"query-cpu-definitions"}

That lists all known CPU models and, for each one that is not usable, the reason, e.g. a missing CPU feature.
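If the QMP output is saved to a file, the unusable models can be pulled out with a few lines of Python. This is a rough sketch under two assumptions: the saved text contains only QEMU's replies (greeting, the empty reply to qmp_capabilities, then the model list), and the model list is the first reply whose "return" value is a list. The `SAMPLE` text, the "Example-CPU" entry and both helper names are invented for illustration.

```python
import json


def iter_json_objects(stream):
    """Yield each JSON object from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(stream):
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < len(stream) and stream[pos].isspace():
            pos += 1
        if pos == len(stream):
            break
        obj, pos = decoder.raw_decode(stream, pos)
        yield obj


def unusable_models(stream):
    """Map model name -> unavailable features, for models that have any."""
    for obj in iter_json_objects(stream):
        ret = obj.get("return")
        if isinstance(ret, list):  # the query-cpu-definitions reply
            return {
                entry["name"]: entry["unavailable-features"]
                for entry in ret
                if entry.get("unavailable-features")
            }
    return {}


# Hand-written stand-in for a saved QMP session; "Example-CPU" is hypothetical.
SAMPLE = """
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 4}}, "capabilities": ["oob"]}}
{"return": {}}
{"return": [
  {"name": "host", "typename": "host-x86_64-cpu", "unavailable-features": [],
   "static": false, "migration-safe": false},
  {"name": "Example-CPU", "typename": "Example-CPU-x86_64-cpu",
   "unavailable-features": ["example-feature"], "static": false, "migration-safe": true}
]}
"""

print(unusable_models(SAMPLE))
```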

Christian Ehrhardt  (paelzer) wrote :

Recreated the list with a better container :-/

#1 Bionic - as-is (libvirt 4.0 qemu 2.11) - Container -> Big List
#2 Bionic + Stein (libvirt 5.0 qemu 2.11) - Container ->
#3 Bionic + Train (libvirt 5.4 qemu 2.11) - Container ->
#4 Bionic + Focal-Binaries (libvirt 6.0 qemu 4.2) - Container ->
#5 Bionic - as-is (libvirt 4.0 qemu 2.11) - on Host -> Big List
#6 Focal (libvirt 6.0 qemu 4.2) - Container -> Big List

All confusion gone!
OK, that was a lot of checks to realize a simple truth.
The Eoan container in which I initially thought I saw your problem had no KVM either :-/

But that might still help us as now I know rather exactly what to ask for.
Maybe your system doesn't detect KVM or some cpu flags correctly.

1. please provide the output of
 $ kvm-ok
2. please provide the output of
  $ sudo qemu-system-x86_64 --enable-kvm --nographic --nodefaults -S -qmp-pretty stdio
  {"execute":"qmp_capabilities"}
  {"execute":"query-cpu-definitions"}

On my system I can't select Skylake either, and the report tells me:
        {
            "name": "Skylake-Client-IBRS",
            "typename": "Skylake-Client-IBRS-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },

Which is a known limitation due to https://people.canonical.com/~ubuntu-security/cve/2019/CVE-2019-11135.html
As discussed in https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1853200 this is a Won't Fix at the moment. But we could bump that bug and report this symptom there, to reconsider adding e.g. a reduced Skylake type.
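To spot which models are affected by this limitation only, one can check whether a model's unavailable features are all in the hle/rtm pair. The following is a sketch, not anything libvirt or QEMU does internally; the first sample entry mirrors the Skylake-Client-IBRS reply above, and the two "Example-Model" entries are hypothetical.

```python
TSX_FEATURES = {"hle", "rtm"}


def blocked_only_by_tsx(unavailable_by_model):
    """Return model names whose blockers are a non-empty subset of hle/rtm."""
    return sorted(
        name
        for name, missing in unavailable_by_model.items()
        if missing and set(missing) <= TSX_FEATURES
    )


sample = {
    "Skylake-Client-IBRS": ["hle", "rtm"],   # as in the reply above
    "Example-Model-A": ["hle", "avx512f"],   # hypothetical: blocked by more than TSX
    "Example-Model-B": [],                   # hypothetical: fully usable
}

print(blocked_only_by_tsx(sample))
```

Models returned by this check are the ones for which a variant without hle/rtm could become selectable.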

Waiting for your data first ...

Changed in qemu (Ubuntu Eoan):
status: New → Confirmed
Christian Ehrhardt  (paelzer) wrote :

I'll keep this bug to your specific case, while I'll bump bug 1853200 to add -noTSX cpu maps to libvirt as needed going forward.

Christian Ehrhardt  (paelzer) wrote :

This needs the info I asked for in comment #8 to go forward.
Marking tasks incomplete for now.

Changed in qemu (Ubuntu):
status: New → Fix Released
Changed in qemu (Ubuntu Eoan):
status: Confirmed → Incomplete
Changed in libvirt (Ubuntu Eoan):
status: Confirmed → Incomplete
Enoch Leung (leun0036) wrote :

Here's the other machine's info as requested with AMD CPU.

Christian Ehrhardt  (paelzer) wrote :

Wow the amd case even has all the models as usable='no'.
I am still missing the QMP probe output that matches the XML output.

Please again see comment #8 and provide that as hopefully that will uncover why so many are "unusable" for you.

Christian Ehrhardt  (paelzer) wrote :

Thanks Rafael - from another Eoan AMD system:
<capabilities>

  <host>
...
    <cpu>
      <arch>x86_64</arch>
      <model>Opteron_G5</model>
...

<domainCapabilities>
...

    <mode name='custom' supported='yes'>
      <model usable='no'>qemu64</model>
      <model usable='yes'>qemu32</model>
      <model usable='no'>phenom</model>
      <model usable='yes'>pentium3</model>
      <model usable='yes'>pentium2</model>
      <model usable='yes'>pentium</model>
      <model usable='no'>n270</model>
      <model usable='yes'>kvm64</model>
      <model usable='yes'>kvm32</model>
      <model usable='no'>coreduo</model>
      <model usable='no'>core2duo</model>
      <model usable='no'>athlon</model>
      <model usable='no'>Westmere-IBRS</model>
      <model usable='yes'>Westmere</model>
      <model usable='no'>Skylake-Server-IBRS</model>
      <model usable='no'>Skylake-Server</model>
      <model usable='no'>Skylake-Client-IBRS</model>
      <model usable='no'>Skylake-Client</model>
      <model usable='no'>SandyBridge-IBRS</model>
      <model usable='no'>SandyBridge</model>
      <model usable='yes'>Penryn</model>
      <model usable='yes'>Opteron_G5</model>
      <model usable='yes'>Opteron_G4</model>
      <model usable='yes'>Opteron_G3</model>
      <model usable='yes'>Opteron_G2</model>
      <model usable='yes'>Opteron_G1</model>
      <model usable='no'>Nehalem-IBRS</model>
      <model usable='yes'>Nehalem</model>
      <model usable='no'>IvyBridge-IBRS</model>
      <model usable='no'>IvyBridge</model>
      <model usable='no'>Icelake-Server</model>
      <model usable='no'>Icelake-Client</model>
      <model usable='no'>Haswell-noTSX-IBRS</model>
      <model usable='no'>Haswell-noTSX</model>
      <model usable='no'>Haswell-IBRS</model>
      <model usable='no'>Haswell</model>
      <model usable='no'>EPYC-IBPB</model>
      <model usable='no'>EPYC</model>
      <model usable='yes'>Conroe</model>
      <model usable='no'>Cascadelake-Server</model>
      <model usable='no'>Broadwell-noTSX-IBRS</model>
      <model usable='no'>Broadwell-noTSX</model>
      <model usable='no'>Broadwell-IBRS</model>
      <model usable='no'>Broadwell</model>
      <model usable='yes'>486</model>
    </mode>

And the matching probe reflects just that:

{
    "QMP": {
        "version": {
            "qemu": {
                "micro": 0,
                "minor": 0,
                "major": 4
            },
            "package": "Debian 1:4.0+dfsg-0ubuntu9.4"
        },
        "capabilities": [
            "oob"
        ]
    }
}
{
    "return": {
    }
}
{
    "return": [
        {
            "name": "max",
            "typename": "max-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": false
        },
        {
            "name": "host",
            "typename": "host-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": false
        },
        {
            "name": "base",
            "typename": "base-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": true,
          ...

Christian Ehrhardt  (paelzer) wrote :

@Enoch - I'd really like to see your probing result.
Actually the instructions in comment #9 are better, as they also include kvm-ok.
Please run those and report back here.

Enoch Leung (leun0036) wrote :

Sorry for the delay.

For both of my machines, kvm-ok =>
============================
INFO: /dev/kvm exists
KVM acceleration can be used
============================

For the QMP output, please see the attachments.

Christian Ehrhardt  (paelzer) wrote :

Thanks Enoch, that is indeed bug 1853200 then, which I fixed in Focal.

The Security Team is considering backporting the new CPU types that can work without hle/rtm, but I've seen nothing move yet in that regard.

Marking as a dup.

Brian Murray (brian-murray) wrote :

The Eoan Ermine has reached end of life, so this bug will not be fixed for that release.

Changed in libvirt (Ubuntu Eoan):
status: Incomplete → Won't Fix
Changed in qemu (Ubuntu Eoan):
status: Incomplete → Won't Fix