qemu-system-x86 missing ssbd flag in UCA Ocata

Bug #1846501 reported by Chris S
This bug affects 2 people
Affects               Status        Importance  Assigned to  Milestone
Ubuntu Cloud Archive  New           Undecided   Unassigned
qemu (Ubuntu)         Fix Released  Undecided   Unassigned

Bug Description

Specific package version qemu-system-x86==1:2.8+dfsg-3ubuntu2.9~cloud6

Upgrading to Ocata from Newton on Xenial results in a significant upgrade to libvirt (1.3.1 -> 2.5) and qemu (2.5 -> 2.8)

Instances created in Newton have access to the SSBD CPU flag in QEMU. After the upgrade, however, this feature is reported as missing, so live migrations are not possible.

The libvirt 2.5 package depends on QEMU >= 2.8

The SSBD patch looks to have been applied upstream, but does not seem to have made it in here:

https://launchpad.net/debian/+source/qemu/1:2.8+dfsg-6+deb9u5

Upgraded host output
====

virsh version
Compiled against library: libvirt 2.5.0
Using library: libvirt 2.5.0
Using API: QEMU 2.5.0
Running hypervisor: QEMU 2.8.0

qemu-system-x86_64 -cpu help | tail -20
Recognized CPUID flags:
  fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 pn clflush ds acpi mmx fxsr sse sse2 ss ht tm ia64 pbe
  pni pclmulqdq dtes64 monitor ds-cpl vmx smx est tm2 ssse3 cid fma cx16 xtpr pdcm pcid dca sse4.1 sse4.2 x2apic movbe popcnt tsc-deadline aes xsave osxsave avx f16c rdrand hypervisor
  fsgsbase tsc-adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512ifma pcommit clflushopt clwb avx512pf avx512er avx512cd avx512bw avx512vl
  avx512vbmi umip pku ospke rdpid
  avx512-4vnniw avx512-4fmaps md-clear spec-ctrl
  syscall nx mmxext fxsr-opt pdpe1gb rdtscp lm 3dnowext 3dnow
  lahf-lm cmp-legacy svm extapic cr8legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid-msr tbm topoext perfctr-core perfctr-nb
  invtsc
  ibpb
  xstore xstore-en xcrypt xcrypt-en ace2 ace2-en phe phe-en pmm pmm-en
  kvmclock kvm-nopiodelay kvm-mmu kvmclock kvm-asyncpf kvm-steal-time kvm-pv-eoi kvm-pv-unhalt kvmclock-stable-bit

  npt lbrv svm-lock nrip-save tsc-scale vmcb-clean flushbyasid decodeassists pause-filter pfthreshold
  xsaveopt xsavec xgetbv1 xsaves
  arat

Pre upgrade version:
====

virsh version
Compiled against library: libvirt 1.3.1
Using library: libvirt 1.3.1
Using API: QEMU 1.3.1
Running hypervisor: QEMU 2.5.0

qemu-system-x86_64 -cpu help | tail -20
x86 Opteron_G3 AMD Opteron 23xx (Gen 3 Class Opteron)
x86 Opteron_G4 AMD Opteron 62xx class CPU
x86 Opteron_G5 AMD Opteron 63xx class CPU
x86 host KVM processor with all supported host features (only available in KVM mode)

Recognized CPUID flags:
  fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 pn clflush ds acpi mmx fxsr sse sse2 ss ht tm ia64 pbe
  pni|sse3 pclmulqdq|pclmuldq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cid fma cx16 xtpr pdcm pcid dca sse4.1|sse4_1 sse4.2|sse4_2 x2apic movbe popcnt tsc-deadline aes xsave osxsave avx f16c rdrand hypervisor
  fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f rdseed adx smap pcommit clflushopt clwb avx512pf avx512er avx512cd

  md-clear spec-ctrl ssbd
  syscall nx|xd mmxext fxsr_opt|ffxsr pdpe1gb rdtscp lm|i64 3dnowext 3dnow
  lahf_lm cmp_legacy svm extapic cr8legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb
  invtsc
  ibpb virt-ssbd
  xstore xstore-en xcrypt xcrypt-en ace2 ace2-en phe phe-en pmm pmm-en
  kvmclock kvm_nopiodelay kvm_mmu kvmclock kvm_asyncpf kvm_steal_time kvm_pv_eoi kvm_pv_unhalt kvmclock-stable-bit
  npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pause_filter pfthreshold
  xsaveopt xsavec xgetbv1 xsaves
  arat
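
The difference between the two flag lists above can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original report: it assumes the output of `qemu-system-x86_64 -cpu help` has been captured as text on each host, and the function names `parse_cpuid_flags` and `missing_flags` are hypothetical helpers introduced here for illustration. It keeps only the first name of each `a|b` alias group printed by QEMU 2.5 and rewrites `_` as `-` so that names line up with the QEMU 2.8 spelling. The sample strings are short excerpts of the listings above, not the full output.

```python
def parse_cpuid_flags(help_text):
    """Return the set of flag names listed after 'Recognized CPUID flags:'."""
    flags = set()
    in_flags = False
    for line in help_text.splitlines():
        if line.strip() == "Recognized CPUID flags:":
            in_flags = True
            continue
        if not in_flags:
            # Skip the CPU model list that precedes the flag section.
            continue
        for token in line.split():
            # QEMU 2.5 prints aliases as 'pni|sse3'; keep the first name only.
            name = token.split("|")[0]
            # QEMU 2.5 uses '_' where QEMU 2.8 uses '-' (e.g. ds_cpl vs ds-cpl).
            flags.add(name.replace("_", "-"))
    return flags


def missing_flags(old_help, new_help):
    """Flags present in the old output but absent from the new one."""
    return sorted(parse_cpuid_flags(old_help) - parse_cpuid_flags(new_help))


# Abbreviated excerpts of the two listings (assumption: trimmed for brevity).
OLD_HELP = """x86 host KVM processor with all supported host features

Recognized CPUID flags:
  pni|sse3 ds_cpl sse4.1|sse4_1
  md-clear spec-ctrl ssbd
  ibpb virt-ssbd
"""

NEW_HELP = """Recognized CPUID flags:
  pni ds-cpl sse4.1
  md-clear spec-ctrl
  ibpb
"""

if __name__ == "__main__":
    print(missing_flags(OLD_HELP, NEW_HELP))
```

On these excerpts the sketch reports `ssbd` and `virt-ssbd` as present before the upgrade and absent afterwards. A guest started with either flag therefore has no matching feature on the upgraded host.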

For the time being I am working around this problem by pinning libvirt and qemu at the older versions provided by the xenial repositories, as they fall within the supported versions for Ocata listed here: https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qemu (Ubuntu):
status: New → Confirmed
Rafael David Tinoco (rafaeldtinoco) wrote :

Possibly related bugs:

https://bugs.launchpad.net/ubuntu/eoan/+source/libvirt/+bug/1828495
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1841066

TODOs being worked on there:

LP: #1828495 - [KVM][CLX] CPUID_7_0_EDX_ARCH_CAPABILITIES

https://bugs.launchpad.net/ubuntu/eoan/+source/libvirt/+bug/1828495
Backport libvirt patches to Bionic
Backport libvirt patches to Disco
Review/Discuss

LP: #1841066 - ARCH_CAPABILITIES guest capability detection

https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1841066
Backport qemu patches to Bionic
Backport qemu patches to Disco
Review/Discuss

* Create a PPA, test, etc, and then create a Merge.
* Check kernel needs as well (SEG has requested)

Christian Ehrhardt  (paelzer) wrote :

Thanks, Rafael, for already adding the bugs that are related to this.

There is also (from an AMD POV):
"backport extended amd spectre mitigations":
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1840745

But this is a very special case, as the fix is explicitly requested for UCA Ocata.
The bugs above will cover the active Ubuntu releases, which at the moment are Xenial (which matches UCA Mitaka), Bionic (matching UCA Queens) and newer.
Ocata was based on Zesty, which from the base distribution's POV is no longer active.

Therefore the OpenStack Team needs to decide whether they want to pick up the changes we have made for Ocata, or encourage users to move to Queens instead (if that is an option).

For Xenial we have already decided not to backport all of these changes, as strictly speaking they are all just "optimizations" to get out of the drawbacks of Spectre, Meltdown and siblings.
The OpenStack Team might decide similarly for Ocata, but it is up to them, so I'll add a task and assign it to them.

Changed in qemu (Ubuntu):
status: Confirmed → Fix Released
Chris S (bloatyfloat) wrote :

From a user's perspective this can be solved simply by not providing updated libvirt/qemu packages in the UCA repositories - the versions provided in Xenial are sufficient to run nova, and will also get consistent updates during the lifespan of the release.

If a newer version is really desired, then ideally I would suggest it is pulled from the next LTS release up, rather than from a non-LTS release - the version requirements in OpenStack for functionality don't change that much between releases, so there's plenty of warning, and again the patching of security updates is handled by someone else.

Upgrading to Queens is not something that is immediately available to us, and would also require a double bump going via Pike, which I guess may also have its own libvirt/qemu packages, and I fear those may have the same bugs. I appreciate that the control plane can be upgraded and use "[upgrade_levels]" to maintain compatibility, and the hypervisors could potentially skip Pike (or maintain the held libvirt/qemu packages as I have done for Ocata).

My actual aim is to transition to a Kolla-based deployment, but the Ocata image using Ubuntu binaries has the newer qemu and libvirt packages, so with this issue we will be unable to migrate our hypervisors until a later release anyway.

Thanks for all the references and input here folks :)
