Alright. I had in mind that (for bionic/cosmic) but we definitely have to mention what you said (for started guests). About the migration, will cause the issue and report back after cosmic/disco backport.
About needed CPU features. I could get the following... these are the needed cpuflags to have a good mitigation status (and opt out from some heavy workarounds/mitigations inside guest):
I guess that we will have to backport this support in libvirt, in order to allow QEMU to pick specific CPU mitigation flags.
----
This is the difference from executing QEMU with passthrough versus specifying the Cascadelake Server WITHOUT picking up specific flags (as they are unsupported):
----
HOST has MD_CLEAR, CPU type not:
* VERW instruction is available: YES (MD_CLEAR feature bit)
* CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
HOST reports not being vulnerable to:
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): NO
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): NO
----
Passing the flags (ibrs_enhanced):
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (IBRS + IBPB are mitigating the vulnerability)
Not passing the flags (ibrs_enhanced):
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: disabled, RSB filling)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for firmware code only)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
----
Passing the flags (from arch_capabilities MSR (INVPCID)):
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: NO
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
Not passing the flags:
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
----
Passing the flags (from arch_capabilities MSR):
CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
Not passing the flags:
CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion)
----
Passing the flags (from arch_capabilities MSR):
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
* L1D flush enabled: NO
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
Not passing the flags:
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: Mitigation: PTE Inversion
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: N/A (the kvm_intel module is not loaded)
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
* L1D flush enabled: UNKNOWN (unrecognized mode)
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
Alright. I had in mind that (for bionic/cosmic) but we definitely have to mention what you said (for started guests). About the migration, will cause the issue and report back after cosmic/disco backport.
About needed CPU features. I could get the following... these are the needed cpuflags to have a good mitigation status (and opt out from some heavy workarounds/ mitigations inside guest):
arch_capabiliti es=on vmentry= yes
ssbd=on
md-clear=on,
wbnoinvd=off
bpb=off
virt-ssbd=off
rdctl-no=yes
ibrs-all=yes
rsba=yes
skip-l1dfl-
ssb-no=yes
As we've discussed, I backported EOAN's libvirt 5.4.0-0ubuntu2 to Bionic and checked support...
This is the difference when starting CPU as passthrough and CPU as CascadeLake:
$ diff -y one two | grep -E "(>|<)"
> pti
arch_capabilities <
arch_perfmon <
ept <
flexpriority <
ibrs_enhanced <
md_clear <
ss <
tpr_shadow <
tsc_adjust <
umip <
vmx <
vnmi <
vpid <
xsaves <
xtopology <
As you can see we have to enable the CPU features in order to fully inform guest about HW mitigations and such. Going a bit deeper:
INTEL:
arch-capabilities
invpcid
<cpu mode='custom' match='exact' check='partial'> 'allow' >Cascadelake- Server< /model> capabilities' />
<model fallback=
<feature policy='require' name='arch-
<feature policy='require' name='invpcid'/>
</cpu>
AMD ONLY:
wbnoinvd
ibpb
virt-ssbd
saw support in libvirt XML files.
COULD NOT FIND specific support for the following CPU features:
ssbd
md-clear
bpb
ibrs-all
rdctl-no
rsba
skip-l1dfl-vmentry
I guess that we will have to backport this support in libvirt, in order to allow QEMU to pick specific CPU mitigation flags.
----
This is the difference from executing QEMU with passthrough versus specifying the Cascadelake Server WITHOUT picking up specific flags (as they are unsupported):
----
HOST has MD_CLEAR, CPU type not:
* VERW instruction is available: YES (MD_CLEAR feature bit)
HOST has arch_capabilities, CPU type not:
* CPU indicates ARCH_CAPABILITIES MSR availability: YES
* ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: YES
* CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
HOST reports not being vulnerable to:
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): NO
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): NO
----
Passing the flags (ibrs_enhanced):
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (IBRS + IBPB are mitigating the vulnerability)
Not passing the flags (ibrs_enhanced):
CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: disabled, RSB filling)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for firmware code only)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
----
Passing the flags (from arch_capabilities MSR (INVPCID)):
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: NO
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
Not passing the flags:
CVE-2017-5754 aka 'Variant 3, Meltdown, rogue data cache load'
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka 'Variant 3a, rogue system register read'
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
----
Passing the flags (from arch_capabilities MSR):
CVE-2018-3615 aka 'Foreshadow (SGX), L1 terminal fault'
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
Not passing the flags:
CVE-2018-3620 aka 'Foreshadow-NG (OS), L1 terminal fault'
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion)
----
Passing the flags (from arch_capabilities MSR):
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
* L1D flush enabled: NO
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
Not passing the flags:
CVE-2018-3646 aka 'Foreshadow-NG (VMM), L1 terminal fault'
* Information from the /sys interface: Mitigation: PTE Inversion
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: N/A (the kvm_intel module is not loaded)
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
* L1D flush enabled: UNKNOWN (unrecognized mode)
* Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
* Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)