Comment 31 for bug 1829555

Christian Ehrhardt  (paelzer) wrote :

Theory given what we know so far:
- only fails if LVL1 is at 4.4
- not failing if LVL1 is at 3.13
- 4.4 might have more CPU features
- qemu 2.0 when using host-model is passing ALL features
- qemu 2.5 works, but we now know it filters some flags that 2.0 doesn't
=> one of these extra flags disturbs the guest's bug detection

Check extra flags in LVL1 between 3.13 and 4.4

3.13 -> 4.4 has in addition (Host):
> clflushopt
> kaiser
> mpx
> tsc_known_freq
> xgetbv1
> xsavec
< eagerfpu

Comparing LVL2 between case 07 and 10
< arch_capabilities
> arat
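
For anyone wanting to repeat the flag comparison, here is a minimal Python sketch. It assumes you have saved the output of /proc/cpuinfo from both systems as text; the function names are made up for illustration and are not part of any tool used in this bug. It prints added flags with ">" and removed flags with "<", matching the lists above.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of a /proc/cpuinfo dump."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        # Exact match on the key so lines such as 'vmx flags' are skipped.
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def diff_flags(old_text, new_text):
    """Return (added, removed) flag lists going from old_text to new_text."""
    old, new = cpu_flags(old_text), cpu_flags(new_text)
    return sorted(new - old), sorted(old - new)


def format_diff(added, removed):
    """Render the diff in the '> added' / '< removed' style used in this comment."""
    return [f"> {flag}" for flag in added] + [f"< {flag}" for flag in removed]
```

Usage would be along the lines of reading two saved dumps (for example one taken under 3.13 and one under 4.4), passing their contents to diff_flags, and printing the lines from format_diff.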

So interestingly, none of the flags that 4.4 adds on LVL1 show up in the guest.
But one other flag that does differ, and looks interesting, is "arch_capabilities".

I haven't found a good way to control arch_capabilities yet.
It is actually part of the Spectre backports, like [1]. I haven't seen it in that form in the code that you added to qemu 2.0, but it is at least related.

So the LVL1 4.4 has some empty flags/features that the older qemu 2.0 does not filter, and hence the guest gets a broken MSR for MSR_IA32_ARCH_CAPABILITIES.
That is what breaks the guests.

Given that:
- nested (especially in these much older versions of KVM/Qemu) is not very well supported
- this issue seems to depend on other security fixes (in the 4.4 kernel)
- qemu 2.0 is out in ESM, and this is not a fix required for that

I'd call it confirmed, but priority wishlist, and unless convinced otherwise I probably won't work on it for now.

I hope the analysis helps if e.g. the security team wants to take a look at all MSR_IA32_ARCH_CAPABILITIES related changes. One could e.g. actually read CPUID_7_0_EDX_ARCH_CAPABILITIES in the LVL2 guest that is broken. I'm rather sure it has malformed or incomplete content.
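
A rough sketch of how the MSR content could be inspected from inside the broken LVL2 guest follows. It rests on several assumptions: the msr kernel module is loaded, the script runs as root, the MSR index is 0x10a, and the bit names are as I remember them from the kernel's msr-index.h (so the table may be incomplete). The helper names are hypothetical. If the guest advertises the CPUID bit but the MSR is not actually backed, the read would likely fail with an I/O error, which would itself be a useful data point.

```python
import os
import struct

# Assumed MSR index for IA32_ARCH_CAPABILITIES.
MSR_IA32_ARCH_CAPABILITIES = 0x10A

# Assumed bit layout; possibly incomplete, check against msr-index.h.
ARCH_CAP_BITS = {
    0: "RDCL_NO",
    1: "IBRS_ALL",
    2: "RSBA",
    3: "SKIP_L1DFL_VMENTRY",
    4: "SSB_NO",
    5: "MDS_NO",
}


def read_msr(msr, cpu=0):
    """Read a 64-bit MSR through /dev/cpu/N/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr device uses the file offset as the MSR index.
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def decode_arch_capabilities(value):
    """Return (names of known bits set, mask of set bits not in the table)."""
    names = [name for bit, name in sorted(ARCH_CAP_BITS.items())
             if value & (1 << bit)]
    known_mask = sum(1 << bit for bit in ARCH_CAP_BITS)
    return names, value & ~known_mask
```

In the guest one would call decode_arch_capabilities(read_msr(MSR_IA32_ARCH_CAPABILITIES)) and compare the result with the same call on LVL1; a non-zero second element means bits are set that this table does not know about.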

[1]: https://lwn.net/Articles/746119/