Of course I spoke too soon: on T3.13/Q2.0/B4.15 I now hit an FPU issue.
That builds up to a (recursive) kernel stack crash:
[ 2.394255] Bad FPU state detected at fpu__clear+0x6b/0xd0, reinitializing FPU registers.
[...]
BUG: stack guard page was hit at (ptrval) (stack is (ptrval).. (ptrval))
That is again related to MSR handling.
So disabling a few features, but keeping md-clear (MDS) as needed for this test, helps:
<cpu mode='custom' match='exact'>
<model fallback='allow'>kvm64</model>
<feature policy='require' name='ssbd'/>
<feature policy='require' name='md-clear'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='pcid'/>
</cpu>
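For anyone wanting to double check which features such a definition actually requires, here is a minimal sketch that parses the snippet above with Python's standard library. The `CPU_XML` string and the `required_features` helper are only illustrative names, not part of libvirt; in practice the XML would come from the domain definition.

```python
import xml.etree.ElementTree as ET

# The <cpu> element from above, pasted in as a string (illustrative input).
CPU_XML = """
<cpu mode='custom' match='exact'>
<model fallback='allow'>kvm64</model>
<feature policy='require' name='ssbd'/>
<feature policy='require' name='md-clear'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='pcid'/>
</cpu>
"""

def required_features(cpu_xml):
    """Return (model name, list of features with policy='require')."""
    cpu = ET.fromstring(cpu_xml)
    model = cpu.findtext("model")
    feats = [f.get("name") for f in cpu.findall("feature")
             if f.get("policy") == "require"]
    return model, feats

model, feats = required_features(CPU_XML)
print(model, feats)
```

This only inspects the XML; it does not tell you whether the L1 hypervisor really exposes those features to the nested guest.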
As far as I read your updates, you have not really tested with 3.13 at the LVL1.
I expect that even 3.13 -> 4.4 already brought quite a few nested fixes that made this "better but not perfect", so you haven't seen it.
Have I already said that nested KVM on x86 can be unreliable?