Mostly the same info as on a related kernel.org bugzilla entry [0].
[0]: https://bugzilla.kernel.org/show_bug.cgi?id=205441
We got reports of issues with Linux guests run under QEMU/KVM on hosts with old Intel CPUs after a recent kernel update, which is based on Ubuntu-5.0.0-33.35.
I bisected this here, with the following result:
git bisect log
# bad: [3b931173c97b0d73f80ea55b72bb2966a246167f] UBUNTU: Ubuntu-5.0.0-33.35
# good: [5d5a6b36e94909962297fae609bff487de3cc43a] UBUNTU: Ubuntu-5.0.0-30.32
git bisect start '3b931173c97b0d73f80ea55b72bb2966a246167f' '5d5a6b36e94909962297fae609bff487de3cc43a'
# good: [7b4f844b33969ab166800f8936beef153fab736e] net/ibmvnic: free reset work of removed device from queue
git bisect good 7b4f844b33969ab166800f8936beef153fab736e
# bad: [6c1fc88702a4f33886b44ce5b6f374893b95e369] arm64: tlb: Ensure we execute an ISB following walk cache invalidation
git bisect bad 6c1fc88702a4f33886b44ce5b6f374893b95e369
# good: [e627a027b54eccc95f9e374d69aead7f1498877b] loop: Add LOOP_SET_DIRECT_IO to compat ioctl
git bisect good e627a027b54eccc95f9e374d69aead7f1498877b
# good: [29919eff6333bc67ec580b454afdd8b49883df2f] libata/ahci: Drop PCS quirk for Denverton and beyond
git bisect good 29919eff6333bc67ec580b454afdd8b49883df2f
# good: [cb44193f94af73928f8df049ffbb6b4a0be136ae] PM / devfreq: passive: fix compiler warning
git bisect good cb44193f94af73928f8df049ffbb6b4a0be136ae
# good: [b1d479b27b26966aea931094b31864979d7f8102] scsi: implement .cleanup_rq callback
git bisect good b1d479b27b26966aea931094b31864979d7f8102
# bad: [ec15813844b05d8cbd4352c65a20e57d16f9f936] media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table
git bisect bad ec15813844b05d8cbd4352c65a20e57d16f9f936
# good: [e83601f51a90d9739ced9ff42b6f202f8f802c72] parisc: Disable HP HSC-PCI Cards to prevent kernel crash
git bisect good e83601f51a90d9739ced9ff42b6f202f8f802c72
# good: [6d393bdf3b3f4b629070329488d3c6a3e142602b] KVM: x86: set ctxt->have_exception in x86_decode_insn()
git bisect good 6d393bdf3b3f4b629070329488d3c6a3e142602b
# bad: [208007519a7385a57b0c0a3c180142a521594876] KVM: x86: Manually calculate reserved bits when loading PDPTRS
git bisect bad 208007519a7385a57b0c0a3c180142a521594876
# first bad commit: [208007519a7385a57b0c0a3c180142a521594876] KVM: x86: Manually calculate reserved bits when loading PDPTRS
Which is:
KVM: x86: Manually calculate reserved bits when loading PDPTRS
BugLink: https://bugs.launchpad.net/bugs/1848367
commit 16cfacc8085782dab8e365979356ce1ca87fd6cc upstream.
Manually generate the PDPTR reserved bit mask when explicitly loading
PDPTRs. The reserved bits that are being tracked by the MMU reflect the
current paging mode, which is unlikely to be PAE paging in the vast
majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
__set_sregs(), etc... This can cause KVM to incorrectly signal a bad
PDPTR, or more likely, miss a reserved bit check and subsequently fail
a VM-Enter due to a bad VMCS.GUEST_PDPTR.
Add a one off helper to generate the reserved bits instead of sharing
code across the MMU's calculations and the PDPTR emulation. The PDPTR
reserved bits are basically set in stone, and pushing a helper into
the MMU's calculation adds unnecessary complexity without improving
readability.
Opportunistically fix/update the comment for load_pdptrs().
Note, the buggy commit also introduced a deliberate functional change,
"Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
effectively (and correctly) reverted by commit cd9ae5fe47df ("KVM: x86:
Fix page-tables reserved bits"). A bit of SDM archaeology shows that
the SDM from late 2008 had a bug (likely a copy+paste error) where it
listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
for 2mb entries. I.e. the SDM contradicted itself, and bits 6:5 are and
always have been reserved.
Fixes: 20c466b56168d ("KVM: Use rsvd_bits_mask in load_pdptrs()")
Cc: <email address hidden>
Cc: Nadav Amit <email address hidden>
Reported-by: Doug Reiland <email address hidden>
Signed-off-by: Sean Christopherson <email address hidden>
Reviewed-by: Peter Xu <email address hidden>
Signed-off-by: Paolo Bonzini <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
Signed-off-by: Kleber Sacilotto de Souza <email address hidden>
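To make the check described in the commit message above concrete, here is a minimal standalone sketch of a PAE PDPTE reserved-bit test. It assumes the PDPTE layout from the SDM (bits 2:1, 8:5 and 63:MAXPHYADDR reserved). The name pdpte_is_valid() and the plain maxphyaddr parameter are illustrative assumptions, not KVM's actual interface; in KVM the physical address width would come from the guest's CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with bits s..e (inclusive) set; empty mask if e < s. */
static uint64_t rsvd_bits(int s, int e)
{
	if (e < s)
		return 0;
	return ((2ULL << (e - s)) - 1) << s;
}

/*
 * Reserved bits of a PAE PDPTE: 2:1, 8:5 and 63:MAXPHYADDR.
 * These do not depend on the paging mode the MMU is currently
 * tracking, which is why they can be computed on their own.
 */
static uint64_t pdptr_rsvd_bits(int maxphyaddr)
{
	return rsvd_bits(maxphyaddr, 63) | rsvd_bits(5, 8) | rsvd_bits(1, 2);
}

#define PDPTE_PRESENT (1ULL << 0)

/*
 * Hypothetical helper: reserved bits are only checked for a present
 * PDPTE; a non-present entry is accepted as-is.
 */
static bool pdpte_is_valid(uint64_t pdpte, int maxphyaddr)
{
	if (!(pdpte & PDPTE_PRESENT))
		return true;
	return (pdpte & pdptr_rsvd_bits(maxphyaddr)) == 0;
}
```

With a 36-bit physical address width, a present entry with bit 5 set would be rejected by this sketch, while the same value without the present bit would pass.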
This one is also included in the 4.19.81 (or more correctly, it's there since
v4.19.77) with commit 496cf984a60edb5534118a596613cc9971e406e8 [0] or
upstream commit 16cfacc8085782dab8e365979356ce1ca87fd6cc [1].
[0]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=v4.19.82&id=496cf984a60edb5534118a596613cc9971e406e8
[1]: https://git.kernel.org/torvalds/c/16cfacc8085782dab8e365979356ce1ca87fd6cc
Funny thing is: I cannot reproduce this with a 5.3.7 (Eoan) kernel, which _also_
includes the above commit. So possibly another patch is missing in the backport;
I did not find anything obvious though...
So, summary for a reproducer:
* Dust off a host with an old Intel CPU, e.g. an Intel Core2Duo CPU E8500 @ 3.16GHz
(something else such as Westmere, Conroe or the like should work too, or anything
released over 10 years ago).
* Install a Linux distro, or just boot its installer, in a VM. I used Debian 9,
as our users had issues with that but *not* with an Ubuntu 19.10 VM.
* See how it boot loops once a stable kernel with the above commit [0] backported
is used on the host.