Guests using IBRS incur a large performance penalty

Bug #1764956 reported by Vincent Bernat on 2018-04-18
24
This bug affects 4 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
High
Gavin Guo
Trusty
Undecided
Unassigned
Xenial
High
Gavin Guo

Bug Description

[Impact]
the IBRS would be mistakenly enabled in the host when the switching
from an IBRS-enabled VM and that causes the performance overhead in
the host. The other condition could also mistakenly disables the IBRS
in VM when context-switching from the host. And this could be
considered a CVE host.

[Fix]
The patch fixes the logic inside the x86_virt_spec_ctrl that it checks
the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the
x86_spec_ctrl_base by default is zero. Because the upstream
implementation is not equal to the Xenial's implementation. Upstream
doesn't use the IBRS as the formal fix. So, by default, it's zero.

On the other hand, after the VM exit, the SPEC_CTRL register also
needs to be saved manually by reading the SPEC_CTRL MSR as the MSR
intercept is disabled by default in the hardware_setup(v4.4) and
vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and
doesn't trigger a trap. So, the vmx_set_msr() function isn't called.

The v3.13 kernel hasn't been tested. However, the patch can be viewed
at:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru

The v4.4 patch:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg

[Test]

The patch has been tested on the 4.4.0-140.166 and works fine.

The reproducing environment:
Guest kernel version: 4.4.0-138.164
Host kernel version: 4.4.0-140.166

(host IBRS, guest IBRS)

- 1). (0, 1).
The case can be reproduced by the following instructions:
guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
1

<Several minutes later...>

host$ cat /proc/sys/kernel/ibrs_enabled
0
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111000000000000000000010010100000000000000000

Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly
enabled.

host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
stress-ng: info: [11264] defaulting to a 86400 second run per stressor
stress-ng: info: [11264] dispatching hogs: 1 cpu
stress-ng: info: [11264] cache allocate: default cache size: 35840K
stress-ng: info: [11264] successful run completed in 33.48s

The host kernel didn't notice the IBRS bit is enabled. So, the situation
is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host.
And running the stress-ng is a pure userspace CPU capability
calculation. So, the performance downgrades to about 1/3. Without the
IBRS enabled, it needs about 10s.

- 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0).
The guest IBRS has been mistakenly disabled.

guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111
host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

CVE References

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1764956

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Vincent Bernat (vbernat) wrote :

No log files to provide.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
status: New → Triaged
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
tags: added: kernel-da-key xenial
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Vincent, thanks for your report - really good research, even indicating a potential fix.
I manage to reproduce this issue using the following steps:

1) Be sure to run Ubuntu 16.04 on both guest and host, both
running kernel 4.4.0 (could be latest version of this kernel)

2) In the guest, install the package "msr-tools", like:
"sudo apt install msr-tools"

3) Still on guest, run: "sudo modprobe msr; sudo wrmsr 0x48 1"
This will enable the MSR bit for IBRS inside the guest

4) Now the host shows bad performance

If we run "sudo wrmsr 0x48 0" in the guest, host gets its performance back.

I'll investigate some commits upstream, including the one you suggested, and once we figure the exact fix for this, will request SRU to the kernel team.
Thanks,

Guilherme

Changed in linux (Ubuntu):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
tags: added: sts
Guilherme G. Piccoli (gpiccoli) wrote :

Gavin Guo had provided a patch to fix this issue, it's a SRU candidate: https://lists.ubuntu.com/archives/kernel-team/2018-November/096844.html

I'll update this issue accordingly, thanks Gavin!
Cheers,

Guilherme

Changed in linux (Ubuntu):
importance: Medium → High
Changed in linux (Ubuntu Xenial):
importance: Medium → High
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Xenial):
status: Triaged → In Progress
Guilherme G. Piccoli (gpiccoli) wrote :

Vincent, I'll update the description to match the SRU requirements, but your original description will still be available.

description: updated
Gavin Guo (mimi0213kimo) on 2018-11-29
Changed in linux (Ubuntu Xenial):
assignee: Guilherme G. Piccoli (gpiccoli) → Gavin Guo (mimi0213kimo)
Changed in linux (Ubuntu):
assignee: Guilherme G. Piccoli (gpiccoli) → Gavin Guo (mimi0213kimo)
Changed in linux (Ubuntu Trusty):
status: New → In Progress
Juerg Haefliger (juergh) wrote :

The update to stable 4.4.169 will fix the majority of the issue.

Stefan Bader (smb) on 2019-02-01
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Terry Rudd (terrykrudd) wrote :

Final reminder: We are at the end of the SRU Cycle and request that you please provide verification the kernel in proposed resolves the problem for which this bug was submitted. -Thank you!

Guilherme G. Piccoli (gpiccoli) wrote :

I've managed to verify this on kernel 4.4.0-143-generic, the problem isn't reproducible anymore.

Worth noticing that when using the kernel 4.4.0-143 in both host and guest, we cannot change MSR bit anymore from within the guest, so the performance drop is not observed.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :
Download full text (16.2 KiB)

This bug was fixed in the package linux - 4.4.0-143.169

---------------
linux (4.4.0-143.169) xenial; urgency=medium

  * linux: 4.4.0-143.169 -proposed tracker (LP: #1814647)

  * x86/kvm: Backport fixup and missing commits (LP: #1811646)
    - KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
    - kvm: nVMX: VMCLEAR an active shadow VMCS after last use
    - X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs
    - KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR
      path as unlikely()
    - kvm: x86: IA32_ARCH_CAPABILITIES is always supported
    - KVM: SVM: Add MSR-based feature support for serializing LFENCE
    - KVM: X86: Allow userspace to define the microcode version
    - KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
    - KVM: VMX: fixes for vmentry_l1d_flush module parameter
    - kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
    - kvm: vmx: Scrub hardware GPRs at VM-exit
    - SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic
    - SAUCE: KVM: Move code fragments, cleanup and re-indent

  * linux-buildinfo: pull out ABI information into its own package
    (LP: #1806380)
    - [Packaging] limit preparation to linux-libc-dev in headers
    - [Packaging] commonise debhelper invocation
    - [Packaging] ABI -- accumulate abi information at the end of the build
    - [Packaging] buildinfo -- add basic build information
    - [Packaging] buildinfo -- add firmware information to the flavour ABI
    - [Packaging] buildinfo -- add compiler information to the flavour ABI
    - [Packaging] buildinfo -- add buildinfo support to getabis
    - [Config] buildinfo -- add retpoline version markers
    - [Packaging] getabis -- handle all known package combinations
    - [Packaging] getabis -- support parsing a simple version

  * signing: only install a signed kernel (LP: #1764794)
    - [Packaging] update to Debian like control scripts
    - [Packaging] switch to triggers for postinst.d postrm.d handling
    - [Packaging] signing -- switch to raw-signing tarballs
    - [Packaging] signing -- switch to linux-image as signed when available
    - [Packaging] printenv -- add signing options
    - [Packaging] fix invocation of header postinst hooks
    - [Packaging] signing -- add support for signing Opal kernel binaries
    - [Debian] Use src_pkg_name when constructing udeb control files
    - [Debian] Dynamically determine linux udebs package name
    - [Packaging] handle both linux-lts* and linux-hwe* as backports
    - [Config] linux-source-* is in the primary linux namespace
    - [Packaging] lookup the upstream tag
    - [Packaging] zfs/spl -- enhance provides information
    - [Packaging] switch up to debhelper 9
    - [Packaging] autopkgtest -- disable d-i when dropping flavours
    - [debian] support for ship_extras_package=false
    - [Debian] do_common_tools should always be on
    - [debian] do not force do_tools_common
    - [Packaging] Add linux-tools-host package for VM host tools
    - [Packaging] signing should be conditional
    - [Packaging] skip cloud tools packaging when not building package
    - [Packaging] add acpidbg
    - [debian] prep linu...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers