Fix (+follow-up) needed for SEV-SNP vulnerability

Bug #2013198 reported by Khaled El Mously
Affects              Status         Importance  Assigned to  Milestone
linux (Ubuntu)       Incomplete     Undecided   Unassigned
  Kinetic            Fix Committed  Medium      Unassigned
linux-gcp (Ubuntu)   Fix Released   Undecided   Unassigned
  Jammy              Fix Released   Undecided   Unassigned

Bug Description

From email discussions with Dionna Glaze from Google:

> This email details a critical vulnerability in SEV-SNP attestation
> report integrity protection that must be patched in SEV-SNP-enabled
> kernels.
>
> I'm reaching out since I've been tracking our progress towards a
> stable offering of customer access to SEV-SNP "guest requests". I'd
> like to know how or if y'all test the /dev/sev-guest driver.
>
> The reason I ask is because our host KVM injects failures into the
> guest if requests come too frequently. Test suites that request
> attestation reports in quick succession will fail without very recent
> patches or workaround code in user space.
>
> Technical details, tl;dr
> * Nov 21, 2022: Linux Kernel 6.1 included a security patch 47894e0fa
> that will cause attestation to fail frequently (in GCE). Peter found
> and patched this vulnerability.
>
> Details of security patch 47894e0fa:
> This patch to sev-guest causes more fail-closed situations. All VMM
> errors other than INVALID_LEN will wipe out the VMPCK and close the
> guest's ability to communicate with the security processor.
> Ratelimit failures will also cause a fail-closed situation.
>
> As you may know, guest requests are encrypted by the guest with
> AES_GCM (not AES_GCM_SIV) and then passed through unencrypted memory
> to the host's KVM. KVM forwards that to the crypto/ccp driver to
> deliver to the AMD secure processor to respond to. When the VMM
> returns an error instead of forwarding a request to the secure
> processor, then the guest driver *does not* increment its IV. It can
> therefore reuse an IV on multiple messages with different contents.
> This breaks AES_GCM's security guarantees.
>
> Ratelimiting looks to the guest not as a stalled vCPU, but rather a
> special error response that AMD will include in their next published
> version of the GHCB protocol (I believe v2.02). This allows the guest
> VM to schedule other threads and remain productive while waiting up to
> 2 seconds for a request to be serviced. The special error code to an
> unpatched kernel is just forwarded to the guest as an EIO. User space
> may continue to issue requests, even if it is unsafe to do so.
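
The IV-reuse problem and the fail-closed behaviour described in the email can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel driver's code: `toy_encrypt` is a SHA-256 counter-mode stand-in for AES-GCM's keystream (no authentication tag), and `GuestChannel`, `Throttled`, `VmmError`, the key value, and the one-step sequence increment are all hypothetical names and simplifications. The 2-second limit is taken from the email; wiping the key after that limit expires is an assumption.

```python
import hashlib
import time


class Throttled(Exception):
    """Hypothetical stand-in for the host's rate-limit ("busy") response."""


class VmmError(Exception):
    """Hypothetical stand-in for any other error returned by the VMM."""


def toy_encrypt(key: bytes, seqno: int, plaintext: bytes) -> bytes:
    """Toy counter-mode cipher (NOT AES-GCM): XOR with a keystream from key + IV."""
    iv = seqno.to_bytes(12, "big")
    stream = b""
    block = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + iv + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


class GuestChannel:
    """Model of a guest that never encrypts two different messages under one IV."""

    def __init__(self, send, key=b"hypothetical-vmpck", max_wait=2.0,
                 delay=0.05, sleep=time.sleep):
        self.send, self.key = send, key
        self.max_wait, self.delay, self.sleep = max_wait, delay, sleep
        self.seqno = 1  # used as the IV

    def request(self, plaintext: bytes) -> bytes:
        if self.key is None:
            raise RuntimeError("channel closed: key was wiped")
        # Encrypt exactly once for this sequence number.
        ciphertext = toy_encrypt(self.key, self.seqno, plaintext)
        deadline = time.monotonic() + self.max_wait
        while True:
            try:
                reply = self.send(ciphertext)
            except Throttled:
                if time.monotonic() < deadline:
                    self.sleep(self.delay)
                    continue  # resend the SAME ciphertext, never a new message
                self.key = None  # assumption: fail closed once the wait expires
                raise
            except VmmError:
                self.key = None  # fail closed: this IV may already be exposed
                raise
            self.seqno += 1  # simplification: advance the IV only on success
            return reply
```

If the same `seqno` were used for two different plaintexts, XOR-ing the two ciphertexts would cancel the keystream and expose the XOR of the plaintexts, which is the counter-mode weakness that IV reuse introduces. Retrying the identical ciphertext while throttled avoids that, and wiping the key on any other error prevents a later message from being encrypted under an IV the host has already seen.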

no longer affects: linux-gcp (Ubuntu)
description: updated
no longer affects: linux-oracle (Ubuntu)
no longer affects: linux-oracle (Ubuntu Jammy)
no longer affects: linux-oracle (Ubuntu Kinetic)
no longer affects: linux-oracle (Ubuntu Lunar)
no longer affects: linux-gcp (Ubuntu Kinetic)
no longer affects: linux-gcp (Ubuntu Lunar)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2013198

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
no longer affects: linux (Ubuntu Lunar)
Stefan Bader (smb)
Changed in linux (Ubuntu Kinetic):
importance: Undecided → Medium
Changed in linux (Ubuntu Kinetic):
status: New → Fix Committed
Changed in linux-gcp (Ubuntu Jammy):
status: New → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp/5.19.0-1024.26 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-gcp verification-needed-kinetic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-gcp - 5.19.0-1024.26

---------------
linux-gcp (5.19.0-1024.26) kinetic; urgency=medium

  * kinetic/linux-gcp: 5.19.0-1024.26 -proposed tracker (LP: #2016490)

  * Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts

  * Fix (+follow-up) needed for SEV-SNP vulnerability (LP: #2013198)
    - virt/coco/sev-guest: Add throttling awareness

  * Miscellaneous Ubuntu changes
    - [config] Set SEV_GUEST back to =y

  [ Ubuntu: 5.19.0-42.43 ]

  * kinetic/linux: 5.19.0-42.43 -proposed tracker (LP: #2016503)
  * selftest: fib_tests: Always cleanup before exit (LP: #2015956)
    - selftest: fib_tests: Always cleanup before exit
  * Debian autoreconstruct Fix restoration of execute permissions (LP: #2015498)
    - [Debian] autoreconstruct - fix restoration of execute permissions
  * Kinetic update: upstream stable patchset 2023-04-10 (LP: #2015812)
    - drm/etnaviv: don't truncate physical page address
    - wifi: rtl8xxxu: gen2: Turn on the rate control
    - drm/edid: Fix minimum bpc supported with DSC1.2 for HDMI sink
    - clk: mxl: Switch from direct readl/writel based IO to regmap based IO
    - clk: mxl: Remove redundant spinlocks
    - clk: mxl: Add option to override gate clks
    - clk: mxl: Fix a clk entry by adding relevant flags
    - powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G
    - clk: mxl: syscon_node_to_regmap() returns error pointers
    - random: always mix cycle counter in add_latent_entropy()
    - KVM: x86: Fail emulation during EMULTYPE_SKIP on any exception
    - KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid
    - can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
    - powerpc: dts: t208x: Disable 10G on MAC1 and MAC2
    - powerpc/vmlinux.lds: Ensure STRICT_ALIGN_SIZE is at least page aligned
    - powerpc/64s/radix: Fix RWX mapping with relocated kernel
    - uaccess: Add speculation barrier to copy_from_user()
    - wifi: mwifiex: Add missing compatible string for SD8787
    - audit: update the mailing list in MAINTAINERS
    - ext4: Fix function prototype mismatch for ext4_feat_ktype
    - Revert "net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo
      child qdiscs"
    - bpf: add missing header file include
    - wifi: ath11k: fix warning in dma_free_coherent() of memory chunks while
      recovery
    - sched/psi: Stop relying on timer_pending() for poll_work rescheduling
    - docs: perf: Fix PMU instance name of hisi-pcie-pmu
    - randstruct: disable Clang 15 support
    - ionic: refactor use of ionic_rx_fill()
    - Fix XFRM-I support for nested ESP tunnels
    - arm64: dts: rockchip: drop unused LED mode property from rk3328-roc-cc
    - ARM: dts: rockchip: add power-domains property to dp node on rk3288
    - HID: elecom: add support for TrackBall 056E:011C
    - ACPI: NFIT: fix a potential deadlock during NFIT teardown
    - btrfs: send: limit number of clones and allocated memory size
    - ASoC: rt715-sdca: fix clock stop prepare timeout issue
    - IB/hfi1: Assign npages earlier
    - neigh: make sure used and confirmed times are valid
    - HID: core: Fix deadloop in hid_apply_multiplier.
    - x86/c...

Changed in linux-gcp (Ubuntu):
status: New → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp/5.15.0-1036.44 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-gcp verification-needed-jammy
tags: added: verification-done-jammy verification-done-kinetic
removed: verification-needed-jammy verification-needed-kinetic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-gcp - 5.15.0-1036.44

---------------
linux-gcp (5.15.0-1036.44) jammy; urgency=medium

  * jammy/linux-gcp: 5.15.0-1036.44 -proposed tracker (LP: #2019389)

  * Use new annotations model (LP: #2019000)
    - [Config] migrate all configs into annotations

  * Fix (+follow-up) needed for SEV-SNP vulnerability (LP: #2013198)
    - virt/sev-guest: Prevent IV reuse in the SNP guest driver
    - virt/coco/sev-guest: Add throttling awareness

  [ Ubuntu: 5.15.0-74.81 ]

  * jammy/linux: 5.15.0-74.81 -proposed tracker (LP: #2019420)
  * smartpqi: Update 22.04 driver to include recent bug fixes and support
    current generation devices (LP: #1998643)
    - scsi: smartpqi: Switch to attribute groups
    - scsi: smartpqi: Fix rmmod stack trace
    - scsi: smartpqi: Add PCI IDs
    - scsi: smartpqi: Enable SATA NCQ priority in sysfs
    - scsi: smartpqi: Eliminate drive spin down on warm boot
    - scsi: smartpqi: Quickly propagate path failures to SCSI midlayer
    - scsi: smartpqi: Fix a name typo and cleanup code
    - scsi: smartpqi: Fix a typo in func pqi_aio_submit_io()
    - scsi: smartpqi: Resolve delay issue with PQI_HZ value
    - scsi: smartpqi: Avoid drive spin-down during suspend
    - scsi: smartpqi: Update volume size after expansion
    - scsi: smartpqi: Speed up RAID 10 sequential reads
    - scsi: smartpqi: Expose SAS address for SATA drives
    - scsi: smartpqi: Fix NUMA node not updated during init
    - scsi: smartpqi: Fix BUILD_BUG_ON() statements
    - scsi: smartpqi: Fix hibernate and suspend
    - scsi: smartpqi: Fix lsscsi -t SAS addresses
    - scsi: smartpqi: Update version to 2.1.14-035
    - scsi: smartpqi: Fix unused variable pqi_pm_ops for clang
    - scsi: smartpqi: Stop using the SCSI pointer
    - scsi: smartpqi: Fix typo in comment
    - scsi: smartpqi: Shorten drive visibility after removal
    - scsi: smartpqi: Add controller fw version to console log
    - scsi: smartpqi: Add PCI IDs for ramaxel controllers
    - scsi: smartpqi: Close write read holes
    - scsi: smartpqi: Add driver support for multi-LUN devices
    - scsi: smartpqi: Fix PCI control linkdown system hang
    - scsi: smartpqi: Add PCI ID for Adaptec SmartHBA 2100-8i
    - scsi: smartpqi: Add PCI IDs for Lenovo controllers
    - scsi: smartpqi: Stop logging spurious PQI reset failures
    - scsi: smartpqi: Fix RAID map race condition
    - scsi: smartpqi: Add module param to disable managed ints
    - scsi: smartpqi: Update deleting a LUN via sysfs
    - scsi: smartpqi: Add ctrl ready timeout module parameter
    - scsi: smartpqi: Update copyright to current year
    - scsi: smartpqi: Update version to 2.1.18-045
    - scsi: smartpqi: Convert to host_tagset
    - scsi: smartpqi: Add new controller PCI IDs
    - scsi: smartpqi: Correct max LUN number
    - scsi: smartpqi: Change sysfs raid_level attribute to N/A for controllers
    - scsi: smartpqi: Correct device removal for multi-actuator devices
    - scsi: smartpqi: Add controller cache flush during rmmod
    - scsi: smartpqi: Initialize feature section info
    - scsi: smartpqi: Change version to 2.1.20-035
  * CVE-2023-32233
    - netfilter: nf_tables: de...

Changed in linux-gcp (Ubuntu Jammy):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.19.0-47.49 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux verification-needed-kinetic
removed: verification-done-kinetic