[23.04 FEAT] KVM: Secure Execution guest dump encryption with customer keys - qemu part

Bug #1959966 reported by bugproxy
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Ubuntu on IBM z Systems | Fix Released | High | Skipper Bug Screeners |
qemu (Ubuntu) | Fix Released | High | Skipper Bug Screeners |
Jammy | In Progress | Undecided | Sergio Durigan Junior |

Bug Description

SRU Justification:

[ Impact ]

 * Hypervisor-initiated dumps for Secure Execution (aka confidential computing)
   guests are not helpful because memory and CPU state is encrypted by a
   transient key only available to the Ultravisor.

 * Workload owners can still configure kdump in order to obtain kernel crash
   information, but there are situations where kdump doesn't work.

 * In such situations problem determination is severely impeded.

 * This patch set solves this by implementing dumps created in a way
   that can only be decrypted by the owner of the guest image
   and be used for problem determination.

[ Test Plan ]

 * The setup of a Secure Execution environment is not trivial
   and requires a certain set of hardware (IBM z15 or higher
   with FC 115).

 * In addition to the modifications of qemu that are handled in this
   LP bug, modifications of the kernel (LP#1959940) and
   the s390-tools (LP#1959965) are required.

 * So at least a modified kernel and qemu test builds are needed
   or both should be in -proposed at the same time (which might
   be difficult).
   A modified s390-tools is not urgently needed, since for the
   verification of the kernel and qemu part a newer version
   can be used (but a modified s390-tools is also available in PPA).

 * A detailed description (using Ubuntu as example) of how to set up
   secure execution is available here:
   Introducing IBM Secure Execution for Linux, April 2024 update
   https://www.ibm.com/docs/en/linuxonibm/pdf/lx24se04.pdf

 * And information on 'Working with dumps of KVM guests in
   IBM Secure Execution mode' is available here:
   https://www.ibm.com/docs/en/linux-on-systems?topic=commands-zgetdump#czgetdump__se_dump_examples

[ Where problems could occur ]

 * Mainly dump code (dump/dump.c and include/sysemu/dump.h) is modified,
   which may lead to broken or incorrect dumps,
   also for non-secure-execution guests. (So testing of both is needed.)

 * Modifications in the ELF header handling
   as well as wrong hardware address and offset calculation can
   (in the worst case) lead to unusable files.

 * Modifications in dump state handling may cause issues when generating
   the dump itself.

 * Modifications need to be endianness-aware, since this secure
   execution dump is for s390x (big-endian); otherwise the dumps
   become useless.

 * Functions for writing the header got modified (and split),
   which may lead to wrong headers (if done erroneously).

 * It's a big patch set in general, which may bring further unforeseen
   effects, but it's worth mentioning that the code has been accepted
   upstream for quite a while (qemu 7.2) and has already been included
   in Ubuntu since 23.04, where it is successfully in use.

 * In addition, the packages from the PPA test build were tested upfront.
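
   A quick sanity check for the header and endianness risks listed above
   could look like the following sketch. It inspects only the first 20 bytes
   of a dump file and relies on the standard ELF constants ELFCLASS64 (2),
   ELFDATA2MSB (2) and EM_S390 (22); the function name is made up for this
   example and is not part of qemu or s390-tools.

```python
import struct

EI_NIDENT = 16     # size of the e_ident array at the start of an ELF file
ELFCLASS64 = 2     # 64-bit objects
ELFDATA2MSB = 2    # big-endian data encoding
EM_S390 = 22       # e_machine value used for s390 and s390x


def check_s390x_elf_header(header: bytes) -> list[str]:
    """Return a list of problems found in the start of an ELF header."""
    if len(header) < EI_NIDENT + 4:
        return ["file too short to contain an ELF header"]
    problems = []
    if header[:4] != b"\x7fELF":
        problems.append("missing ELF magic")
    if header[4] != ELFCLASS64:
        problems.append("not a 64-bit ELF file")
    if header[5] != ELFDATA2MSB:
        problems.append("not marked as big-endian")
    # e_type and e_machine follow e_ident; on s390x they are big-endian.
    _e_type, e_machine = struct.unpack_from(">HH", header, EI_NIDENT)
    if e_machine != EM_S390:
        problems.append(f"unexpected e_machine {e_machine}")
    return problems


# Synthetic headers for illustration: ET_CORE (4) on EM_S390.
ident = b"\x7fELF" + bytes([ELFCLASS64, ELFDATA2MSB, 1]) + bytes(9)
good = ident + struct.pack(">HH", 4, EM_S390)
swapped = ident + struct.pack("<HH", 4, EM_S390)  # fields written little-endian

print(check_s390x_elf_header(good))
print(check_s390x_elf_header(swapped))
```

   For a real dump, pass the first 20 bytes of the file, for example
   `open(path, "rb").read(20)`. This does not replace a full test with
   zgetdump or crash; it only catches grossly wrong headers early.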

[ Other Info ]

 * Since 22.04 is a popular LTS release, it is already in use by many
   secure execution customers.
   But in case of severe crashes or issues in the secure execution
   (KVM) guests, dumps cannot be used as of today.

 * This enables customers, IBM and Canonical to get support in case of
   crashes/dumps on hardware that runs secure execution environments.
__________

KVM: Secure Execution guest dump encryption with customer keys - qemu part

Description:
Hypervisor-initiated dumps for Secure Execution guests are not helpful because memory and CPU state is encrypted by a transient key only available to the Ultravisor. Workload owners can still configure kdump in order to obtain kernel crash information, but there are situations where kdump doesn't work. In such situations problem determination is severely impeded. This feature will implement dumps created in a way that can only be decrypted by the owner of the guest image and be used for problem determination.

Request Type: Package - Update Version
Upstream Acceptance: In Progress
Code Contribution: IBM code

bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-196317 severity-high targetmilestone-inin2204
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-02-03 17:25 EDT-------
This also has a kernel and s390-tools part:

IBM BZ 196316 - LP#1959940 : [22.04 FEAT] KVM: Secure Execution guest dump encryption with customer keys - kernel part
IBM BZ 196318 - LP#1959965 : [22.04 FEAT] KVM: Secure Execution guest dump encryption with customer keys - s390-tools part

Frank Heimes (fheimes)
affects: linux (Ubuntu) → qemu (Ubuntu)
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in qemu (Ubuntu):
importance: Undecided → High
Changed in ubuntu-z-systems:
importance: Undecided → High
Changed in qemu (Ubuntu):
status: New → Incomplete
Changed in ubuntu-z-systems:
status: New → Incomplete
tags: added: qemu-22.04
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-03-20 19:31 EDT-------
Item didn't make it in time for jammy / 22.04, therefore we need to move this to Ubuntu 22.10.
Changing Target Milestone: from 22.04 ==> 22.10

tags: added: targetmilestone-inin2210
removed: targetmilestone-inin2204
Frank Heimes (fheimes)
summary: - [22.04 FEAT] KVM: Secure Execution guest dump encryption with customer
+ [22.10 FEAT] KVM: Secure Execution guest dump encryption with customer
keys - qemu part
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-09-12 00:57 EDT-------
This item didn't make it in time for kinetic / 22.10, therefore we have to move it to Ubuntu 23.04.
Changing Target Milestone to: 23.04

tags: added: targetmilestone-inin2304
removed: targetmilestone-inin2210
Frank Heimes (fheimes)
summary: - [22.10 FEAT] KVM: Secure Execution guest dump encryption with customer
+ [23.04 FEAT] KVM: Secure Execution guest dump encryption with customer
keys - qemu part
tags: added: qemu-23.04
removed: qemu-22.04
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Incomplete → New
Changed in qemu (Ubuntu):
status: Incomplete → New
Christian Ehrhardt  (paelzer) wrote :

Hi,
I'm merging qemu 7.2 now, is this completed upstream by now?
Either in 7.2 or at least in the latest main branch to pick from?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-01-09 03:03 EDT-------
It's both in master and 7.2 AFAIK

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Confirmed
Changed in qemu (Ubuntu):
status: New → Confirmed
Frank Heimes (fheimes)
Changed in qemu (Ubuntu):
status: Confirmed → In Progress
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-02-14 10:18 EDT-------
For this item there's a fix available but it's not yet merged into qemu:
https://<email address hidden>/T/#t

Frank Heimes (fheimes) wrote :

Meanwhile 7.2 landed in lunar-proposed:
qemu | 1:7.2+dfsg-3ubuntu1 | lunar-proposed
hence updating ticket status to Fix Committed.

Changed in qemu (Ubuntu):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:7.2+dfsg-4ubuntu1

---------------
qemu (1:7.2+dfsg-4ubuntu1) lunar; urgency=medium

  * Merge with Debian unstable (LP: #1993438), among many other fixes
    this resolves these bugs:
    (LP: #1957924) - support for querying stats,
    (LP: #1853307) - Enhanced Interpretation for PCI Functions (s390x)
    (LP: #1959966) - guest dump encryption with customer keys (s390x)
    (LP: #1999885) - pv: don't allow userspace to set the clock under PV
    (LP: #1957924) - add filtering of statistics by target vCPU
    remaining changes:
    - qemu-kvm to systemd unit
      - d/qemu-kvm-init: script for QEMU KVM preparation modules, ksm,
        hugepages and architecture specifics
      - d/qemu-system-common.qemu-kvm.service: systemd unit to call
        qemu-kvm-init
      - d/qemu-system-common.install: install helper script
      - d/qemu-system-common.qemu-kvm.default: defaults for
        /etc/default/qemu-kvm
      - d/rules: call dh_installinit and dh_installsystemd for qemu-kvm
    - Distribution specific machine type
      (LP: 1304107 1621042 1776189 1761372 1761372 1776189)
      - d/p/ubuntu/define-ubuntu-machine-types.patch: define distro machine
        types containing release versioned machine attributes
      - d/qemu-system-x86.NEWS Info on fixed machine type definitions
        for host-phys-bits=true
      - Add an info about -hpb machine type in debian/qemu-system-x86.NEWS
      - ubuntu-q35 alias added to auto-select the most recent q35 ubuntu type
    - Enable nesting by default
      - d/p/ubuntu/enable-svm-by-default.patch: Enable nested svm by default
        in qemu64 on amd
        [ No more strictly needed, but required for backward compatibility ]
    - tolerate ipxe size change on migrations to >=18.04 (LP: 1713490)
      - d/p/ubuntu/pre-bionic-256k-ipxe-efi-roms.patch: old machine types
        reference 256k path
      - d/control-in: depend on ipxe-qemu-256k-compat-efi-roms to be able to
        handle incoming migrations from former releases.
    - d/qemu-system-x86.README.Debian: add info about updated nesting changes
    - Ease the use of module retention on upgrades (LP 1913421)
      - debian/qemu-block-extra.postinst: enable mount unit on install/upgrade
    - d/control-in: switch qemu-system-x86-xen to qemu-system-xen as this
      landed in Debian but under a different name.
    - Remaining GCC-12 FTBFS (LP 1988710 + LP 1921664)
      + d/p/u/qboot-Disable-LTO-for-ELF-binary-build-step.patch:
        fix qboot FTBFS with LTO
  * Dropped Changes [now part of upstream v7.2.0]
    - d/p/u/lp1994002-migration-Read-state-once.patch: Fix for libvirt
      error 'migration was active, but no RAM info was set' (LP 1994002)
    - d/p/u/ebpf-replace-deprecated-bpf_program__set_socket_filt.patch:
      Fix FTBFS with libbpf 1.0.1-2.
      + Header updates that were added as part of the libbpf fixes
        but not mentioned in changelog
    - d/p/u/lp-1981339-*: fix s390x system emulation (LP 1981339)
    - Fix I/O stalls when using NVMe storage (LP 1970737).
      + d/p/lp1970737-linux-aio-*.patch: Fix unbalanced plugged counter
        in laio_io_unplug.
    - SECURITY UPDATE...

Changed in qemu (Ubuntu):
status: Fix Committed → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
information type: Private → Public
Frank Heimes (fheimes) wrote :

To get this as SRU into jammy/22.04, we would need a list of commits (maybe backports) that fit to qemu 6.2 that we have in jammy.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-08-08 08:53 EDT-------
Hey Frank, could you give me a repo url for qemu as well?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-08-15 09:00 EDT-------
I have a lot of problems when trying to compile a jammy qemu and I haven't yet been able to fix them.
Fortunately I was at least able to apply the following patches on top of the qemu jammy-devel source git (after a quilt push -a). If possible try to compile them and send me a ppa or deb file for testing.

These patches include the fix for paused VMs if a dump is being written onto a file system that's too small.

d12a91e0ba target/s390x/arch_dump: Add arch cleanup function for PV dumps
e72629e514 dump: Add arch cleanup function
816644b121 target/s390x/dump: Remove unneeded dump info function pointer init
4376a770c7 target/s390x/arch_dump: Simplify memory allocation in s390x_write_elf64_notes()
eb60026120 target/s390x/arch_dump: Fix memory corruption in s390x_write_elf64_notes()
ad3b2e693d s390x: Add protected dump cap
113d8f4e95 s390x: pv: Add dump support
1af0006ab9 dump: Replace opaque DumpState pointer with a typed one
9b72224f44 dump: Add architecture section and section string table support
13fd417ddc dump: Reintroduce memory_offset and section_offset
cb415fd61e dump: Write ELF section headers right after ELF header
e41ed29bce dump: Use a buffer for ELF section data and headers
94d788408d dump: fix kdump to work over non-aligned blocks
08df343874 dump: simplify a bit kdump get_next_page()
2341a94d3a dump: Rename write_elf*_phdr_note to prepare_elf*_phdr_note
670e76998a dump: Split elf header functions into prepare and write
c370d5300f dump: Rework dump_calculate_size function
dddf725f70 dump: Rework filter area variables
0c2994ac90 dump: Rework get_start_block
1e8113032f dump: Refactor dump_iterate and introduce dump_filter_memblock_*()
afae6056ea dump: Rename write_elf_loads to write_elf_phdr_loads
c68124738b dump: Consolidate elf note function
5ff2e5a3e1 dump: Cleanup dump_begin write functions
bc7d558017 dump: Consolidate phdr note writes
05bbaa5040 dump: Introduce dump_is_64bit() helper function
e71d353360 dump: Add more offset variables
344107e07b dump: Remove the section if when calculating the memory offset
862a395858 dump: Introduce shdr_num to decrease complexity
046bc4160b dump: Remove the sh_info variable
86a518bba4 dump: Use ERRP_GUARD()

Mitchell Dzurick (mitchdz) wrote :

I just tried applying those patches. I was able to apply the following on top of jammy-devel:

113d8f4e95 s390x: pv: Add dump support
1af0006ab9 dump: Replace opaque DumpState pointer with a typed one
9b72224f44 dump: Add architecture section and section string table support
13fd417ddc dump: Reintroduce memory_offset and section_offset
cb415fd61e dump: Write ELF section headers right after ELF header
e41ed29bce dump: Use a buffer for ELF section data and headers
94d788408d dump: fix kdump to work over non-aligned blocks
08df343874 dump: simplify a bit kdump get_next_page()
2341a94d3a dump: Rename write_elf*_phdr_note to prepare_elf*_phdr_note
670e76998a dump: Split elf header functions into prepare and write
c370d5300f dump: Rework dump_calculate_size function
dddf725f70 dump: Rework filter area variables
0c2994ac90 dump: Rework get_start_block
1e8113032f dump: Refactor dump_iterate and introduce dump_filter_memblock_*()
afae6056ea dump: Rename write_elf_loads to write_elf_phdr_loads
c68124738b dump: Consolidate elf note function
5ff2e5a3e1 dump: Cleanup dump_begin write functions
bc7d558017 dump: Consolidate phdr note writes
05bbaa5040 dump: Introduce dump_is_64bit() helper function
e71d353360 dump: Add more offset variables
344107e07b dump: Remove the section if when calculating the memory offset
862a395858 dump: Introduce shdr_num to decrease complexity
046bc4160b dump: Remove the sh_info variable
86a518bba4 dump: Use ERRP_GUARD()

However,

ad3b2e693d s390x: Add protected dump cap

fails to apply. Were you able to apply all of those patches without issue?

I see ad3b2e693d conflicts with the file `target/s390x/kvm/kvm.c`
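
For reference, the difference between the two commit lists in this thread can be worked out mechanically. The sketch below uses the abbreviated hashes exactly as posted; it assumes the requested list is in `git log` order (newest first), so the reverse order is what one would feed to `git cherry-pick`. The variable names are illustrative only.

```python
# Commits requested for the backport, as posted (assumed newest first).
REQUESTED = """
d12a91e0ba e72629e514 816644b121 4376a770c7 eb60026120 ad3b2e693d
113d8f4e95 1af0006ab9 9b72224f44 13fd417ddc cb415fd61e e41ed29bce
94d788408d 08df343874 2341a94d3a 670e76998a c370d5300f dddf725f70
0c2994ac90 1e8113032f afae6056ea c68124738b 5ff2e5a3e1 bc7d558017
05bbaa5040 e71d353360 344107e07b 862a395858 046bc4160b 86a518bba4
""".split()

# Commits reported as applying on top of jammy-devel.
APPLIED = set("""
113d8f4e95 1af0006ab9 9b72224f44 13fd417ddc cb415fd61e e41ed29bce
94d788408d 08df343874 2341a94d3a 670e76998a c370d5300f dddf725f70
0c2994ac90 1e8113032f afae6056ea c68124738b 5ff2e5a3e1 bc7d558017
05bbaa5040 e71d353360 344107e07b 862a395858 046bc4160b 86a518bba4
""".split())

# Requested commits that are not in the applied set, in posted order.
not_applied = [c for c in REQUESTED if c not in APPLIED]

# Oldest-first order, suitable for applying one commit after another.
apply_order = list(reversed(REQUESTED))

print(len(REQUESTED), len(APPLIED))
print(not_applied)
print(apply_order[0])
```

Six commits remain: ad3b2e693d, which conflicts, and the five listed above it in the request, which the reply does not mention.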

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-08-19 02:50 EDT-------
Commit ad3b2e693d applies cleanly for me on the following base commit + quilt patches
8a4bd971eb (tag: import/1%6.2+dfsg-2ubuntu6.22)

Does the git cherry-pick fail and if yes, where?
That's one of the simplest patches, so I'm surprised it would conflict.

Sergio Durigan Junior (sergiodj) wrote :

Hi Frank,

Apologies for taking a while to reply; I've been very busy adjusting QEMU on Oracular, since we're on Feature Freeze right now.

Anyway, I will be taking another look at this bug first thing tomorrow morning and will report back.

Thanks for providing the list of patches.

Changed in qemu (Ubuntu Jammy):
status: New → Triaged
assignee: nobody → Sergio Durigan Junior (sergiodj)
tags: added: server-todo
Sergio Durigan Junior (sergiodj) wrote :

Hi again,

I've backported and applied all patches on top of Jammy's QEMU. The package (version 1:6.2+dfsg-2ubuntu6.23~ppa1) is now building here:

https://launchpad.net/~sergiodj/+archive/ubuntu/qemu/+packages

Assuming that the build passes, could you please give it a try and see if it works?

Thanks.

Sergio Durigan Junior (sergiodj) wrote :

Spoke too soon. There's a build failure:

../../target/s390x/arch_dump.c: In function ‘s390x_write_elf64_pv’:
../../target/s390x/arch_dump.c:191:36: error: ‘NT_S390_PV_CPU_DATA’ undeclared (first use in this function)
  191 | note->hdr.n_type = cpu_to_be32(NT_S390_PV_CPU_DATA);
      | ^~~~~~~~~~~~~~~~~~~
../../target/s390x/arch_dump.c:191:36: note: each undeclared identifier is reported only once for each function it appears in
../../target/s390x/arch_dump.c:195:5: warning: implicit declaration of function ‘kvm_s390_dump_cpu’; did you mean ‘kvm_s390_mem_op’? [-Wimplicit-function-declaration]
  195 | kvm_s390_dump_cpu(cpu, &note->contents.dynamic);
      | ^~~~~~~~~~~~~~~~~
      | kvm_s390_mem_op
../../target/s390x/arch_dump.c:195:5: warning: nested extern declaration of ‘kvm_s390_dump_cpu’ [-Wnested-externs]
../../target/s390x/arch_dump.c: At top level:
../../target/s390x/arch_dump.c:220:9: error: ‘kvm_s390_pv_dmp_get_size_cpu’ undeclared here (not in a function)
  220 | {0, kvm_s390_pv_dmp_get_size_cpu, s390x_write_elf64_pv, true},
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

It looks like we'll need to backport more patches? I'll look into it.
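
As an aside, the identifiers that the backport still lacks can be pulled out of a build log like the one above with a small script. This is a sketch only: the regular expression is tailored to the two gcc message forms seen here, and the helper name is made up for the example.

```python
import re

# Excerpt of the gcc output quoted above.
GCC_OUTPUT = """\
../../target/s390x/arch_dump.c:191:36: error: ‘NT_S390_PV_CPU_DATA’ undeclared (first use in this function)
../../target/s390x/arch_dump.c:195:5: warning: implicit declaration of function ‘kvm_s390_dump_cpu’; did you mean ‘kvm_s390_mem_op’? [-Wimplicit-function-declaration]
../../target/s390x/arch_dump.c:220:9: error: ‘kvm_s390_pv_dmp_get_size_cpu’ undeclared here (not in a function)
"""

# Matches "'name' undeclared" and "implicit declaration of function 'name'".
PATTERN = re.compile(
    r"‘(\w+)’ undeclared|implicit declaration of function ‘(\w+)’"
)


def missing_symbols(log: str) -> list[str]:
    """Return undeclared identifiers in order of first appearance."""
    found = []
    for match in PATTERN.finditer(log):
        name = match.group(1) or match.group(2)
        if name not in found:
            found.append(name)
    return found


print(missing_symbols(GCC_OUTPUT))
```

Each name can then be searched for in the upstream history, for example with `git log -S<name>`, to find the commit that introduces it.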

Sergio Durigan Junior (sergiodj) wrote :

5433669c7a1 include/elf.h: add s390x note types

and

03d83ecfae4 s390x: Introduce PV query interface

seem to be missing.

Sergio Durigan Junior (sergiodj) wrote :

753ca06f470 s390x: Add KVM PV dump interface

is also missing.

Sergio Durigan Junior (sergiodj) wrote :

@Frank,

OK, I think I found all the missing commits and backported them. Could you please give the package a try? The version now is 1:6.2+dfsg-2ubuntu6.23~ppa3.

Thanks.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-08-22 04:09 EDT-------
Hey Sergio, I was unavailable the last two days, but I'll test your package today.
Sorry that I missed those commits and caused more work.

On a side note: My first name is Janosch and it makes sense to use it when mentioning me, since fheimes is also around.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-08-22 09:50 EDT-------
The provided package works without a problem and I've verified that the guest keeps on running when dumping to a file system that's smaller than the dump size.

I've used the patches that I've attached to the kernel bug to add the feature to KVM and built s390-tools from the latest git since there's no backport yet for 22.04.

Sergio Durigan Junior (sergiodj) wrote :

Hi Janosch :-),

Thanks for the tests. I'll talk to Frank and proceed with the SRU process.

Frank Heimes (fheimes)
description: updated
Frank Heimes (fheimes)
Changed in qemu (Ubuntu Jammy):
status: Triaged → In Progress
Frank Heimes (fheimes)
description: updated
description: updated
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-09-09 10:11 EDT-------
Verified with 5.15.0-118-generic #128~lp1959940-Ubuntu and QEMU 1:6.2+dfsg-2ubuntu6.23~ppa4
Also tested the pause fix.

Sergio Durigan Junior (sergiodj) wrote :

Hi Janosch,

Thanks for the verification.

I talked to Frank and decided to proceed with the upload now. This will allow the SRU team to "digest" the changes until the kernel changes are SRU'ed.

Thanks.
