[UBUNTU 20.04.1] qemu (secure guest) crash due to gup_fast / dynamic page table folding issue

Bug #1896726 reported by bugproxy
This bug affects 1 person
Affects | Importance | Assigned to
Ubuntu on IBM z Systems | Critical | Skipper Bug Screeners
linux (Ubuntu) | Undecided | Frank Heimes
Focal | Undecided | Frank Heimes
Groovy | Undecided | Frank Heimes

Bug Description

Justification:
==============

Secure KVM guest (using secure execution on Ubuntu Server 20.04 for s390x)
crashes happen from time to time during boot.
Such crashed guests ("reason=crashed" in the libvirt log) end up in Shutoff state instead of Crashed state (<on_crash>preserve is set).
The crash points to a kernel memory management problem, addressed by the following patch/fix.
The modifications touch common memory management code,
but they will have no effect on architectures other than s390x.
This is ensured by the fact that only s390 provides / implements the new helper functions.
And for s390x, this is actually a critical (and carefully tested) fix for a (previous) regression, so it can hardly get any more regressive.
The patch landed upstream in linux-next, is discussed in depth
at LKML https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
and here https://lore.kernel.org/linux-arch<email address hidden>/
and will soon land via the regular upstream stable release update for kernel 5.4 in focal, too.
The process already started:
https://lore.kernel.org/stable/patch-1.thread<email address hidden>/

Hence this cherry-pick from the upstream patch should be added to groovy
to avoid any potential regression in case the patch lands in focal via the upstream release update process
but is not in groovy, and someone upgrades from focal to groovy.

__________

Secure Execution with Ubuntu 20.04: secure guests crash during boot from time to time. The crashed guest went into Shutoff state instead of Crashed state (<on_crash>preserve is set), so I can't get a dump.

libvirt log file:
2020-04-21T16:35:39.382999Z qemu-system-s390x: Guest says index 19608 is available
2020-04-21 16:35:44.831+0000: shutting down, reason=crashed

---uname output---
Linux ubu204uclg1002 5.4.0-25-generic #29-Ubuntu SMP Fri Apr 17 15:05:32 UTC 2020 s390x s390x s390x GNU/Linux

Machine Type = z15 8561

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 I have a setup with 72 KVM guests which I can start in secure or non-secure mode. Starting all of them in secure mode back to back results in a number of guests (4..8) in Shutoff state and reason=crashed in the libvirt log. I can manually start the guest again.... no problem. Different guests are failing.
Host and guests are on latest Ubuntu 20.04.
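
The check described above (finding the guests that ended up with reason=crashed) can be scripted. The following is a minimal sketch and not part of the original report: the function name list_crashed_guests is hypothetical, and it assumes that the per-guest libvirt logs are named <guest>.log and live in /var/log/libvirt/qemu unless another directory is passed.

```shell
#!/bin/sh
# Print the name of every guest whose libvirt log records a crash.
# Assumption: per-guest logs are named <guest>.log in the given directory
# (/var/log/libvirt/qemu is used as the default; adjust if yours differs).
list_crashed_guests() {
    log_dir="${1:-/var/log/libvirt/qemu}"
    # grep -l prints only the names of files that contain the marker string
    grep -l 'shutting down, reason=crashed' "$log_dir"/*.log 2>/dev/null |
    while IFS= read -r log_file; do
        # Strip the directory and the .log suffix to get the guest name
        basename "$log_file" .log
    done
}
```

Piping the output through `wc -l` after a start-up run gives the number of affected guests. The logs may also contain entries from earlier runs, so a guest that crashed in the past and booted fine this time would still be listed.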

The supposed fix (kernel memory management) has landed in Andrew Morton's mm tree
https://lore.kernel.org/mm-commits/20200916003608.ib4Ln%<email address hidden>/T/#u

Please note: while this was found with secure execution, the bug is actually present for non-KVM workloads as well.

The complete patch is this:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=a338e69ba37286c0fc300ab7e6fa0227e6ca68b1

Revision history for this message
bugproxy (bugproxy) wrote : example log file of a crashed secure guest

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-185431 severity-medium targetmilestone-inin2004
Revision history for this message
bugproxy (bugproxy) wrote : removed scsi definition

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : resulting log

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : qemu core dump

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : info output with tracebacks

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : qemu coredump

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : info file

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : qemu coredump while running workload on Secure guest

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : core dump at g_main_context_check: different case ?

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote :

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : Coredump

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : Look for suspicious buffers

Default Comment by Bridge

Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
summary: - Secure guest (qemu) crash during boot (mostly) but also while running
- workload (rare) (secure execution)
+ [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also
+ while running workload (rare) (secure execution)
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → Medium
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in linux (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Frank Heimes (fheimes)
Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also while running workload (rare) (secure execution)

The patch/commit mentioned landed in linux-next (just tagged with next-20200921).

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-09-23 04:54 EDT-------
This problem also exists with the current 5.4.0.48 kernel!

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-23 05:27 EDT-------
Severity adapted to critical. Please be aware of that. Many thx

tags: added: severity-critical
removed: severity-medium
Frank Heimes (fheimes)
Changed in linux (Ubuntu Focal):
assignee: nobody → Frank Heimes (fheimes)
Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also while running workload (rare) (secure execution)

While working on this bug a patched kernel was created and made available here:
https://people.canonical.com/~fheimes/lp1896726/
But it's a patched groovy kernel only, since the commit/patch doesn't cleanly apply to focal master-next:

git apply --stat ~/0001-mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch
 arch/s390/include/asm/pgtable.h | 42 ++++++++++++++++++++++++++++-----------
 include/linux/pgtable.h | 10 +++++++++
 mm/gup.c | 18 ++++++++---------
 3 files changed, 49 insertions(+), 21 deletions(-)

git apply --check ~/0001-mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch
error: patch failed: arch/s390/include/asm/pgtable.h:1260
error: arch/s390/include/asm/pgtable.h: patch does not apply
error: include/linux/pgtable.h: No such file or directory

Please always check whether a patch/commit applies cleanly to the master-next tree of the target Ubuntu kernel releases. If not, double-check whether any further commits are needed or whether even a backport is required.
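
The check requested above can be repeated for every target tree in one go. This is a minimal sketch, assuming local checkouts of the target trees; the function name patch_applies and the checkout paths in the usage comment are hypothetical. It only wraps the git apply --check call already shown in this comment.

```shell
#!/bin/sh
# Report whether a patch applies cleanly to a kernel tree without modifying it.
# Usage: patch_applies <tree-dir> <absolute-path-to-patch>
patch_applies() {
    tree="$1"
    patch="$2"
    # --check only tests applicability; the working tree is left untouched
    if git -C "$tree" apply --check "$patch" 2>/dev/null; then
        echo "applies: $tree"
    else
        echo "needs further commits or a backport: $tree"
        return 1
    fi
}

# Example (hypothetical checkout paths):
#   for tree in "$HOME/focal" "$HOME/groovy"; do
#       patch_applies "$tree" "$HOME/0001-mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch"
#   done
```

The patch path should be absolute, because git -C changes into the tree before the path is resolved.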

By the way, there is a significant regression risk associated with this patch, since it touches general memory management (page table handling and address translation) and gup code. This includes common code, where special care is needed, since it may potentially affect millions of installations across all supported platforms.

Hence my recommendation is to bring this in with the help of an upstream stable release update (https://www.kernel.org/doc/html/v5.4/process/stable-kernel-rules.html) to upstream kernel 5.4, which will then be picked up (more or less) automatically by the Canonical kernel team for focal.

Changed in linux (Ubuntu Groovy):
status: New → Triaged
Changed in linux (Ubuntu Focal):
status: New → Incomplete
Changed in ubuntu-z-systems:
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-09-23 15:37 EDT-------
(In reply to comment #161)
> While working on this bug a patched kernel was created and made available
> here:
> https://people.canonical.com/~fheimes/lp1896726/
> But it's a patched groovy kernel only, since the commit/patch doesn't
> cleanly apply to focal master-next:
>
> git apply --stat
> ~/0001-mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch
> arch/s390/include/asm/pgtable.h | 42
> ++++++++++++++++++++++++++++-----------
> include/linux/pgtable.h | 10 +++++++++
> mm/gup.c | 18 ++++++++---------
> 3 files changed, 49 insertions(+), 21 deletions(-)
>
> git apply --check
> ~/0001-mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch
> error: patch failed: arch/s390/include/asm/pgtable.h:1260
> error: arch/s390/include/asm/pgtable.h: patch does not apply
> error: include/linux/pgtable.h: No such file or directory
>
> Please always check if a patch/commit applies cleanly to the master-next
> tree of the target Ubuntu kernel releases. If not double check if any
> further commits are needed or if a even a backport is required.
>
> There is btw. a significant regression risk associated to this patch, since
> it touches general memory management (page table handling and address
> translation) and gup code - and this includes common code, where special
> care taking is needed, since it may potentially affect millions of
> installations across all supported platforms.
>
> Hence my recommendation is to bring this in with the help of an upstream
> stable release update
> (https://www.kernel.org/doc/html/v5.4/process/stable-kernel-rules.html) to
> upstream kernel 5.4 - which will then be (more or less) automatically be
> picked up by the Canonical kernel team for focal.

This conflict was already noticed by the stable maintainer, see https://<email address hidden>/

An adjusted version of the patch was sent for v5.4 stable, see https://lore.kernel.org/stable/patch-1.thread<email address hidden>/

Therefore, our expectation is that this will now go into v5.4 stable via the normal process. Of course, you could also take the adjusted version from the link above.

Regarding the regression risk, while the patch does touch scary common memory management code, it will have absolutely no effect on other architectures than s390. This is ensured by the fact that only s390 provides / implements the new helper functions. And for s390, this actually is a critical (and carefully tested) fix for a (previous) regression, so it can hardly get any more regressive...

Frank Heimes (fheimes)
summary: - [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also
- while running workload (rare) (secure execution)
+ [UBUNTU 20.04.1] qemu (secure guest) crash due to gup_fast / dynamic
+ page table folding issue
Revision history for this message
Frank Heimes (fheimes) wrote :

Hi Gerald, I wasn't aware that you already started to work on/with upstream stable - that's great!

I had a look at the backport at https://lore.kernel.org/stable/patch-1.thread<email address hidden>/ and it applied cleanly on current focal master-next.
So I've built a patched focal kernel - in addition to the above groovy kernel - and share it here as well for any further testing: https://people.canonical.com/~fheimes/lp1896726/

I just sent a patch request for groovy based on a cherry-pick from upstream:
https://lists.ubuntu.com/archives/kernel-team/2020-September/thread.html#113731
hence changing status for groovy to 'In Progress'.

The patch must land in groovy too, to avoid any potential regression once it has landed in focal but is not in groovy and someone upgrades from focal to groovy...

I'll keep an eye on the upstream stable release process and try to keep this bug in sync and updated, based on the upstream stable bug that will eventually be opened by the kernel team...

I'll add the summary that I've added to the patch request for further reference to the bug description here.

description: updated
Changed in linux (Ubuntu Groovy):
status: Triaged → In Progress
Changed in ubuntu-z-systems:
status: Incomplete → In Progress
importance: Medium → Critical
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-28 01:44 EDT-------
Fix is upstream as

commit d3f7b1bb204099f2f7306318896223e8599bb6a2
Author: Vasily Gorbik <email address hidden>
AuthorDate: Fri Sep 25 21:19:10 2020 -0700
Commit: Linus Torvalds <email address hidden>
CommitDate: Sat Sep 26 10:33:57 2020 -0700

mm/gup: fix gup_fast with dynamic page table folding

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d3f7b1bb204099f2f7306318896223e8599bb6a2

Revision history for this message
Frank Heimes (fheimes) wrote :

I took this from "linux-next" (where it was tagged with 'next-20200923') as:
~/linux-next$ git show a02b55ea66b9
commit a02b55ea66b9257744528da609a26279152a3bc3
Author: Vasily Gorbik <email address hidden>
Date: Wed Sep 23 09:49:28 2020 +1000

    mm/gup: fix gup_fast with dynamic page table folding

    Currently to make sure that every page table entry is read just once
    gup_fast walks perform READ_ONCE and pass pXd value down to the next
    gup_pXd_range function by value e.g.:

    static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                             unsigned int flags, struct page **pages, int *nr)
    ...

and built a patched groovy and a patched focal kernel, available here:
https://people.canonical.com/~fheimes/lp1896726/

Do you have a chance to give these a try?

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-29 00:26 EDT-------
(In reply to comment #163)
> Hi Gerald, I wasn't aware that you already started to work on/with upstream
> stable - that's great!
>
> I had a look at the backport at
> https://lore.kernel.org/stable/patch-1.thread-41918b.git-41918be365c0.your-
> <email address hidden>/ and it applied cleanly on
> current focal master-next.
> So I've built a patched focal kernel - in addition to the above groovy
> kernel - and share it here as well for any further testing:
> https://people.canonical.com/~fheimes/lp1896726/
>

I've tested the provided focal kernel. It passes my tests.

Revision history for this message
Frank Heimes (fheimes) wrote :

Thx for testing, that definitely gives even more confidence ...

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-29 03:27 EDT-------
(In reply to comment #166)
> I took this from "linux-next" (where it was tagged with 'next-20200923') as:
> ~/linux-next$ git show a02b55ea66b9
> commit a02b55ea66b9257744528da609a26279152a3bc3
> Author: Vasily Gorbik <email address hidden>
> Date: Wed Sep 23 09:49:28 2020 +1000
>
> mm/gup: fix gup_fast with dynamic page table folding
>
> Currently to make sure that every page table entry is read just once
> gup_fast walks perform READ_ONCE and pass pXd value down to the next
> gup_pXd_range function by value e.g.:
>
> static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
> unsigned int flags, struct page **pages, int *nr)
> ...
>
> and built a patched groovy and a patched focal kernel, available here:
> https://people.canonical.com/~fheimes/lp1896726/
>
> Do you have a chance giving these a try?

I'll run a test with my MongoDB setup.
Can you provide the debug symbol package for the kernel as well?

Revision history for this message
Frank Heimes (fheimes) wrote :
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-29 08:58 EDT-------
With kernel ubu204 5.4.0-49-generic from Frank Heimes/Canonical:
ran an 8-guest and a 16-guest (secure) scenario with MongoDB: same performance, no qemu crash, no wiredtiger crash. Looks good to me.

Revision history for this message
Frank Heimes (fheimes) wrote :

Okay - very good - many thx, Klaus!

Frank Heimes (fheimes)
Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Committed
Revision history for this message
Frank Heimes (fheimes) wrote :

For the sake of completeness, this is the link to the patch request for groovy:
https://lists.ubuntu.com/archives/kernel-team/2020-September/thread.html#113731
It has already been applied to groovy master-next.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.8.0-21.22

---------------
linux (5.8.0-21.22) groovy; urgency=medium

  * groovy/linux: 5.8.0-21.22 -proposed tracker (LP: #1898150)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * Fix broken e1000e device after S3 (LP: #1897755)
    - SAUCE: e1000e: Increase polling timeout on MDIC ready bit

  * EFA: add support for 0xefa1 devices (LP: #1896791)
    - RDMA/efa: Expose maximum TX doorbell batch
    - RDMA/efa: Expose minimum SQ size
    - RDMA/efa: User/kernel compatibility handshake mechanism
    - RDMA/efa: Add EFA 0xefa1 PCI ID

  * Groovy update: v5.8.13 upstream stable release (LP: #1898076)
    - device_cgroup: Fix RCU list debugging warning
    - ASoC: pcm3168a: ignore 0 Hz settings
    - ASoC: wm8994: Skip setting of the WM8994_MICBIAS register for WM1811
    - ASoC: wm8994: Ensure the device is resumed in wm89xx_mic_detect functions
    - ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN Converter9 2-in-1
    - clk: versatile: Add of_node_put() before return statement
    - RISC-V: Take text_mutex in ftrace_init_nop()
    - i2c: aspeed: Mask IRQ status to relevant bits
    - s390/init: add missing __init annotations
    - lockdep: fix order in trace_hardirqs_off_caller()
    - EDAC/ghes: Check whether the driver is on the safe list correctly
    - drm/amdkfd: fix a memory leak issue
    - drm/amd/display: Don't use DRM_ERROR() for DTM add topology
    - drm/amd/display: update nv1x stutter latencies
    - drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC is
    - drm/amd/display: Don't log hdcp module warnings in dmesg
    - objtool: Fix noreturn detection for ignored functions
    - i2c: mediatek: Send i2c master code at more than 1MHz
    - riscv: Fix Kendryte K210 device tree
    - ieee802154: fix one possible memleak in ca8210_dev_com_init
    - ieee802154/adf7242: check status of adf7242_read_reg
    - clocksource/drivers/h8300_timer8: Fix wrong return value in
      h8300_8timer_init()
    - batman-adv: bla: fix type misuse for backbone_gw hash indexing
    - libbpf: Fix build failure from uninitialized variable warning
    - atm: eni: fix the missed pci_disable_device() for eni_init_one()
    - batman-adv: mcast/TT: fix wrongly dropped or rerouted packets
    - netfilter: ctnetlink: add a range check for l3/l4 protonum
    - netfilter: ctnetlink: fix mark based dump filtering regression
    - netfilter: conntrack: nf_conncount_init is failing with IPv6 disabled
    - netfilter: nft_meta: use socket user_ns to retrieve skuid and skgid
    - mac802154: tx: fix use-after-free
    - bpf: Fix clobbering of r2 in bpf_gen_ld_abs
    - tools/libbpf: Avoid counting local symbols in ABI check
    - drm/vc4/vc4_hdmi: fill ASoC card owner
    - net: qed: Disable aRFS for NPAR and 100G
    - net: qede: Disable aRFS for NPAR and 100G
    - net: qed: RDMA personality shouldn't fail VF load
    - igc: Fix wrong timestamp latency numbers
    - igc: Fix not considering the TX delay for timestamps
    - drm/sun4i: sun8i-csc: Secondary CSC register correction
    - hv_netvsc: Switch the data path at the right time during hibernation
    - spi: spi-fsl-dspi:...

Changed in linux (Ubuntu Groovy):
status: Fix Committed → Fix Released
Revision history for this message
Frank Heimes (fheimes) wrote :

Since the patch is now coming to focal via upstream stable (https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.4.69),
I'm watching out for an LP ticket "Focal update: v5.4.69 upstream stable release"...
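
Whether a given focal kernel is already based on an upstream stable release that contains the fix can be checked with a simple version comparison. This is a rough sketch: the helper name version_ge is hypothetical, it relies on the -V (version sort) option of GNU sort, and the usage comment assumes that the third field of /proc/version_signature holds the upstream base version, which should be verified on the system in question.

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    # After a version sort, the first line is the smaller of the two versions
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (assumption: third field of /proc/version_signature is the
# upstream base version of the running Ubuntu kernel):
#   base=$(awk '{print $3}' /proc/version_signature)
#   version_ge "$base" 5.4.69 && echo "based on v5.4.69 or later"
```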

Changed in linux (Ubuntu Focal):
status: Incomplete → Triaged
Revision history for this message
Frank Heimes (fheimes) wrote :

The focal part of this ticket is addressed by:
"Focal update: v5.4.69 upstream stable release"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1900624
And LP 1900624 is Fix Committed and the patches are already applied to focal master-next.
Hence aligning the status of this ticket to LP 1900624 and updating it to Fix Committed.

Changed in linux (Ubuntu Focal):
status: Triaged → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Revision history for this message
Frank Heimes (fheimes) wrote :

The bug "Focal update: v5.4.69 upstream stable release"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1900624
was just updated to Fix Released, hence aligning the status of the focal entry here
and with that closing this ticket as Fix Released.

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-12-01 04:29 EDT-------
IBM Bugzilla status->closed, Fix Released by all requested Distros
