[UBUNTU 20.04.1] qemu (secure guest) crash due to gup_fast / dynamic page table folding issue
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | Critical | Skipper Bug Screeners |
linux (Ubuntu) | Fix Released | Undecided | Frank Heimes |
Focal | Fix Released | Undecided | Frank Heimes |
Groovy | Fix Released | Undecided | Frank Heimes |

Bug Description
Justification:
==============
Secure KVM guests (using secure execution on Ubuntu Server 20.04 for s390x)
crash from time to time during boot.
Such crashed guests ("reason=crashed" in the libvirt log) end up in Shutoff state instead of Crashed state (<on_crash> preserve is set).
The crash points to a kernel memory management problem, addressed by the following patch/fix.
The modifications touch common memory management code,
but they have no effect on architectures other than s390x.
This is ensured by the fact that only s390 provides / implements the new helper functions.
And for s390x, this is actually a critical (and carefully tested) fix for a (previous) regression, so it can hardly get any more regressive.
The patch landed upstream in linux-next, is discussed in depth
at LKML https:/
and here https:/
and will soon land via the regular upstream stable release update for kernel 5.4 in focal, too.
The process already started:
https:/
Hence this cherry-pick of the upstream patch should be added to groovy
to avoid a potential regression in case the patch lands in focal via the upstream release update process
but is not in groovy, and someone upgrades from focal to groovy.
__________
Secure Execution with Ubuntu 20.04: a secure guest crashes during boot from time to time. The crashed guest goes into Shutoff state instead of Crashed state (<on_crash>preserve is set), so I can't get a dump.
libvirt log file:
2020-04-
2020-04-21 16:35:44.831+0000: shutting down, reason=crashed
---uname output---
Linux ubu204uclg1002 5.4.0-25-generic #29-Ubuntu SMP Fri Apr 17 15:05:32 UTC 2020 s390x s390x s390x GNU/Linux
Machine Type = z15 8561
---Debugger---
A debugger is not configured
---Steps to Reproduce---
I have a setup with 72 KVM guests which I can start in secure or non-secure mode. Starting all of them in secure mode back to back results in a number of guests (4..8) in Shutoff state and reason=crashed in the libvirt log. I can manually start the guest again.... no problem. Different guests are failing.
Host and guests are on latest Ubuntu 20.04.
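When checking many guests after a back-to-back start, it can help to scan the libvirt log text for the shutdown line quoted above. The following is a minimal sketch, not part of the original report: the function name `find_shutdowns` and the regular expression are illustrative assumptions, modelled only on the single log line shown in this bug. Reading the per-guest log files is left to the caller.

```python
import re

# Assumption: shutdown lines look like the one quoted in this report, e.g.
#   2020-04-21 16:35:44.831+0000: shutting down, reason=crashed
SHUTDOWN_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\+\d{4}): "
    r"shutting down, reason=(?P<reason>\w+)"
)


def find_shutdowns(log_text, reason="crashed"):
    """Return the timestamps of shutdown lines that carry the given reason."""
    hits = []
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line.strip())
        if match and match.group("reason") == reason:
            hits.append(match.group("timestamp"))
    return hits


if __name__ == "__main__":
    sample = "2020-04-21 16:35:44.831+0000: shutting down, reason=crashed\n"
    for timestamp in find_shutdowns(sample):
        print("crashed shutdown at", timestamp)
```

Running this over each guest's log gives a quick list of which guests hit the crash in a given run, which is useful here because different guests fail each time.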
The supposed fix (kernel memory management) has landed in Andrew Morton's mm tree
https:/
Please note: while this was found with secure execution, the bug is actually present for non-KVM workloads as well.
The complete patch is this:
https:/
summary:
- Secure guest (qemu) crash during boot (mostly) but also while running workload (rare) (secure execution)
+ [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also while running workload (rare) (secure execution)
Changed in ubuntu-z-systems:
importance: Undecided → Medium
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in linux (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Frank Heimes (fheimes)
Changed in linux (Ubuntu Focal):
assignee: nobody → Frank Heimes (fheimes)
summary:
- [UBUNTU 20.04.1]Secure guest (qemu) crash during boot (mostly) but also while running workload (rare) (secure execution)
+ [UBUNTU 20.04.1] qemu (secure guest) crash due to gup_fast / dynamic page table folding issue
Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Committed
Default Comment by Bridge