[UBUNTU 20.04] rcu stalls with many storage key guests
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | Medium | Skipper Bug Screeners |
linux (Ubuntu) | Invalid | Medium | Skipper Bug Screeners |
Focal | Fix Released | Medium | Canonical Kernel Team |
Impish | Won't Fix | Medium | Canonical Kernel Team |
Jammy | Fix Released | Medium | Canonical Kernel Team |
Bug Description
SRU Justification:
==================
[Impact]
* Ubuntu on s390x KVM environments with lots of large guests with storage
keys can be affected by rcu stalls.
* These rcu stalls can cause the system to crash/dump.
[Fix]
* 3ae11dbcfac9 3ae11dbcfac906a
* 6d5946274df1 6d5946274df1fff
[Test Plan]
* The problem cannot be triggered or re-created directly, but...
* an IBM z13 or LinuxONE (or newer) LPAR is needed that
runs Ubuntu Server 20.04 LTS or 18.04 LTS with HWE kernel
and acts as KVM host, again with several large guests running
on top with storage keys.
* Let such a system run for days under significant load
and watch the logs for rcu issues.
* Prior to the submission of this SRU, patched test kernels
for focal 5.4 and bionic hwe-5.4 were created and tested.
They ran for days in a staging environment at IBM
without further issues.
* The modifications are all limited to s390x.
* A test kernel was built (see below) that ran in a test environment
at IBM under appropriate load for several days.
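The "watch the logs for rcu issues" step could be automated with a small filter like the sketch below. The function name `find_rcu_stalls` and the regular expression are assumptions for illustration, derived only from the stall message quoted in this report; other kernel versions may word the message differently.

```python
import re
from typing import Iterable, List

# Assumed pattern: matches e.g. "rcu: INFO: rcu_sched self-detected stall on CPU"
# and the plural form "rcu: INFO: rcu_sched detected stalls on CPUs/tasks:".
STALL_RE = re.compile(r"rcu: INFO: \S+ (?:self-)?detected stalls?")


def find_rcu_stalls(lines: Iterable[str]) -> List[str]:
    """Return every kernel log line that announces an RCU stall."""
    return [line.rstrip("\n") for line in lines if STALL_RE.search(line)]
```

It accepts any iterable of log lines, for example an open file object for a saved kernel log or the captured output of `journalctl -k`.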
[Where problems could occur]
* The KVM switch to keyed guests now uses the non-quiescing sske
instead of the classic sske, so
the KVM behaviour might have changed and the storage keys might be harmed.
* The now more generous scheduling while setting keys
has an impact on the guest memory management and mapping,
which will lead to different performance characteristics.
* This, with the introduction of __s390_
cond_resched, might increase the overhead in certain situations,
but it should improve responsiveness over time
and hence avoid rcu stalls.
[Other Info]
* Since the patches are upstream in 5.19-rc1,
they will be included in the kernel that is planned for kinetic (5.19).
* Hence this is an SRU to jammy, impish and focal.
__________
---Problem Description---
There can be rcu stalls when running lots of large guests with storage keys:
[1377614.579833] rcu: INFO: rcu_sched self-detected stall on CPU
[1377614.579845] rcu: 18-....: (2099 ticks this GP) idle=54e/
[1377614.579895] (t=2100 jiffies g=155867385 q=20879)
[1377614.579898] Task dump for CPU 18:
[1377614.579899] CPU 1/KVM R running task 0 1030947 256019 0x06000004
[1377614.579902] Call Trace:
[1377614.579912] ([<0000001f1f4b
[1377614.579918] [<0000001f1ec8e
[1377614.579919] [<0000001f1f4b7
[1377614.579924] [<0000001f1ecdd
[1377614.579926] [<0000001f1eceb
[1377614.579931] [<0000001f1ecfc
[1377614.579932] [<0000001f1ecfd
[1377614.579933] [<0000001f1ecec
[1377614.579935] [<0000001f1ecec
[1377614.579938] [<0000001f1ebec
[1377614.579942] [<0000001f1f4c6
[1377614.579945] [<0000001f1ec0a
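For triage across many hosts, the numeric fields of a stall report like the one above can be pulled out mechanically. This is a minimal sketch: `parse_stall_details` and both regular expressions are hypothetical and modelled only on the two header lines of this particular trace, and the `g` and `q` keys simply reuse the labels printed by the kernel.

```python
import re
from typing import Dict, Iterable

# Assumed formats, taken from the excerpt in this report:
#   "rcu: 18-....: (2099 ticks this GP) idle=54e/"
#   "(t=2100 jiffies g=155867385 q=20879)"
CPU_RE = re.compile(r"rcu:\s+(\d+)-\S*: \((\d+) ticks this GP\)")
SUMMARY_RE = re.compile(r"\(t=(\d+) jiffies g=(-?\d+) q=(\d+)\)")


def parse_stall_details(lines: Iterable[str]) -> Dict[str, int]:
    """Collect the CPU number, tick count and t/g/q values of a stall report."""
    details: Dict[str, int] = {}
    for line in lines:
        cpu_match = CPU_RE.search(line)
        if cpu_match:
            details["cpu"] = int(cpu_match.group(1))
            details["ticks"] = int(cpu_match.group(2))
        summary_match = SUMMARY_RE.search(line)
        if summary_match:
            details["t_jiffies"] = int(summary_match.group(1))
            details["g"] = int(summary_match.group(2))
            details["q"] = int(summary_match.group(3))
    return details
```

Fields that do not appear in the input are simply left out of the returned dictionary, so a truncated report yields a partial result instead of an error.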
Contact Information = <email address hidden>
---uname output---
RELEASE: 5.4.0-90-generic
VERSION: #101-Ubuntu SMP Fri Oct 15 19:59:45 UTC 2021
== Comment: #1 - Christian Borntraeger <email address hidden> - 2022-05-24 03:59:37 ==
This is a test patch that might address the rcu stalls.
== Comment: #2 - Christian Borntraeger <email address hidden> - 2022-05-24 04:00:22 ==
This is a 2nd patch that reduces the cost of key setting.
Changed in ubuntu-z-systems:
importance: Undecided → Medium
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in linux (Ubuntu):
status: In Progress → Invalid
Changed in linux (Ubuntu Focal):
status: New → In Progress
Changed in linux (Ubuntu Impish):
status: New → In Progress
Changed in linux (Ubuntu Jammy):
status: New → In Progress
description: updated
Changed in linux (Ubuntu Focal):
importance: Undecided → Medium
Changed in linux (Ubuntu Impish):
importance: Undecided → Medium
Changed in linux (Ubuntu Jammy):
importance: Undecided → Medium
Changed in linux (Ubuntu Focal):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Impish):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Jammy):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
tags: added: targetmilestone-inin2004 removed: targetmilestone-inin---
tags: added: verification-done-focal verification-done-jammy removed: verification-needed-focal verification-needed-jammy