[UBUNTU 20.04] PSI generates overhead on s390x

Bug #1876044 reported by bugproxy on 2020-04-30
This bug affects 1 person
Affects                   Status         Importance   Assigned to             Milestone
Ubuntu on IBM z Systems   Fix Released   Medium       Skipper Bug Screeners
linux (Ubuntu)            Fix Released   Undecided    Canonical Kernel Team
Focal                     Fix Released   Undecided    Unassigned

Bug Description

SRU Justification:
==================

[Impact]

* PSI is enabled by default for all architectures in Ubuntu.

* On s390x this leads to performance degradations on popular workloads like web serving (nginx).

[Fix]

* Leave 'CONFIG_PSI=y', but change 'CONFIG_PSI_DEFAULT_DISABLED=n' to 'CONFIG_PSI_DEFAULT_DISABLED=y'

[Test Case]

* Measure the overhead with 'CONFIG_PSI_DEFAULT_DISABLED=n' and with 'CONFIG_PSI_DEFAULT_DISABLED=y' in the same environment, using nginx.
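
To compare the two runs, the relative overhead can be computed from the two throughput figures. The following is only a sketch: the function name overhead_pct and the sample numbers are assumptions for illustration and are not measurements from this report.

```shell
#!/bin/sh
# overhead_pct BASELINE MEASURED
# Prints how much lower MEASURED is than BASELINE, in percent.
# BASELINE: throughput with PSI default-disabled; MEASURED: with PSI enabled.
overhead_pct() {
    LC_ALL=C awk -v base="$1" -v meas="$2" 'BEGIN {
        if (base <= 0) { exit 1 }          # avoid division by zero
        printf "%.2f\n", (base - meas) / base * 100
    }'
}

# Hypothetical requests/s values, chosen only to show the arithmetic.
overhead_pct 10000 9900
```

Any throughput unit works as long as both runs use the same one.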

[Regression Potential]

* The regression potential can be considered moderate, since PSI (pressure stall information tracking) is only used to collect metrics on CPU overcommitment, memory and IO pressure.

* It can be enabled again with the kernel argument 'psi=1'.
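
As a rough illustration of the kernel argument mentioned above, the sketch below inspects a kernel command line for a psi= setting. The helper name psi_from_cmdline and the sample command line are assumptions; the accepted spellings (1/0, y/n, on/off) follow the kernel's usual boolean parsing as far as I know.

```shell
#!/bin/sh
# psi_from_cmdline CMDLINE
# Prints "on", "off" or "default" according to the last psi= argument found.
psi_from_cmdline() {
    state=default
    for arg in $1; do                      # split the command line on whitespace
        case "$arg" in
            psi=1*|psi=[yY]*|psi=[oO][nN]*) state=on ;;
            psi=0*|psi=[nN]*|psi=[oO][fF]*) state=off ;;
        esac
    done
    echo "$state"
}

# Hypothetical command line; on a live system pass "$(cat /proc/cmdline)".
psi_from_cmdline "root=/dev/dasda1 psi=1"
```

"default" means the build-time setting (CONFIG_PSI_DEFAULT_DISABLED) decides. Whether tracking is actually active can then be checked by reading /proc/pressure/cpu, which should only succeed when PSI is enabled.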
__________

PSI is enabled by default in Ubuntu 20.04.
For a test system with 72 guests on 8 cores running an nginx workload, this created an overhead of ~1%.

Can we change this back to
CONFIG_PSI=y
CONFIG_PSI_DEFAULT_DISABLED=y

so that the overhead is absent by default, while PSI can still be enabled via a kernel parameter for debugging or when needed?

Maybe there has been a reason for this - so feel free to discuss.

---uname output---
Linux t35lp76 5.4.0-26-generic #30-Ubuntu SMP Mon Apr 20 16:57:22 UTC 2020 s390x s390x s390x GNU/Linux

Machine Type = All s390x architecture

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 root@t35lp76:/boot# grep PSI config-5.4.0-26-generic
CONFIG_PSI=y
# CONFIG_PSI_DEFAULT_DISABLED is not set
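
The grep output above can be interpreted mechanically. This is a minimal sketch, assuming the usual /boot/config-<release> layout; the helper name psi_config_state is made up for illustration.

```shell
#!/bin/sh
# psi_config_state CONFIG_FILE
# Classifies a kernel config file by its PSI build-time settings.
psi_config_state() {
    if ! grep -q '^CONFIG_PSI=y$' "$1"; then
        echo "not built"
    elif grep -q '^CONFIG_PSI_DEFAULT_DISABLED=y$' "$1"; then
        echo "built, off by default"
    else
        echo "built, on by default"
    fi
}

# Check the running kernel if its config file is readable.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    psi_config_state "$cfg"
fi
```

For the config shown in this report (CONFIG_PSI=y, CONFIG_PSI_DEFAULT_DISABLED not set) the helper reports "built, on by default".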

Stack trace output:
 no

Oops output:
 no

System Dump Info:
  The system is not configured to capture a system dump.

*Additional Instructions for epasch@de,ibm.com:
-Attach sysctl -a output to the bug.

bugproxy (bugproxy) on 2020-04-30
tags: added: architecture-s3903164 bugnameltc-185697 severity-medium targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes) on 2020-04-30
Changed in linux (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Canonical Kernel Team (canonical-kernel-team)
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
importance: Undecided → Medium
status: New → Triaged
Frank Heimes (fheimes) on 2020-05-05
summary: - [UBUNTU 20.04] Overhead introduced by PSI
+ [UBUNTU 20.04] PSI generated overhead on s390x
summary: - [UBUNTU 20.04] PSI generated overhead on s390x
+ [UBUNTU 20.04] PSI generates overhead on s390x
Frank Heimes (fheimes) wrote :

Request submitted to kernel team's mailing list:
https://lists.ubuntu.com/archives/kernel-team/2020-May/thread.html#109592
changing status to 'In Progress'

description: updated
Changed in linux (Ubuntu):
status: New → In Progress
Changed in ubuntu-z-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu Focal):
status: New → In Progress
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Frank Heimes (fheimes) on 2020-05-15
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-37.41

---------------
linux (5.4.0-37.41) focal; urgency=medium

  * CVE-2020-0543
    - SAUCE: x86/speculation/spectre_v2: Exclude Zhaoxin CPUs from SPECTRE_V2
    - SAUCE: x86/cpu: Add a steppings field to struct x86_cpu_id
    - SAUCE: x86/cpu: Add 'table' argument to cpu_matches()
    - SAUCE: x86/speculation: Add Special Register Buffer Data Sampling (SRBDS)
      mitigation
    - SAUCE: x86/speculation: Add SRBDS vulnerability and mitigation documentation
    - SAUCE: x86/speculation: Add Ivy Bridge to affected list

 -- Marcelo Henrique Cerri <email address hidden> Wed, 03 Jun 2020 11:24:23 -0300

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released

All autopkgtests for the newly accepted linux-oracle-5.4 (5.4.0-1019.19~18.04.1) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

zfs-linux/unknown (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#linux-oracle-5.4

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-42.46

---------------
linux (5.4.0-42.46) focal; urgency=medium

  * focal/linux: 5.4.0-42.46 -proposed tracker (LP: #1887069)

  * linux 4.15.0-109-generic network DoS regression vs -108 (LP: #1886668)
    - SAUCE: Revert "netprio_cgroup: Fix unlimited memory leak of v2 cgroups"

linux (5.4.0-41.45) focal; urgency=medium

  * focal/linux: 5.4.0-41.45 -proposed tracker (LP: #1885855)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * CVE-2019-19642
    - kernel/relay.c: handle alloc_percpu returning NULL in relay_open

  * CVE-2019-16089
    - SAUCE: nbd_genl_status: null check for nla_nest_start

  * CVE-2020-11935
    - aufs: do not call i_readcount_inc()

  * ip_defrag.sh in net from ubuntu_kernel_selftests failed with 5.0 / 5.3 / 5.4
    kernel (LP: #1826848)
    - selftests: net: ip_defrag: ignore EPERM

  * Update lockdown patches (LP: #1884159)
    - SAUCE: acpi: disallow loading configfs acpi tables when locked down

  * seccomp_bpf fails on powerpc (LP: #1885757)
    - SAUCE: selftests/seccomp: fix ptrace tests on powerpc

  * Introduce the new NVIDIA 418-server and 440-server series, and update the
    current NVIDIA drivers (LP: #1881137)
    - [packaging] add signed modules for the 418-server and the 440-server
      flavours

 -- Khalid Elmously <email address hidden> Thu, 09 Jul 2020 19:50:26 -0400

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Frank Heimes (fheimes) on 2020-07-28
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released

------- Comment From <email address hidden> 2020-07-28 02:36 EDT-------
IBM Bugzilla status-> closed, Fix Released with focal

tags: added: targetmilestone-inin2004
removed: targetmilestone-inin---