Check for CPU Measurement sampling

Bug #1847590 reported by bugproxy on 2019-10-10
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
Medium
Frank Heimes
linux (Ubuntu)
Undecided
Skipper Bug Screeners
Bionic
Undecided
Unassigned
Disco
Undecided
Unassigned
Eoan
Undecided
Skipper Bug Screeners

Bug Description

SRU Justification:
==================

[Impact]

* Check for CPU Measurement sampling to avoid potential loss of sampling data

[Fix]

* 932bfc5aae08f3cb20c1c9f051542f5933710151 932bfc5 "s390/cpumsf: Check for CPU Measurement sampling"

[Test Case]

* Have an LPAR configured with counter and sampling facilities anabled

* Use lscpumf to check the facilities available for your hardware

* Start a benchmark (like mem_alloc) and execute perf top

* Canonical can only do regression testing, functional testing is currently only doable by IBM

[Regression Potential]

* There is as always some regression potential with having new code in and other code changed

* but this particular change is limited to the s390x architecture,

* again to the counter and sampling facilities, that need to be activated by intention

* and is only for compatibility with the latest and newest hw generation only (z15 and L1III)

[Other Info]

* The fix/patch got upstream accepted with v5.4-rc2, hence it needs to be applied to E, D and B

* The patch/commit neraly applied cleanly for me on E, D and B except a little conflict that is easily solveable

* or can even be even automatically be solved by cherry-pick-ing with '-X theirs'

* This is not relevant for Eoan GA, can be added with the help of an SRU cycle to Eoan post-GA

__________

Description:
s390/cpumsf: Check for CPU Measurement sampling

s390 IBM z15 introduces a check if the CPU Mesurement Facility
sampling is temporarily unavailable. If this is the case return -EBUSY
and abort the setup of CPU Measuement facility sampling.

Business Value:
With z15 the CPU Measurement sampling facility hardware may be in use when the Linux kernel CPU Measurement sampling facility device driver sets up sampling. This results in loss of sampling data and has to be avoided.

With z15 the CPU Measurement facility sampling hardware indicates being in use and the linux device driver can check for this situation and can abort any hardware access.

kernel 5.4
Git Commit: 932bfc5aae08f3cb20c1c9f051542f5933710151

bugproxy (bugproxy) on 2019-10-10
tags: added: architecture-s39064 bugnameltc-181842 severity-high targetmilestone-inin1910
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes) on 2019-10-10
Changed in ubuntu-z-systems:
status: New → Triaged
importance: Undecided → Medium

------- Comment From <email address hidden> 2019-10-10 10:08 EDT-------
This will also be requested for 19.04, and 18.04.
Patches will be provided if required for these dedicated kernels.

Frank Heimes (fheimes) wrote :

Waiting and setting to incomplete until patches/backports are shared for 19.04 and 18.04.

Changed in ubuntu-z-systems:
assignee: nobody → Frank Heimes (frank-heimes)
status: Triaged → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-10-14 03:34 EDT-------
commit 932bfc5aae08f3cb20c1c9f051542f5933710151
Author: Thomas Richter <email address hidden>
Date: Fri Sep 20 11:57:43 2019 +0200

s390/cpumsf: Check for CPU Measurement sampling
s390 IBM z15 introduces a check if the CPU Mesurement Facility
sampling is temporarily unavailable. If this is the case return -EBUSY
and abort the setup of CPU Measuement facility sampling.

This patch checks for concurrent sampling which has been introduced with z15.
If machine wide sample is currently active, do not start CPU Measurement facility sampling using the perf
tool. The z15 CPU Measurement facility hardware on z15 now has a bit set when machine wide
sampling is active. This bit can be queried and when it is set the CPU Measurement facility sampling
device driver does not allow CPU specific sample and the perf_event_open() system call returns
-EBUSY in this case

Before z15 this situation could not be detected.

Applied seamlessly on all requested kernels 4.15 , 5.0 and 5.3 ...

Frank Heimes (fheimes) on 2019-10-14
tags: added: s390x
Frank Heimes (fheimes) wrote :

Kernel SRU request submitted:
https://lists.ubuntu.com/archives/kernel-team/2019-October/thread.html#104715
changing status to 'In Progress'

Changed in linux (Ubuntu):
status: New → In Progress
Changed in ubuntu-z-systems:
status: Incomplete → In Progress
description: updated
information type: Private → Public
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux (Ubuntu Disco):
status: New → Fix Committed
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed
Frank Heimes (fheimes) on 2019-10-17
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Frank Heimes (fheimes) wrote :

Completed testing (regression) on disco on z13 LPAR according to test case above - no issues found - updating tag.

tags: added: verification-done-disco
removed: verification-needed-disco
Frank Heimes (fheimes) wrote :

Completed testing (regression) on bionic on z13 LPAR according to test case above - no issues found - updating tag.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Frank Heimes (fheimes) wrote :

Completed testing (regression) on eoan on z13 LPAR according to test case above - no issues found - updating tag.

tags: added: verification-done-eoan
removed: verification-needed-eoan
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-10-28 05:24 EDT-------
Verified ok

All autopkgtests for the newly accepted linux-gcp-5.3 (5.3.0-1008.9~18.04.1) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

linux-gcp-5.3/unknown (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#linux-gcp-5.3

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Launchpad Janitor (janitor) wrote :
Download full text (53.1 KiB)

This bug was fixed in the package linux - 5.3.0-22.24

---------------
linux (5.3.0-22.24) eoan; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.3.0-21.22) eoan; urgency=medium

  * eoan/linux: 5.3.0-21.22 -proposed tracker (LP: #1850486)

  * Fix signing of staging modules in eoan (LP: #1850234)
    - [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink

linux (5.3.0-20.21) eoan; urgency=medium

  * eoan/linux: 5.3.0-20.21 -proposed tracker (LP: #1849064)

  * eoan: alsa/sof: Enable SOF_HDA link and codec (LP: #1848490)
    - [Config] Enable SOF_HDA link and codec

  * Eoan update: 5.3.7 upstream stable release (LP: #1848750)
    - panic: ensure preemption is disabled during panic()
    - [Config] updateconfigs for USB_RIO500
    - USB: rio500: Remove Rio 500 kernel driver
   ...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (38.6 KiB)

This bug was fixed in the package linux - 5.0.0-35.38

---------------
linux (5.0.0-35.38) disco; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - SAUCE: Fix revert "md/raid0: avoid RAID0 data corruption due to layout
      confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: Convert kvm_lock to a mutex
    - kvm: x86: Do not release the page inside mmu_set_spte()
    - KVM: x86: make FNAME(fetch) and __direct_map more similar
    - KVM: x86: remove now unneeded hugepage gfn adjustment
    - KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
    - KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - KVM: x86: use Intel speculation bugs and features as derived in generic x86
      code
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - SAUCE: x86/speculation/taa: Call tsx_init()
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.0.0-34.36) disco; urgency=medium

  * disco/linux: <version to be filled> -proposed tracker (LP: #1850574)

  * [REGRESSION] md/raid0: cannot as...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (24.0 KiB)

This bug was fixed in the package linux - 4.15.0-69.78

---------------
linux (4.15.0-69.78) bionic; urgency=medium

  * KVM NULL pointer deref (LP: #1851205)
    - KVM: nVMX: handle page fault in vmread fix

  * CVE-2018-12207
    - KVM: MMU: drop vcpu param in gpte_access
    - kvm: Convert kvm_lock to a mutex
    - kvm: x86: Do not release the page inside mmu_set_spte()
    - KVM: x86: make FNAME(fetch) and __direct_map more similar
    - KVM: x86: remove now unneeded hugepage gfn adjustment
    - KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
    - KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - KVM: x86: use Intel speculation bugs and features as derived in generic x86
      code
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - SAUCE: x86/speculation/taa: Call tsx_init()
    - SAUCE: x86/cpu: Include cpu header from bugs.c
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - drm/i915/gtt: Add read only pages to gen8_pte_encode
    - drm/i915/gtt: Read-only pages for insert_entries on bdw+
    - drm/i915/gtt: Disable read-only support under GVT
    - drm/i915: Prevent writing into a read-only object via a GGTT mmap
    - drm/i915/cmdparser: Check reg_table_count before derefencing.
    - drm/i915/cmdparser: Do not check past the cmd length.
    - drm/i915: Silence smatch for cmdparser
    - drm/i915: Move engine->needs_cmd_parser to engine->flags
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdpar...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (33.2 KiB)

This bug was fixed in the package linux - 5.3.0-24.26

---------------
linux (5.3.0-24.26) eoan; urgency=medium

  * eoan/linux: 5.3.0-24.26 -proposed tracker (LP: #1852232)

  * Eoan update: 5.3.9 upstream stable release (LP: #1851550)
    - io_uring: fix up O_NONBLOCK handling for sockets
    - dm snapshot: introduce account_start_copy() and account_end_copy()
    - dm snapshot: rework COW throttling to fix deadlock
    - Btrfs: fix inode cache block reserve leak on failure to allocate data space
    - btrfs: qgroup: Always free PREALLOC META reserve in
      btrfs_delalloc_release_extents()
    - iio: adc: meson_saradc: Fix memory allocation order
    - iio: fix center temperature of bmc150-accel-core
    - libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature
    - perf tests: Avoid raising SEGV using an obvious NULL dereference
    - perf map: Fix overlapped map handling
    - perf script brstackinsn: Fix recovery from LBR/binary mismatch
    - perf jevents: Fix period for Intel fixed counters
    - perf tools: Propagate get_cpuid() error
    - perf annotate: Propagate perf_env__arch() error
    - perf annotate: Fix the signedness of failure returns
    - perf annotate: Propagate the symbol__annotate() error return
    - perf annotate: Fix arch specific ->init() failure errors
    - perf annotate: Return appropriate error code for allocation failures
    - perf annotate: Don't return -1 for error when doing BPF disassembly
    - staging: rtl8188eu: fix null dereference when kzalloc fails
    - RDMA/siw: Fix serialization issue in write_space()
    - RDMA/hfi1: Prevent memory leak in sdma_init
    - RDMA/iw_cxgb4: fix SRQ access from dump_qp()
    - RDMA/iwcm: Fix a lock inversion issue
    - HID: hyperv: Use in-place iterator API in the channel callback
    - kselftest: exclude failed TARGETS from runlist
    - selftests/kselftest/runner.sh: Add 45 second timeout per test
    - nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
    - arm64: cpufeature: Effectively expose FRINT capability to userspace
    - arm64: Fix incorrect irqflag restore for priority masking for compat
    - arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
    - tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
    - tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
    - serial/sifive: select SERIAL_EARLYCON
    - tty: n_hdlc: fix build on SPARC
    - misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
    - RDMA/core: Fix an error handling path in 'res_get_common_doit()'
    - RDMA/cm: Fix memory leak in cm_add/remove_one
    - RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
    - RDMA/mlx5: Do not allow rereg of a ODP MR
    - RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
    - RDMA/mlx5: Add missing synchronize_srcu() for MW cases
    - gpio: max77620: Use correct unit for debounce times
    - fs: cifs: mute -Wunused-const-variable message
    - arm64: vdso32: Fix broken compat vDSO build warnings
    - arm64: vdso32: Detect binutils support for dmb ishld
    - serial: mctrl_gpio: Check for NULL pointer
    - serial: 8250_...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Frank Heimes (fheimes) on 2019-12-06
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released

------- Comment From <email address hidden> 2019-12-09 06:52 EDT-------
IBM Bugzilla -> closed, Fix Released with all requested distros

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers