guest experiencing Transmit Timeouts on CX4

Bug #1636330 reported by bugproxy
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
Yakkety          Fix Released   Undecided    Tim Gardner
Zesty            Fix Released   Undecided    Unassigned

Bug Description

This patch fixes a race condition that was reintroduced in the 4.8 kernel, as part of the P9 changes, after having been originally fixed in 3.19. The effect of the race condition is that a secondary thread can start trying to execute code from the guest while the thread is still in hypervisor mode, so it can cause many different symptoms, one of which seems to be the timebase corruption that leads to the lockup in ktime_get_ts64. It could cause other problems such as CPU cores locking up hard or memory corruption.

This patch is only needed on the host. It should be applied to any 4.8 kernel being used as a host, including the Ubuntu 16.10 kernel.

Hi Paul,
I built a kernel with your patch and installed it on both the host and the guest. I can still see some ktime_get_ts64 traces on the host. The guest dmesg is clean.

This patch fixes another race condition I found in the fastsleep code. Please apply this patch as well and test.

Sure will do and provide feedback.

Revision history for this message
bugproxy (bugproxy) wrote : Patch to fix race condition (again!)

Default Comment by Bridge

tags: added: architecture-ppc64 bugnameltc-132390 severity-critical targetmilestone-inin1610
Revision history for this message
bugproxy (bugproxy) wrote : Patch to fix another race condition in fastsleep code

Default Comment by Bridge

Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Revision history for this message
Breno Leitão (breno-leitao) wrote :

Was the attached patch accepted upstream?

Changed in linux (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team)
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-10-26 22:34 EDT-------
(In reply to comment #150)
> Was the attached patch accepted upstream?

There were two patches, and they were both accepted upstream. Commit IDs are 56c46222af0d09149fadec2a3ce9d4 and 09b7e37b18eecc1e347f4b1a3bc863.

Revision history for this message
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Zesty):
status: New → Fix Released
assignee: Canonical Kernel Team (canonical-kernel-team) → nobody
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-11-07 23:35 EDT-------
cde00 (<email address hidden>) deleted native attachment kern.log.powerio-le12.oct12 on 2016-11-07 22:24:41

cde00 (<email address hidden>) added native attachment /tmp/AIXOS05483833/kern.log.powerio-le12.oct12 on 2016-11-07 22:24:44

Revision history for this message
Tim Gardner (timg-tpi) wrote :
Luis Henriques (henrix)
Changed in linux (Ubuntu Yakkety):
status: In Progress → Fix Committed
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-11-11 11:23 EDT-------
*** Bug 146681 has been marked as a duplicate of this bug. ***

Revision history for this message
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
Revision history for this message
bugproxy (bugproxy) wrote : CX4 dmesg

Default Comment by Bridge

tags: removed: verification-needed-yakkety
Revision history for this message
bugproxy (bugproxy) wrote : kern.log from guest from yesterday failure.

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : dmesg with command timeout but no ktime_get trace

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : dmesg from guest and kvm

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : bug 147644 data

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : dmesg with Paul's patch

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : Patch from Zhong Li to fix some races in XICS emulation

------- Comment on attachment From <email address hidden> 2016-11-16 16:48 EDT-------

This is the patch set from Zhong Li to fix some races in the XICS emulation code. It would be useful to know if this solves the missing interrupt problem. Since it seems that the problem only shows up when the interrupt affinity is being changed, it would be useful to test with the affinity being changed very frequently, so as to trigger the problem more quickly.
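
One way to change the affinity very frequently, as suggested above, is to rewrite the interrupt's affinity in a tight loop. The following is only a minimal sketch, not something taken from this bug: it assumes the standard Linux /proc/irq/<N>/smp_affinity_list interface, root privileges on the guest, and that the IRQ number of the CX4 queue has already been looked up in /proc/interrupts. The function name, parameters, and the delay value are hypothetical.

```python
import itertools
import os
import time


def rotate_irq_affinity(irq, cpus, iterations, delay_s=0.01,
                        proc_irq_dir="/proc/irq"):
    """Pin one IRQ to each CPU in `cpus` in turn, `iterations` times.

    Each step writes a single CPU number to
    <proc_irq_dir>/<irq>/smp_affinity_list. `proc_irq_dir` is a parameter
    only so the loop can be pointed at a scratch directory for a dry run.
    Returns the list of CPUs written, in order.
    """
    path = os.path.join(proc_irq_dir, str(irq), "smp_affinity_list")
    written = []
    for cpu in itertools.islice(itertools.cycle(cpus), iterations):
        # The kernel expects a CPU list such as "3" or "0-3"; one CPU per
        # write forces the interrupt to be re-targeted on every step.
        with open(path, "w") as f:
            f.write(f"{cpu}\n")
        written.append(cpu)
        time.sleep(delay_s)
    return written


# Example (hypothetical IRQ number, must run as root):
# rotate_irq_affinity(irq=45, cpus=[0, 1, 2, 3], iterations=100000)
```

If irqbalance is running in the guest it may overwrite the affinity between writes, so it is probably best to stop it for the duration of the test.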

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-11-16 17:01 EDT-------
cde00 (<email address hidden>) added native attachment /tmp/AIXOS05483833/irq-fix.patch on 2016-11-16 15:58:33

Revision history for this message
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
bugproxy (bugproxy)
tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-28.30

---------------
linux (4.8.0-28.30) yakkety; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1641083

  * lxc-attach to malicious container allows access to host (LP: #1639345)
    - Revert "UBUNTU: SAUCE: (noup) ptrace: being capable wrt a process requires
      mapped uids/gids"
    - (upstream) mm: Add a user_ns owner to mm_struct and fix ptrace permission
      checks

  * [Feature] AVX-512 new instruction sets (avx512_4vnniw, avx512_4fmaps)
    (LP: #1637526)
    - x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features

  * zfs: importing zpool with vdev on zvol hangs kernel (LP: #1636517)
    - SAUCE: (noup) Update zfs to 0.6.5.8-0ubuntu4.1

  * Move some device drivers build from kernel built-in to modules
    (LP: #1637303)
    - [Config] CONFIG_TIGON3=m for all arches
    - [Config] CONFIG_VIRTIO_BLK=m, CONFIG_VIRTIO_NET=m

  * I2C touchpad does not work on AMD platform (LP: #1612006)
    - pinctrl/amd: Configure GPIO register using BIOS settings

  * guest experiencing Transmit Timeouts on CX4 (LP: #1636330)
    - powerpc/64: Re-fix race condition between going idle and entering guest
    - powerpc/64: Fix race condition in setting lock bit in idle/wakeup code

  * QEMU throws failure msg while booting guest with SRIOV VF (LP: #1630554)
    - KVM: PPC: Always select KVM_VFIO, plus Makefile cleanup

  * [Feature] KBL - New device ID for Kabypoint(KbP) (LP: #1591618)
    - SAUCE: mfd: lpss: Fix Intel Kaby Lake PCH-H properties

  * hio: SSD data corruption under stress test (LP: #1638700)
    - SAUCE: hio: set bi_error field to signal an I/O error on a BIO
    - SAUCE: hio: splitting bio in the entry of .make_request_fn

  * cleanup primary tree for linux-hwe layering issues (LP: #1637473)
    - [Config] switch Vcs-Git: to yakkety repository
    - [Packaging] handle both linux-lts* and linux-hwe* as backports
    - [Config] linux-tools-common and linux-cloud-tools-common are one per series
    - [Config] linux-source-* is in the primary linux namespace
    - [Config] linux-tools -- always suggest the base package

  * SRU: sync zfsutils-linux and spl-linux changes to linux (LP: #1635656)
    - SAUCE: (noup) Update spl to 0.6.5.8-2, zfs to 0.6.5.8-0ubuntu4 (LP:
      #1635656)

  * [Feature] SKX: perf uncore PMU support (LP: #1591810)
    - perf/x86/intel/uncore: Add Skylake server uncore support
    - perf/x86/intel/uncore: Remove hard-coded implementation for Node ID mapping
      location
    - perf/x86/intel/uncore: Handle non-standard counter offset

  * [Feature] Purley: Memory Protection Keys (LP: #1591804)
    - x86/pkeys: Add fault handling for PF_PK page fault bit
    - mm: Implement new pkey_mprotect() system call
    - x86/pkeys: Make mprotect_key() mask off additional vm_flags
    - x86/pkeys: Allocation/free syscalls
    - x86: Wire up protection keys system calls
    - generic syscalls: Wire up memory protection keys syscalls
    - pkeys: Add details of system call use to Documentation/
    - x86/pkeys: Default to a restrictive init PKRU
    - x86/pkeys: Allow configuration of init_pkru
    - x86/pkeys: Add self-tests

  * kernel invalid ...

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote : Update Released

The verification of the Stable Release Update for linux has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
