arm64: guest hangs when ntpd is running

Bug #1549494 reported by Andy Whitcroft
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
dann frazier
Xenial
Fix Released
Medium
dann frazier

Bug Description

SRU Justification:

Impact: KVM guests hang on arm64 host running ntpd

Fix: From linux-next:
commit 1c5631c73fc2261a5df64a72c155cb53dcdc0c45
Author: Marc Zyngier <email address hidden>
Date: Wed Apr 6 09:37:22 2016 +0100

    KVM: arm/arm64: Handle forward time correction gracefully

    BugLink: http://bugs.launchpad.net/bugs/1549494

    On a host that runs NTP, corrections can have a direct impact on
    the background timer that we program on the behalf of a vcpu.

    In particular, NTP performing a forward correction will result in
    a timer expiring sooner than expected from a guest point of view.
    Not a big deal, we kick the vcpu anyway.

    But on wake-up, the vcpu thread is going to perform a check to
    find out whether or not it should block. And at that point, the
    timer check is going to say "timer has not expired yet, go back
    to sleep". This results in the timer event being lost forever.

    There are multiple ways to handle this. One would be record that
    the timer has expired and let kvm_cpu_has_pending_timer return
    true in that case, but that would be fairly invasive. Another is
    to check for the "short sleep" condition in the hrtimer callback,
    and restart the timer for the remaining time when the condition
    is detected.

    This patch implements the latter, with a bit of refactoring in
    order to avoid too much code duplication.

Testcase:
A) Install ntp on the host (I'm using a 2 socket ThunderX)
B) Boot up a large VM (95 CPUs/4G memory)
C) Within the VM, rebuild the Ubuntu kernel package
If it doesn't hang - bug is gone.

CVE References

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Andy Whitcroft (apw)
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (7.6 KiB)

This bug was fixed in the package linux - 4.4.0-9.24

---------------
linux (4.4.0-9.24) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1551319

  * AppArmor logs denial for when the device path is ENOENT (LP: #1482943)
    - SAUCE: apparmor: fix log of apparmor audit message when kern_path() fails

  * BUG: unable to handle kernel NULL pointer dereference (aa_label_merge) (LP:
    #1448912)
    - SAUCE: apparmor: Fix: insert race between label_update and label_merge
    - SAUCE: apparmor: Fix: ensure aa_get_newest will trip debugging if the
      replacedby is not setup
    - SAUCE: apparmor: Fix: label merge handling of marking unconfined and stale
    - SAUCE: apparmor: Fix: refcount race between locating in labelset and get
    - SAUCE: apparmor: Fix: ensure new labels resulting from merge have a
      replacedby
    - SAUCE: apparmor: Fix: label_vec_merge insertion
    - SAUCE: apparmor: Fix: deadlock in aa_put_label() call chain
    - SAUCE: apparmor: Fix: add required locking of __aa_update_replacedby on
      merge path
    - SAUCE: apparmor: Fix: convert replacedby update to be protected by the
      labelset lock
    - SAUCE: apparmor: Fix: update replacedby allocation to take a gfp parameter

  * apparmor kernel BUG kills firefox (LP: #1430546)
    - SAUCE: apparmor: Disallow update of cred when then subjective != the
      objective cred
    - SAUCE: apparmor: rework retrieval of the current label in the profile update
      case

  * sleep from invalid context in aa_move_mount (LP: #1539349)
    - SAUCE: apparmor: fix sleep from invalid context

  * s390x: correct restore of high gprs on signal return (LP: #1550468)
    - s390/compat: correct restore of high gprs on signal return

  * missing SMAP support (LP: #1550517)
    - x86/entry/compat: Add missing CLAC to entry_INT80_32

  * Floating-point exception handler receives empty Data-Exception Code in
    Floating Point Control register (LP: #1548414)
    - s390/fpu: signals vs. floating point control register

  * kvm fails to boot GNU Hurd kernels with 4.4 Xenial kernel (LP: #1550596)
    - KVM: x86: fix conversion of addresses to linear in 32-bit protected mode

  * Surelock GA2 SP1: capiredp01: cxl_init_adapter fails for CAPI devices
    0000:01:00.0 and 0005:01:00.0 after upgrading to 840.10 Platform firmware
    build fips840/b1208b_1604.840 (LP: #1532914)
    - cxl: Fix PSL timebase synchronization detection

  * [Feature]EDAC support for Knights Landing (LP: #1519631)
    - EDAC, sb_edac: Set fixed DIMM width on Xeon Knights Landing

  * Various failures of kernel_security suite on Xenial kernel on s390x arch
    (LP: #1531327)
    - [config] s390x -- CONFIG_DEFAULT_MMAP_MIN_ADDR=65536

  * Unable to install VirtualBox Guest Service in 15.04 (LP: #1434579)
    - [Config] Provides: virtualbox-guest-modules when appropriate

  * linux is missing provides for virtualbox-guest-modules [i386 amd64 x32] (LP:
    #1507588)
    - [Config] Provides: virtualbox-guest-modules when appropriate

  * Backport more recent driver for SKL, KBL and BXT graphics (LP: #1540390)
    - SAUCE: i915_bpo: Provide a backport driver for SKL, KBL & BXT graphics
    - SA...

Read more...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
dann frazier (dannf) wrote :

Reopening, as I've requested these patches be reverted:
https://lists.ubuntu.com/archives/kernel-team/2016-March/074217.html

Changed in linux (Ubuntu Xenial):
assignee: Andy Whitcroft (apw) → dann frazier (dannf)
status: Fix Released → Confirmed
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (4.2 KiB)

This bug was fixed in the package linux - 4.4.0-16.32

---------------
linux (4.4.0-16.32) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1561727

  * fix thermal throttling due to commit "Thermal: initialize thermal zone
    device correctly" (LP: #1561676)
    - Thermal: Ignore invalid trip points

  * Thinkpad T460: Trackpoint mouse buttons instantly generate "release" event
    on press (LP: #1553811)
    - SAUCE: (noup) Input: synaptics - handle spurious release of trackstick
      buttons, again

  * reading /sys/kernel/security/apparmor/profiles requires CAP_MAC_ADMIN
    (LP: #1560583)
    - SAUCE: apparmor: Allow ns_root processes to open profiles file
    - SAUCE: apparmor: Consult sysctl when reading profiles in a user ns

  * linux: sync virtualbox drivers to 5.0.16-dfsg-2 (LP: #1561492)
    - ubuntu: vbox -- update to 5.0.16-dfsg-2

  * s390/kconfig: CONFIG_NUMA without CONFIG_NUMA_EMU does not make any sense on
    s390x (LP: #1557690)
    - [Config] CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=n for s390x

  * spl/zfs fails to build on s390x (LP: #1519814)
    - [Config] s390x -- re-enable zfs
    - [Config] zfs -- disable powerpc until the test failures can be resolved

  * linux: sync to ZFS 0.6.5.6 stable release (LP: #1561483)
    - SAUCE: (noup) Update spl to 0.6.5.6-0ubuntu1, zfs to 0.6.5.6-0ubuntu1

  * zfs: enable zfs for 64bit powerpc kernels (LP: #1558871)
    - [Packaging] zfs -- handle rprovides via dpkg-gencontrol
    - [Config] powerpc -- convert zfs configuration to custom_override

  * Memory arena corruption with FUSE (was Memory allocation failure crashes
    kernel hard, presumably related to FUSE) (LP: #1505948)
    - SAUCE: (noup) fuse: do not use iocb after it may have been freed
    - SAUCE: (noup) fuse: Add reference counting for fuse_io_priv

  * cgroup namespaces: add a 'nsroot=' mountinfo field (LP: #1560489)
    - SAUCE: (noup) cgroup namespaces: add a 'nsroot=' mountinfo field

  * linux packaging: clear remaining redundant delta (LP: #1560445)
    - [Debian] Remove generated intermediate files on clean

  * arm64: guest hangs when ntpd is running (LP: #1549494)
    - Revert "hrtimer: Add support for CLOCK_MONOTONIC_RAW"
    - Revert "hrtimer: Catch illegal clockids"
    - Revert "KVM: arm/arm64: timer: Switch to CLOCK_MONOTONIC_RAW"

  * Need enough contiguous memory to support GICv3 ITS table (LP: #1558828)
    - [Config] CONFIG_FORCE_MAX_ZONEORDER=13 on arm64
    - SAUCE: (no-up) arm64: gicv3: its: Increase FORCE_MAX_ZONEORDER for Cavium
      ThunderX

  * update arcmsr to version v1.30.00.22-20151126 to fix card timeouts
    (LP: #1559609)
    - arcmsr: fixed getting wrong configuration data
    - arcmsr: fixes not release allocated resource
    - arcmsr: make code more readable
    - arcmsr: adds code to support new Areca adapter ARC1203
    - arcmsr: changes driver version number
    - arcmsr: more readability improvements
    - arcmsr: Split dma resource allocation to a new function
    - arcmsr: change driver version to v1.30.00.22-20151126

  * server image has no keyboard, desktop image works (LP: #1559692)
    - [Config] Rework input-modules (d-i) list

  * PMU sup...

Read more...

Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Released
Revision history for this message
dann frazier (dannf) wrote :

We reverted these patches, so reopening. Awaiting a proper upstream solution.

Changed in linux (Ubuntu Xenial):
status: Fix Released → Confirmed
dann frazier (dannf)
Changed in linux (Ubuntu Xenial):
status: Confirmed → In Progress
dann frazier (dannf)
description: updated
dann frazier (dannf)
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (3.5 KiB)

This bug was fixed in the package linux - 4.4.0-21.37

---------------
linux (4.4.0-21.37) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1571791

  * linux: MokSBState is ignored (LP: #1571691)
    - SAUCE: (noup) MODSIGN: Import certificates from UEFI Secure Boot
    - SAUCE: (noup) efi: Disable secure boot if shim is in insecure mode
    - SAUCE: (noup) Display MOKSBState when disabled

linux (4.4.0-20.36) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1571069

  * sysfs mount failure during stateful lxd snapshots (LP: #1570906)
    - SAUCE: kernfs: Do not match superblock in another user namespace when
      mounting

  * Kernel Panic in Ubuntu 16.04 netboot installer (LP: #1570441)
    - x86/topology: Fix logical package mapping
    - x86/topology: Fix Intel HT disable
    - x86/topology: Use total_cpus not nr_cpu_ids for logical packages
    - xen/apic: Provide Xen-specific version of cpu_present_to_apicid APIC op
    - x86/topology: Fix AMD core count

  * [regression]: Failed to call clock_adjtime(): Invalid argument
    (LP: #1566465)
    - ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO

linux (4.4.0-19.35) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1570348

  * CVE-2016-2847 (LP: #1554260)
    - pipe: limit the per-user amount of pages allocated in pipes

  * xenial kernel crash on HP BL460c G7 (qla24xx problem?) (LP: #1554003)
    - SAUCE: (noup) qla2xxx: Add irq affinity notification V2

  * arm64: guest hangs when ntpd is running (LP: #1549494)
    - SAUCE: (noup) KVM: arm/arm64: Handle forward time correction gracefully

  * linux: Enforce signed module loading when UEFI secure boot (LP: #1566221)
    - [Config] CONFIG_EFI_SECURE_BOOT_SIG_ENFORCE=y

  * s390/cpumf: Fix lpp detection (LP: #1555344)
    - s390/facilities: use stfl mnemonic instead of insn magic
    - s390/facilities: always use lowcore's stfle field for storing facility bits
    - s390/cpumf: Fix lpp detection

  * s390x kernel image needs weightwatchers (LP: #1536245)
    - [Config] s390x: Use compressed kernel bzImage

  * Surelock GA2 SP1: surelock02p05: Not seeing sgX devices for LUNs after
    upgrading to Ubuntu 16.04 (LP: #1567581)
    - Revert "UBUNTU: SAUCE: (noup) powerpc/pci: Assign fixed PHB number based on
      device-tree properties"

  * Backport upstream bugfixes to ubuntu-16.04 (LP: #1555765)
    - cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path
    - Revert "cpufreq: postfix policy directory with the first CPU in related_cpus"
    - cpufreq: powernv: Add sysfs attributes to show throttle stats

  * systemd-modules-load.service: Failing due to missing module 'ib_iser' (LP: #1566468)
    - [Config] Add ib_iser to generic inclusion list

  * thunderx nic performance improvements (LP: #1567093)
    - net: thunderx: Set recevie buffer page usage count in bulk
    - net: thunderx: Adjust nicvf structure to reduce cache misses

  * fixes for thunderx nic in multiqueue mode (LP: #1567091)
    - net: thunderx: Fix for multiqset not configured upon interface toggle
    - net: thunderx: Fix for HW TSO not enabled for secondary qsets
    - net: thund...

Read more...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.