[Ubuntu] kvm: fix deadlock when killed by oom

Bug #1800849 reported by bugproxy on 2018-10-31
This bug affects 1 person
Affects                   Importance   Assigned to
Ubuntu on IBM z Systems   High         Canonical Kernel Team
linux (Ubuntu)            High         Joseph Salisbury
Xenial                    High         Joseph Salisbury
Bionic                    High         Joseph Salisbury
Cosmic                    High         Joseph Salisbury

Bug Description

== SRU Justification ==

Description: kvm: fix deadlock when killed by oom
Symptom: oom killer leaves processes in a deadlock state.
Problem: The problem arises in the rare cases in which
         handle_mm_fault does not release the mm_sem.
Solution: Correct the issue by manually releasing the mm_sem when needed.

== Fix ==

306d6c49ac9ded11114cb53b0925da52f2c2ada1 ("s390/kvm: fix deadlock when killed by oom")

== Patch ==

commit 306d6c49ac9ded11114cb53b0925da52f2c2ada1
Author: Claudio Imbrenda <email address hidden>
Date: Mon Jul 16 10:38:57 2018 +0200

    s390/kvm: fix deadlock when killed by oom

    When the oom killer kills a userspace process in the page fault handler
    while in guest context, the fault handler fails to release the mm_sem
    if the FAULT_FLAG_RETRY_NOWAIT option is set. This leads to a deadlock
    when tearing down the mm when the process terminates. This bug can only
    happen when pfault is enabled, so only KVM clients are affected.

    The problem arises in the rare cases in which handle_mm_fault does not
    release the mm_sem. This patch fixes the issue by manually releasing
    the mm_sem when needed.

    Fixes: 24eb3a824c4f3 ("KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault")
    Cc: <email address hidden> # 3.15+
    Signed-off-by: Claudio Imbrenda <email address hidden>
    Signed-off-by: Martin Schwidefsky <email address hidden>
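
The locking problem described in the commit message can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `sem_held`, `model_handle_mm_fault`, `model_do_fault` and the `FLAG_*`/`FAULT_*` constants are hypothetical stand-ins, and the model assumes that the fault handler keeps the lock when it asks for a retry with the NOWAIT flag set, as the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's fault flags and return codes. */
enum { FLAG_ALLOW_RETRY = 1 << 0, FLAG_RETRY_NOWAIT = 1 << 1 };
enum { FAULT_OK = 0, FAULT_RETRY = 1, FAULT_SIGNAL = 2 };

/* Models "the mm_sem is currently held by the fault path". */
static bool sem_held;

static void sem_down(void) { assert(!sem_held); sem_held = true; }
static void sem_up(void)   { assert(sem_held);  sem_held = false; }

/*
 * Model of handle_mm_fault: when it requests a retry it drops the lock,
 * except when the caller passed the NOWAIT flag (assumption taken from
 * the commit message).
 */
static int model_handle_mm_fault(int flags, bool must_retry)
{
    if (!must_retry)
        return FAULT_OK;
    if (!(flags & FLAG_RETRY_NOWAIT))
        sem_up();
    return FAULT_RETRY;
}

/*
 * Model of the arch fault path. 'killed' stands for a pending fatal
 * signal (for example from the oom killer); 'with_fix' toggles the
 * extra check that the patch is described as adding.
 */
static int model_do_fault(int flags, bool must_retry, bool killed,
                          bool with_fix)
{
    int fault;

    sem_down();
    fault = model_handle_mm_fault(flags, must_retry);

    if ((fault & FAULT_RETRY) && killed) {
        fault = FAULT_SIGNAL;
        /* With NOWAIT the lock is still held, so it must be released. */
        if (with_fix && (flags & FLAG_RETRY_NOWAIT))
            goto out_up;
        goto out;
    }
    if ((fault & FAULT_RETRY) && !(flags & FLAG_RETRY_NOWAIT))
        sem_down();     /* lock was dropped; retake it as a retry would */

out_up:
    sem_up();
out:
    return fault;
}
```

In this model, `model_do_fault(FLAG_ALLOW_RETRY | FLAG_RETRY_NOWAIT, true, true, false)` returns with `sem_held` still true, which corresponds to the leaked lock that later blocks the mm teardown. The same call with `with_fix` set to true returns with `sem_held` false.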

== Regression Potential ==

Low and minimal, because:

- code change is s390x only
- limited to one single file: /arch/s390/mm/fault.c
- just two additional lines added (if stmt)
- Xenial and Cosmic already have this commit via upstream stable updates.
- Hence the patch is only missing in Bionic.
- A test kernel was built for testing.

== Test Case ==

Create numerous KVM guests so that the host starts swapping
and memory becomes overcommitted and the oom killer is triggered.
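
When running this test case, one possible way to spot affected processes is to look for tasks that stay in uninterruptible sleep (state `D`) after the oom killer has fired. The helper below is a hedged sketch that only parses one line in the `/proc/<pid>/stat` format; the function names `stat_state` and `is_uninterruptible` are hypothetical, and a task in state `D` is not necessarily deadlocked, since short periods in that state are normal.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the state letter from one /proc/<pid>/stat line, or '?' if the
 * line is malformed. The command name is wrapped in parentheses and may
 * itself contain spaces or ')' characters, so search for the last ')'.
 */
static char stat_state(const char *line)
{
    const char *close;

    assert(line != NULL);
    close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* Nonzero when the task is in uninterruptible sleep ('D'). */
static int is_uninterruptible(const char *line)
{
    return stat_state(line) == 'D';
}
```

A caller would read `/proc/<pid>/stat` for each process of interest and pass the line to `is_uninterruptible`; a task that remains in that state across repeated samples is a candidate for closer inspection.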
__________

Description: kvm: fix deadlock when killed by oom

Symptom: oom killer leaves processes in a deadlock state.

Problem: The problem arises in the rare cases in which
                  handle_mm_fault does not release the mm_sem.

Solution: Correct the issue by manually releasing the mm_sem
                  when needed.

Reproduction: Create numerous KVM guests so that the host starts
                  swapping and memory becomes overcommitted and the oom
                  killer is triggered.

kernel 4.19
Upstream-ID: 306d6c49ac9ded11114cb53b0925da52f2c2ada1

bugproxy (bugproxy) on 2018-10-31
tags: added: architecture-s39064 bugnameltc-172752 severity-high targetmilestone-inin1810
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
status: New → Triaged

------- Comment From <email address hidden> 2018-10-31 10:23 EDT-------
Also to be applied to all releases in Service.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-31 10:40 EDT-------
After checking, please provide the fix for Xenial, Bionic and Cosmic.

Changed in linux (Ubuntu):
importance: Undecided → High
status: New → Triaged
assignee: Skipper Bug Screeners (skipper-screen-team) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
Changed in linux (Ubuntu Cosmic):
status: New → Triaged
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Cosmic):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
status: Triaged → Fix Released
Joseph Salisbury (jsalisbury) wrote :

Xenial and Cosmic already have commit 306d6c49ac9 via upstream stable updates.

I built a Bionic test kernel with commit 306d6c49ac9. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1800849

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is older than 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.

Thanks in advance!

Changed in linux (Ubuntu Xenial):
status: Triaged → Fix Released
Changed in linux (Ubuntu Bionic):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → Fix Released
Changed in ubuntu-z-systems:
status: Triaged → Fix Committed
description: updated
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-05 03:58 EDT-------
Fix verified upfront by IBM

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Frank Heimes (frank-heimes) wrote :

Was already tested by IBM (according to comment #5) - adjusting the tags accordingly now.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-42.45

---------------
linux (4.15.0-42.45) bionic; urgency=medium

  * linux: 4.15.0-42.45 -proposed tracker (LP: #1803592)

  * [FEAT] Guest-dedicated Crypto Adapters (LP: #1787405)
    - KVM: s390: reset crypto attributes for all vcpus
    - KVM: s390: vsie: simulate VCPU SIE entry/exit
    - KVM: s390: introduce and use KVM_REQ_VSIE_RESTART
    - KVM: s390: refactor crypto initialization
    - s390: vfio-ap: base implementation of VFIO AP device driver
    - s390: vfio-ap: register matrix device with VFIO mdev framework
    - s390: vfio-ap: sysfs interfaces to configure adapters
    - s390: vfio-ap: sysfs interfaces to configure domains
    - s390: vfio-ap: sysfs interfaces to configure control domains
    - s390: vfio-ap: sysfs interface to view matrix mdev matrix
    - KVM: s390: interface to clear CRYCB masks
    - s390: vfio-ap: implement mediated device open callback
    - s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl
    - s390: vfio-ap: zeroize the AP queues
    - s390: vfio-ap: implement VFIO_DEVICE_RESET ioctl
    - KVM: s390: Clear Crypto Control Block when using vSIE
    - KVM: s390: vsie: Do the CRYCB validation first
    - KVM: s390: vsie: Make use of CRYCB FORMAT2 clear
    - KVM: s390: vsie: Allow CRYCB FORMAT-2
    - KVM: s390: vsie: allow CRYCB FORMAT-1
    - KVM: s390: vsie: allow CRYCB FORMAT-0
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-1
    - KVM: s390: vsie: allow guest FORMAT-1 CRYCB on host FORMAT-2
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-2
    - KVM: s390: device attrs to enable/disable AP interpretation
    - KVM: s390: CPU model support for AP virtualization
    - s390: doc: detailed specifications for AP virtualization
    - KVM: s390: fix locking for crypto setting error path
    - KVM: s390: Tracing APCB changes
    - s390: vfio-ap: setup APCB mask using KVM dedicated function
    - s390/zcrypt: Add ZAPQ inline function.
    - s390/zcrypt: Review inline assembler constraints.
    - s390/zcrypt: Integrate ap_asm.h into include/asm/ap.h.
    - s390/zcrypt: fix ap_instructions_available() returncodes
    - s390/zcrypt: remove VLA usage from the AP bus
    - s390/zcrypt: Remove deprecated ioctls.
    - s390/zcrypt: Remove deprecated zcrypt proc interface.
    - s390/zcrypt: Support up to 256 crypto adapters.
    - [Config:] Enable CONFIG_S390_AP_IOMMU and set CONFIG_VFIO_AP to module.

  * Bypass of mount visibility through userns + mount propagation (LP: #1789161)
    - mount: Retest MNT_LOCKED in do_umount
    - mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

  * CVE-2018-18955: nested user namespaces with more than five extents
    incorrectly grant privileges over inode (LP: #1801924) // CVE-2018-18955
    - userns: also map extents in the reverse map to kernel IDs

  * kdump fail due to an IRQ storm (LP: #1797990)
    - SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
    - SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
    - SAUCE: x86/quirks: Scan all busses for early PCI quirks

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 15 Nov 2018 17:01:46 ...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-03 11:02 EDT-------
IBM Bugzilla Status -> closed, Fix Released by Xenial, Bionic, Cosmic
