Processes in "D" state due to zap_pid_ns_processes kernel call with Ubuntu + Docker

Bug #1698264 reported by Thiago Alves Silva on 2017-06-16
This bug affects 1 person
Affects          Importance   Assigned to
linux (Ubuntu)   Medium       Seth Forshee
Xenial           Medium       Seth Forshee
Yakkety          Medium       Seth Forshee
Zesty            Medium       Seth Forshee

Bug Description

SRU Justification

Impact: In some cases Docker processes can be stuck in the D state after a container has terminated. They remain in this state until the host is rebooted.

Fix: Cherry pick upstream commit b9a985db98961ae1ba0be169f19df1c567e4ffe0, which has already been included as a stable commit in maintained upstream stable kernels.

Test case: See below.

Regression potential: Low. This is a simple change and, as stated above, the patch has already shipped in upstream stable kernels.

---

(please refer to https://github.com/moby/moby/issues/31007#issuecomment-308877825 for context)

Precondition: Ubuntu 16.04.2 with Docker 17.03 (kernel 4.4)

Steps to reproduce:
- Install latest Docker
- Run 300 containers with a health check (for i in {1..300}; do docker run -d -it --restart=always --name poc_$i talves/health_poc; done)
- Send a termination signal to the containers (docker kill -s TERM $(docker ps -q))
- A few processes are going to be stuck in "uninterruptible sleep" ("D" state). The only known way to recover from this is a host reboot
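
To check the last step, the processes left in "D" state can be listed by reading /proc. The following is a minimal sketch, not part of the original report: the function names parse_stat and find_d_state are hypothetical, and it assumes the standard Linux /proc/<pid>/stat layout (pid, command name in parentheses, then a one-letter state).

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state "D"."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the file was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

A process can be in "D" state briefly during normal I/O, so a single hit is not proof of this bug. Processes that still show up after repeated runs, well after the containers were killed, are the ones of interest.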

Expected behavior:
- All containers should be terminated without any dangling process

Actual behavior:
- Some processes are left in "D" state. In our production environment this leads over time to performance degradation and maintenance issues due to containers that cannot be stopped / removed.

A fix is available in kernel 4.12. It would be nice if it could be backported and included in the next Ubuntu release with a supported kernel.

Thanks in advance
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 29 16:54 seq
 crw-rw---- 1 root audio 116, 33 May 29 16:54 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse:
 Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Cannot stat file /proc/11652/fd/4: Stale file handle
 Cannot stat file /proc/11652/fd/5: Stale file handle
 Cannot stat file /proc/11652/fd/6: Stale file handle
 Cannot stat file /proc/11652/fd/7: Stale file handle
 Cannot stat file /proc/11652/fd/11: Stale file handle
DistroRelease: Ubuntu 16.04
Ec2AMI: ami-45b69e52
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: t2.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: Xen HVM domU
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 cirrusdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-78-generic root=UUID=9b05a884-ac72-4bd2-8660-3bfa5cb22246 ro net.ifnames=0 biosdevname=0 cgroup_enable=memory swapaccount=1 console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 4.4.0-78.99-generic 4.4.62
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-78-generic N/A
 linux-backports-modules-4.4.0-78-generic N/A
 linux-firmware 1.157.10
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial ec2-images
Uname: Linux 4.4.0-78-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 02/16/2017
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd02/16/2017:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1698264

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected ec2-images xenial
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Seth Forshee (sforshee) on 2017-06-16
Changed in linux (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Seth Forshee (sforshee)
Seth Forshee (sforshee) wrote :

I've tried reproducing this but haven't had luck. Something about my test environment must make it difficult to hit the issue.

As I indicated on github, the xenial kernel currently in -proposed (4.4.0-80.101) contains the patch which you indicated fixes the issue. Could you test this kernel to confirm it is fixed there?

https://wiki.ubuntu.com/Testing/EnableProposed

I'll work on getting the fix into the yakkety/zesty kernels as well. Thanks!

Changed in linux (Ubuntu):
status: Confirmed → Incomplete

Seth, I was able to try the kernel you provided (4.4.0-1019-aws) and it worked fine.

Do you believe we will be able to have it in a supported Ubuntu kernel on release 16.04.3?

Thank you so much!

Seth Forshee (sforshee) wrote :

Yes, it should definitely be fixed before we release 16.04.3.

Changed in linux (Ubuntu Xenial):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Medium
status: New → Fix Committed
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu Zesty):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu):
status: Incomplete → Fix Committed
Seth Forshee (sforshee) on 2017-06-19
description: updated
Stefan Bader (smb) on 2017-06-21
Changed in linux (Ubuntu Yakkety):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
tags: added: verification-needed-zesty

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Marked Xenial series as 'Fix Released' since the fix, commit b9a985db98961ae1ba0be169f19df1c567e4ffe0 upstream (pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes), has been included as part of update to 4.4.70 stable release (bug #1694621).

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released

Given the nature of the issue, which is difficult to reproduce, and the fact that the fix is upstream, I'm changing the tags to verification-done-{yakkety,zesty} so we don't need to verify it on these two series.

tags: added: verification-done-yakkety verification-done-zesty
removed: verification-needed-yakkety verification-needed-zesty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-59.64

---------------
linux (4.8.0-59.64) yakkety; urgency=low

  * linux: 4.8.0-59.64 -proposed tracker (LP: #1701019)

  * KILLER1435-S[0489:e0a2] BT cannot search BT 4.0 device (LP: #1699651)
    - Bluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device

  * CVE-2017-7895
    - nfsd4: minor NFSv2/v3 write decoding cleanup
    - nfsd: stricter decoding of write-like NFSv2/v3 ops

  * CVE-2017-5551
    - tmpfs: clear S_ISGID when setting posix ACLs

  * CVE-2017-9605
    - drm/vmwgfx: Make sure backup_handle is always valid

  * CVE-2017-1000380
    - ALSA: timer: Fix race between read and ioctl
    - ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

  * CVE-2017-9150
    - bpf: don't let ldimm64 leak map addresses on unprivileged

  * CVE-2017-5576
    - drm/vc4: Fix an integer overflow in temporary allocation layout.

  * Processes in "D" state due to zap_pid_ns_processes kernel call with Ubuntu +
    Docker (LP: #1698264)
    - pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes

  * CVE-2016-9755
    - netfilter: ipv6: nf_defrag: drop mangled skb on ream error

  * CVE-2017-7346
    - drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()

  * CVE-2017-8924
    - USB: serial: io_ti: fix information leak in completion handler

  * CVE-2017-8925
    - USB: serial: omninet: fix reference leaks at open

  * CVE-2017-9074
    - ipv6: Check ip6_find_1stfragopt() return value properly.

  * CVE-2014-9900
    - net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol()

  * OpenPower: Some multipaths temporarily have only a single path
    (LP: #1696445)
    - scsi: ses: don't get power status of SES device slot on probe

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 29 Jun 2017 14:34:32 -0300

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-28.32

---------------
linux (4.10.0-28.32) zesty; urgency=low

  * linux: 4.10.0-28.32 -proposed tracker (LP: #1701013)

  * KILLER1435-S[0489:e0a2] BT cannot search BT 4.0 device (LP: #1699651)
    - Bluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device

  * aacraid driver may return uninitialized stack data to userspace
    (LP: #1700077)
    - SAUCE: scsi: aacraid: Don't copy uninitialized stack memory to userspace

  * CVE-2017-9605
    - drm/vmwgfx: Make sure backup_handle is always valid

  * CVE-2017-1000380
    - ALSA: timer: Fix race between read and ioctl
    - ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

  * XDP eBPF programs fail to verify on Zesty ppc64el (LP: #1699627)
    - [Config] ppc64el: build for Power8 not Power7

  * AACRAID for power9 platform (LP: #1689980)
    - scripts/spelling.txt: add "therfore" pattern and fix typo instances
    - scsi: aacraid: fix PCI error recovery path
    - scsi: aacraid: pci_alloc_consistent() failures on ARM64
    - scsi: aacraid: Remove __GFP_DMA for raw srb memory
    - scsi: aacraid: Fix DMAR issues with iommu=pt
    - scsi: aacraid: Added 32 and 64 queue depth for arc natives
    - scsi: aacraid: Set correct Queue Depth for HBA1000 RAW disks
    - scsi: aacraid: Remove reset support from check_health
    - scsi: aacraid: Change wait time for fib completion
    - scsi: aacraid: Log count info of scsi cmds before reset
    - scsi: aacraid: Print ctrl status before eh reset
    - scsi: aacraid: Using single reset mask for IOP reset
    - scsi: aacraid: Rework IOP reset
    - scsi: aacraid: Add periodic checks to see IOP reset status
    - scsi: aacraid: Rework SOFT reset code
    - scsi: aacraid: Rework aac_src_restart
    - scsi: aacraid: Use correct function to get ctrl health
    - scsi: aacraid: Make sure ioctl returns on controller reset
    - scsi: aacraid: Enable ctrl reset for both hba and arc
    - scsi: aacraid: Add reset debugging statements
    - scsi: aacraid: Remove reference to Series-9
    - scsi: aacraid: Update driver version to 50834

  * arm64 kernel crashdump support (LP: #1694859)
    - memblock: add memblock_clear_nomap()
    - memblock: add memblock_cap_memory_range()
    - arm64: limit memory regions based on DT property, usable-memory-range
    - arm64: kdump: reserve memory for crash dump kernel
    - arm64: mm: add set_memory_valid()
    - arm64: mm: use phys_addr_t instead of unsigned long in __map_memblock
    - arm64: kdump: protect crash dump kernel memory
    - arm64: hibernate: preserve kdump image around hibernation
    - arm64: kdump: implement machine_crash_shutdown()
    - arm64: kdump: add VMCOREINFO's for user-space tools
    - [Config] CONFIG_CRASH_DUMP=y on arm64
    - arm64: kdump: provide /proc/vmcore file
    - Documentation: kdump: describe arm64 port
    - Documentation: dt: chosen properties for arm64 kdump
    - efi/libstub/arm*: Set default address and size cells values for an empty dtb

  * hibmc driver does not include "pci:" prefix in bus ID (LP: #1698700)
    - SAUCE: drm: hibmc: Use set_busid function from drm core

  * Processes in "D" state due to za...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
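
The builds named in this report (4.4.0-80.101 for Xenial, 4.8.0-59.64 for Yakkety, 4.10.0-28.32 for Zesty) can be compared against the output of uname -r. This is a rough sketch under stated assumptions: the helper name at_least_reported_fix is hypothetical, it only understands "generic" kernels from the main linux package, and it returns None for anything else (for example 4.4.0-1019-aws, which uses different ABI numbering).

```python
import re

# Builds that this report names as containing the fix, keyed by base version.
# The value is the ABI number (the part after the first dash).
REPORTED_FIXED_ABI = {
    (4, 4, 0): 80,   # xenial, 4.4.0-80.101
    (4, 8, 0): 59,   # yakkety, 4.8.0-59.64
    (4, 10, 0): 28,  # zesty, 4.10.0-28.32
}


def at_least_reported_fix(release):
    """Compare a `uname -r` string with the builds named in this report.

    Returns True if the ABI number is at or above the reported build,
    False if it is below, and None if the string cannot be judged
    (unknown series, other flavour, or unexpected format).
    """
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-([a-z0-9]+)", release)
    if m is None:
        return None
    base = tuple(int(x) for x in m.group(1, 2, 3))
    abi = int(m.group(4))
    flavour = m.group(5)
    if flavour != "generic" or base not in REPORTED_FIXED_ABI:
        return None  # assumption: only the generic flavour is covered here
    return abi >= REPORTED_FIXED_ABI[base]


if __name__ == "__main__":
    import platform
    print(at_least_reported_fix(platform.release()))
```

A False result only means the kernel is older than the builds mentioned above; it does not by itself show that the patch is missing.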