nf_conntrack releases a conntrack with non-zero refcnt

Bug #1466135 reported by Chris J Arges
This bug affects 2 people
Affects          Status         Importance   Assigned to     Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
Trusty           Fix Released   Medium       Chris J Arges

Bug Description

[Impact]
Occasionally, starting new containers or creating new net namespaces may cause a soft lockup because of improper refcounting of conntrack entries.

In the issue that I face, I can find a kworker thread using up an entire core, and when I cat /proc/$pid/stack I see this:

[<ffffffffbe01e9b6>] ___preempt_schedule+0x56/0xb0
[<ffffffffc02223e4>] nf_ct_iterate_cleanup+0x134/0x160 [nf_conntrack]
[<ffffffffc0223dae>] nf_conntrack_cleanup_net_list+0x4e/0x170 [nf_conntrack]
[<ffffffffc022436d>] nf_conntrack_pernet_exit+0x4d/0x60 [nf_conntrack]
[<ffffffffbe6040d3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffffbe6048d0>] cleanup_net+0x100/0x1d0
[<ffffffffbe084991>] process_one_work+0x171/0x470
[<ffffffffbe08563b>] worker_thread+0x11b/0x3a0
[<ffffffffbe08bb82>] kthread+0xd2/0xf0
[<ffffffffbe71757c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff

The kworker is looping forever and failing to clean up conntrack state. All the while, it holds the global netns lock. Given that I've bisected to commit e53376bef2cd97d3e3f61fdc677fb8da7d03d0da, which deals with refcounting, I suspect that broken refcounting on conntrack entries makes them impossible to free or destroy properly. That prevents this worker from cleaning up the namespace, which in turn prevents anything else from interacting with namespaces (add/delete/etc).
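The diagnosis described above (spotting a kworker that is consuming a core, then reading its kernel stack) can be sketched in shell. This is a minimal sketch, not the reporter's procedure: the function names `busy_kworkers` and `dump_kworker_stacks` and the 90% threshold are made up for illustration. Note that `ps` reports CPU usage averaged over the thread's lifetime, so a long-lived kworker that only recently started spinning may fall below the threshold; `top` may be a better first look in that case.

```shell
# busy_kworkers [THRESHOLD]
# Reads "PID %CPU COMMAND" rows on stdin (for example from
# `ps -eo pid=,pcpu=,comm=`) and prints the PID of every kworker
# thread whose CPU usage is at or above THRESHOLD percent.
# THRESHOLD is a hypothetical cut-off that defaults to 90.
busy_kworkers() {
    local threshold=${1:-90}
    awk -v t="$threshold" '$3 ~ /^kworker/ && $2 + 0 >= t + 0 { print $1 }'
}

# dump_kworker_stacks [THRESHOLD]
# Prints the kernel stack of each busy kworker found by busy_kworkers.
# Reading /proc/PID/stack normally requires root.
dump_kworker_stacks() {
    local pid
    for pid in $(ps -eo pid=,pcpu=,comm= | busy_kworkers "${1:-90}"); do
        echo "== kworker PID $pid =="
        cat "/proc/$pid/stack"
    done
}
```

Running `dump_kworker_stacks` as root on an affected host would be expected to show a trace through `nf_ct_iterate_cleanup` like the one above.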

[Test Case]
Bug 1403152 has a test case that can occasionally hit this issue.

[Fix]
$ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
v3.14-rc3~36^2~28^2~12


Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
importance: Medium → Undecided
Chris J Arges (arges)
description: updated
Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

SRU Patch sent to Ubuntu kernel-team ML.

Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Joe Stringer (joestringer) wrote :

Apologies for the delay on this, I've been travelling. Thanks Chris & others for following up. We've locally cherry-picked this patch and confirmed it fixes our issue, happy to test again with an official deb if someone can point me at that.

Local reproduction instructions:
- Install Ubuntu 14.04.[01] (kernel 3.13.0-40-generic)
- Get docker image that includes OVS dependencies
- Build openvswitch from https://github.com/justinpettit/ovs/tree/conntrack
- Instructions to build here: https://github.com/justinpettit/ovs/blob/conntrack/INSTALL.Debian.md
- Install and load openvswitch module on host. (dpkg -i *.deb, modprobe openvswitch)

In one shell:
# ip addr add dev docker0 192.168.0.2/24; ping 192.168.0.1
(leave running)

In another shell (assuming $PWD contains the openvswitch debs and the repro script shown below):
$ docker run -i -t --entrypoint=bash --privileged=true -v $PWD:/host <docker image with OVS deps>
$ cd /host; ./repro.sh 192.168.0.1
(wait until first shell shows that pings are flowing)
$ ovs-ofctl dump-flows br0
(should show two flows, each of which is getting traffic. One has actions=ct(commit,recirc))
$ conntrack -L
(Optional; can see the ICMP connection listed)

Now:
- Press Ctrl+D to exit the container. It is a little slow to exit.
- Subsequent container starts or "ip netns add foo" will hang.
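To check for the hang without tying up a terminal, the final step can be wrapped in a small helper. This is a sketch under stated assumptions: `hangs_after` is a hypothetical name, and the command is deliberately left running in the background rather than killed, on the assumption that a process blocked inside the kernel may not respond to signals.

```shell
# hangs_after SECONDS COMMAND [ARGS...]
# Starts COMMAND in the background, waits SECONDS, then reports whether
# it has finished. Returns 0 if COMMAND is still running (it appears
# hung), 1 if it completed in time, 2 if the marker file could not be made.
hangs_after() {
    local limit=$1
    shift
    local marker
    marker=$(mktemp) || return 2
    # The subshell writes to the marker file only once COMMAND has returned.
    ( "$@" >/dev/null 2>&1; echo done >"$marker" ) &
    sleep "$limit"
    if [ -s "$marker" ]; then
        rm -f "$marker"
        return 1
    fi
    rm -f "$marker"
    return 0
}
```

For example, `hangs_after 10 ip netns add foo && echo "netns creation appears stuck"` run as root. On an unaffected host the namespace `foo` is actually created and should be removed afterwards with `ip netns del foo`.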

$ cat repro.sh
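#!/bin/bash
# Usage: ./repro.sh <IP address to assign to br0>

IP=$1

# Install the locally built Open vSwitch packages mounted at /host.
cd /host || exit 1
dpkg -i openvswitch-common*deb openvswitch-switch*deb

# Restart OVS and move the container's eth0 onto a new bridge.
service openvswitch-switch restart
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0

# Bring the bridge up and give it the requested address.
ip link set dev br0 up
ip addr add dev br0 "$IP/24"
ip addr

# Send untracked IP traffic through conntrack and commit the connection.
ovs-ofctl add-flow br0 "conn_state=-trk,ip actions=ct(commit,recirc)"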

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Joe Stringer (joestringer) wrote :

I confirmed that the bug is no longer present when using linux-image-3.13.0-58-generic_3.13.0-58.97_amd64 from trusty-proposed, following the reproduction instructions from comment #2.

Thanks!
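For anyone checking other hosts, a kernel package version can be compared against 3.13.0-58.97 (the version tested above) with a short helper. This is only a sketch: `version_ge` is a hypothetical name, and it relies on GNU `sort -V` for version ordering rather than on dpkg's own comparison rules, which can differ for unusual version strings.

```shell
# version_ge A B
# Succeeds when version string A sorts at or after version string B
# under GNU sort's version ordering (sort -V).
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}
```

Usage would look like `version_ge "$installed" 3.13.0-58.97 && echo "fix should be present"`, where `$installed` holds the package version of the running Trusty kernel. How that value is obtained is left open here; on Ubuntu it can probably be read from the package database or from /proc/version_signature, but that is an assumption about the host.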

tags: added: verification-done-trusty
removed: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-58.97

---------------
linux (3.13.0-58.97) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1472453

  [ Upstream Kernel Changes ]

  * vm: Fix incomplete backport of VM_FAULT_SIGSEGV handling support
    - LP: #1471892

linux (3.13.0-58.96) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1471991

  [ Iyappan Subramanian ]

  * SAUCE: (no-up): drivers: net: xgene: fix: Out of order descriptor bytes
    read
    - LP: #1425576

  [ Upstream Kernel Changes ]

  * NVMe: Add shutdown timeout as module parameter.
    - LP: #1465136
  * Drivers: hv: vmbus: Add support for VMBus panic notifier handler
    - LP: #1463584
  * Drivers: hv: vmbus: Correcting truncation error for constant
    HV_CRASH_CTL_CRASH_NOTIFY
    - LP: #1463584
  * netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
    - LP: #1466135
  * lpfc: Add iotag memory barrier
    - LP: #1468416
  * mm/slab_common: support the slub_debug boot option on specific object
    size
    - LP: #1456952
  * pipe: iovec: Fix memory corruption when retrying atomic copy as
    non-atomic
    - CVE-2015-1805
  * kvm: x86: fix kvm_apic_has_events to check for NULL pointer
  * staging, rtl8192e, LLVMLinux: Change extern inline to static inline
    - LP: #1471233
  * kernel: use the gnu89 standard explicitly
    - LP: #1471233
  * staging, rtl8192e, LLVMLinux: Remove unused inline prototype
    - LP: #1471233
  * staging: rtl8712, rtl8712: avoid lots of build warnings
    - LP: #1471233
  * qla2xxx: remove redundant declaration in 'qla_gbl.h'
    - LP: #1471233
  * staging: wlags49_h2: fix extern inline functions
    - LP: #1471233
  * ARM: 8307/1: psci: move psci firmware calls out of line
    - LP: #1471233
  * kconfig: Fix warning "‘jump’ may be used uninitialized"
    - LP: #1471233
  * scripts/sortextable: suppress warning: `relocs_size' may be used
    uninitialized
    - LP: #1471233
  * ASoC: dapm: Enable autodisable on SOC_DAPM_SINGLE_TLV_AUTODISABLE
    - LP: #1471233
  * ALSA: hda - Fix mute-LED fixed mode
    - LP: #1471233
  * ALSA: emu10k1: Fix card shortname string buffer overflow
    - LP: #1471233
  * ALSA: emux: Fix mutex deadlock at unloading
    - LP: #1471233
  * drm/radeon: add SI DPM quirk for Sapphire R9 270 Dual-X 2G GDDR5
    - LP: #1471233
  * SCSI: add 1024 max sectors black list flag
    - LP: #1471233
  * 3w-sas: fix command completion race
    - LP: #1471233
  * 3w-xxxx: fix command completion race
    - LP: #1471233
  * 3w-9xxx: fix command completion race
    - LP: #1471233
  * serial: xilinx: Use platform_get_irq to get irq description structure
    - LP: #1471233
  * serial: of-serial: Remove device_type = "serial" registration
    - LP: #1471233
  * tty/serial: at91: maxburst was missing for dma transfers
    - LP: #1471233
  * ALSA: emux: Fix mutex deadlock in OSS emulation
    - LP: #1471233
  * ALSA: emu10k1: Emu10k2 32 bit DMA mode
    - LP: #1471233
  * rbd: end I/O the entire obj_request on error
    - LP: #1471233
  * powerpc/pseries: Correct cpu affinity for dlpar added cpus
    - LP: #1471233
  * bridge/mdb: remove wrong use of NLM_F_MULT...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released