Bug #1441363 “nf_conntrack should be unloaded on swift object server” : Bugs : OpenStack-Ansible

Bjoern (bjoern-t) wrote on 2015-04-07:

#1

This also implies that we either turn off lxc-net (which adds iptables rules for nonexistent containers) or remove lxc altogether.

Evan Callicoat (diopter) wrote on 2015-04-14:

#2

I strongly disagree with the first part of this recommendation. Disabling conntrack in modern Linux is essentially an old hack devised for old Linux to improve performance on extremely busy servers which are dedicated to a single, dumb task from a networking perspective. It's not necessary and definitely not the Right™ way to solve the problem.

I also dislike the idea of removing conntrack/lxc* as it reduces the deployment and expansion flexibility of an object server, which is definitely not aligned with the goals of this project architecture.

You have a few simple options here:

A) If you have lots of TIME_WAIT (TW) connections lingering and you're hitting the nf_conntrack_max limit or running out of available sockets, set the tcp_tw_reuse sysctl (as suggested in the second part of this recommendation) to allow the kernel to reclaim sockets in TW when it's safe to do so per the protocol. I should note that this is *only* safe to do when conntrack *is* enabled, because a reused TW port-pair can still receive late traffic, and conntrack is the mechanism which recognizes that traffic as INVALID and drops it appropriately. Reusing TWs without conntrack on can eventually cause actual data crossover from old, closing connections into new sockets.

B) Raise the conntrack limit (nf_conntrack_max).

C) Set the tcp_tw_recycle sysctl, if you *really* need aggressive connection closing. This can confuse or utterly break the remote ends of connections, so it's not recommended unless these boxes are so incredibly busy they're utterly buckling under the speed of connections/sec, in which case this still really isn't the right solution -- the right solution is to load-balance and scale the architecture appropriately.

My recommendation is A, plus B if and only if there are enough valid, active connections that even reusing TW sockets still hits the conntrack limit. I strongly advise against C in almost any circumstance, and certainly in this one.
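
As a rough illustration of how options A and B relate, here is a minimal Python sketch (not part of the original comment). It assumes the standard Linux /proc/sys layout; the helper names read_sysctl and recommend, and the 90% "nearly full" threshold, are hypothetical choices made for the example.

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")  # standard location of the sysctl tree on Linux


def read_sysctl(name, root=PROC_SYS):
    """Read an integer sysctl such as 'net.ipv4.tcp_tw_reuse'.

    Returns None when the key cannot be read, e.g. the net.netfilter.*
    keys when nf_conntrack is not loaded.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return int(path.read_text().split()[0])
    except (OSError, IndexError, ValueError):
        return None


def recommend(count, limit, tw_reuse, threshold=0.9):
    """Pick option A or B from the list above; option C is never suggested.

    threshold is a hypothetical ratio for "the conntrack table is nearly full".
    """
    if count is None or limit is None:
        return "conntrack not loaded: load nf_conntrack before setting tcp_tw_reuse"
    if count < threshold * limit:
        return "no change needed"
    if not tw_reuse:
        return "A: set net.ipv4.tcp_tw_reuse=1"
    return "B: raise net.netfilter.nf_conntrack_max"


if __name__ == "__main__":
    print(recommend(
        read_sysctl("net.netfilter.nf_conntrack_count"),
        read_sysctl("net.netfilter.nf_conntrack_max"),
        read_sysctl("net.ipv4.tcp_tw_reuse"),
    ))
```

The ordering mirrors the recommendation: tcp_tw_reuse is tried first, and the limit is raised only when the table is still nearly full with reuse already on.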

Bjoern (bjoern-t) wrote on 2015-04-14:

#3

Thanks for commenting. As for TW connections, I'm pretty confident that we don't have to fear out-of-order packets on a local network. I'm sure we can fix this issue temporarily by raising the conntrack limit, but if the limit is already hit in an idle environment, I fear we will hit it even more often once the environment is in use. Speaking of the environment, we have running RPC10 swift environments which do not have conntrack enabled, so I find it really disturbing that we suddenly have one environment running with connection tracking. It seems to have been enabled accidentally by installing LXC; at least that was the difference compared to other environments, since LXC includes lxc-net, which needs iptables.
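
One way to compare environments on this point is to check whether nf_conntrack is loaded and which modules hold it. The following is a minimal Python sketch under the assumption that the host exposes the usual /proc/modules file; the function name loaded_modules is hypothetical and chosen for the example.

```python
def loaded_modules(proc_modules_text):
    """Parse /proc/modules content into {module: [modules using it]}.

    Each line has the form:
        name size refcount used_by state address
    where used_by is "-" or a comma-terminated list of module names.
    """
    modules = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, used_by = fields[0], fields[3]
        if used_by == "-":
            modules[name] = []
        else:
            modules[name] = [m for m in used_by.split(",") if m]
    return modules
```

On a live host this could be called as loaded_modules(open("/proc/modules").read()).get("nf_conntrack"): a result of None would mean the module is not loaded, while a list names the modules that depend on it.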

OpenStack Infra (hudson-openstack) wrote on 2015-04-24: Fix proposed to os-ansible-deployment (juno)

#4

Fix proposed to branch: juno
Review: https://review.openstack.org/177163

OpenStack Infra (hudson-openstack) wrote on 2015-04-24: Fix proposed to os-ansible-deployment (kilo)

#5

Fix proposed to branch: kilo
Review: https://review.openstack.org/177172

Andy McCrae (andrew-mccrae) wrote on 2015-04-24:

#6

I've added PRs following Evan's suggestions - reviews welcome!

Darren Birkett (darren-birkett) wrote on 2015-04-24:

#7

Fix proposed to branch: master
review: https://review.openstack.org/#/c/177160

OpenStack Infra (hudson-openstack) wrote on 2015-04-24: Fix merged to os-ansible-deployment (juno)

#8

Reviewed: https://review.openstack.org/177163
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=c9f9a0623fb22fd17e573f4c8e8b139a98aa260c
Submitter: Jenkins
Branch: juno

commit c9f9a0623fb22fd17e573f4c8e8b139a98aa260c
Author: Andy McCrae <email address hidden>
Date: Fri Apr 24 11:45:07 2015 +0100

    Set tcp_tw_reuse for swift storage hosts

    For swift storage hosts we are seeing a lot of connections in TIME WAIT
    status, violating nf_conntrack. Setting tcp_tw_reuse should help
    alleviate this.

    Additionally, in order for tcp_tw_reuse to be set safely we need to
    ensure nf_conntrack is loaded.

    Change-Id: I4392c4022a9a5a884d07eb6fbf27093f0b16f914
    Closes-Bug: #1441363
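
The ordering described in the commit message (load nf_conntrack first, then enable tcp_tw_reuse) can be sketched in Python as follows. This is only an illustration and not the content of the actual change, which lives in the linked review; the function name plan_fix is hypothetical, while modprobe and sysctl -w are the standard Linux commands.

```python
def plan_fix(loaded, tw_reuse):
    """Return the commands (as argv lists) still needed, module first.

    loaded   -- set of currently loaded kernel module names
    tw_reuse -- current value of net.ipv4.tcp_tw_reuse (int)
    """
    commands = []
    # tcp_tw_reuse is only considered safe here with conntrack present,
    # so the module load always comes before the sysctl change.
    if "nf_conntrack" not in loaded:
        commands.append(["modprobe", "nf_conntrack"])
    if tw_reuse != 1:
        commands.append(["sysctl", "-w", "net.ipv4.tcp_tw_reuse=1"])
    return commands
```

Returning argv lists rather than running the commands keeps the sketch free of side effects; a caller could pass each list to subprocess.run as root if it chose to apply them.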

OpenStack Infra (hudson-openstack) wrote on 2015-04-24: Fix merged to os-ansible-deployment (master)

#9

Reviewed: https://review.openstack.org/177160
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=b206e434a350941ed329ba4e84940afc722f3557
Submitter: Jenkins
Branch: master

commit b206e434a350941ed329ba4e84940afc722f3557
Author: Andy McCrae <email address hidden>
Date: Fri Apr 24 11:34:02 2015 +0100

    Set tcp_tw_reuse for swift storage hosts

    For swift storage hosts we are seeing a lot of connections in TIME WAIT
    status, violating nf_conntrack. Setting tcp_tw_reuse should help
    alleviate this.

    Additionally, in order for tcp_tw_reuse to be set safely we need to
    ensure nf_conntrack is loaded.

    Change-Id: I4392c4022a9a5a884d07eb6fbf27093f0b16f914
    Closes-Bug: #1441363

Changed in openstack-ansible:
status:	In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote on 2015-04-24: Fix merged to os-ansible-deployment (kilo)

#10

Reviewed: https://review.openstack.org/177172
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=66cbe687e6fd8a4b9ded9945f9a08395e7932a43
Submitter: Jenkins
Branch: kilo

commit 66cbe687e6fd8a4b9ded9945f9a08395e7932a43
Author: Andy McCrae <email address hidden>
Date: Fri Apr 24 11:34:02 2015 +0100

    Set tcp_tw_reuse for swift storage hosts

    For swift storage hosts we are seeing a lot of connections in TIME WAIT
    status, violating nf_conntrack. Setting tcp_tw_reuse should help
    alleviate this.

    Additionally, in order for tcp_tw_reuse to be set safely we need to
    ensure nf_conntrack is loaded.

    Change-Id: I4392c4022a9a5a884d07eb6fbf27093f0b16f914
    Closes-Bug: #1441363
    (cherry picked from commit b4c09dbd6e4d7d60c4f99469a82d093900ab8aa2)

Davanum Srinivas (DIMS) (dims-v) wrote on 2016-03-18: Fix included in openstack/openstack-ansible 11.2.11

#11

This issue was fixed in the openstack/openstack-ansible 11.2.11 release.

Doug Hellmann (doug-hellmann) wrote on 2016-03-25: Fix included in openstack/openstack-ansible 11.2.12

#12

This issue was fixed in the openstack/openstack-ansible 11.2.12 release.

Davanum Srinivas (DIMS) (dims-v) wrote on 2016-05-03: Fix included in openstack/openstack-ansible 11.2.14

#13

This issue was fixed in the openstack/openstack-ansible 11.2.14 release.

Affects	Status	Importance	Assigned to	Milestone
OpenStack-Ansible	Fix Released	Medium	Andy McCrae
Juno	Fix Released	Medium	Andy McCrae	OpenStack-Ansible 10.1.4
Kilo	Fix Released	Medium	Andy McCrae	OpenStack-Ansible 11.0.0
Trunk	Fix Released	Medium	Andy McCrae
