Large number of TIME_WAIT connections from haproxy cause stalled API service requests

Bug #1413104 reported by Dmitry Borodaenko
This bug affects 2 people
Affects              Status          Importance   Assigned to             Milestone
Fuel for OpenStack   Fix Released    Critical     Bartłomiej Piotrowski
5.0.x                Fix Committed   Critical     Bartłomiej Piotrowski
5.1.x                Fix Committed   Critical     Bartłomiej Piotrowski
6.0.x                Fix Committed   Critical     Bartłomiej Piotrowski
6.1.x                Fix Released    Critical     Bartłomiej Piotrowski

Bug Description

Under load, haproxy accumulates a large number of connections to backend services that are stuck in the TIME_WAIT state. Eventually, this leads to stalled requests to OpenStack API services.
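One way to observe the symptom on a controller is to count sockets per TCP state. The following is a minimal sketch, assuming a Linux host where /proc/net/tcp is readable; the function name count_tcp_states is illustrative and not part of Fuel or haproxy.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp on Linux.
# Only a few common states are named here; others keep their hex code.
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state, given the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        state = fields[3].upper()
        counts[TCP_STATES.get(state, state)] += 1
    return counts
```

Passing the contents of /proc/net/tcp to count_tcp_states and reading the "TIME_WAIT" entry before and during load gives a rough picture of how quickly such sockets pile up. This covers IPv4 only; /proc/net/tcp6 has the same column layout.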

Setting the http-server-close option:
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4-option%20http-server-close

makes the problem disappear.
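To confirm where the option is set, a small script can scan a haproxy configuration for "option http-server-close". This is a rough sketch, not a full haproxy parser: it only recognizes the most common section keywords, and the sample configuration (section name, addresses, server name) is hypothetical rather than taken from a Fuel deployment.

```python
# Simplified: haproxy has further section types that are not handled here.
SECTION_KEYWORDS = ("global", "defaults", "frontend", "backend", "listen")


def sections_with_option(config_text, option="http-server-close"):
    """Return headers of config sections that contain `option <option>`."""
    found = []
    current = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            current = line  # remember the section we are in
        elif words[:2] == ["option", option] and current is not None:
            if current not in found:
                found.append(current)
    return found


# Hypothetical configuration used only for illustration.
SAMPLE_CONFIG = """
defaults
  mode http
  option http-server-close

listen nova-api
  bind 192.0.2.10:8774
  server node-1 192.0.2.11:8774 check
"""
```

With the sample above, sections_with_option(SAMPLE_CONFIG) reports only the defaults section; whether the option belongs in defaults or in individual sections depends on the deployment.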

Revision history for this message
Dmitriy Novakovskiy (dnovakovskiy) wrote :

 Should the http-server-close option be propagated to customers experiencing the issue? I know of one for sure.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The http-server-close is enabled by default according to the docs. Are you sure enabling it explicitly would help?

Changed in fuel:
status: New → Triaged
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

My mistake, this option appears to be disabled by default, and we may indeed want it enabled in both the frontend and the backend.

tags: added: low-hanging-fruit
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/149963

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bartlomiej Piotrowski (bpiotrowski)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/149963
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=6c6bfc35f779774141274579b948280518f57e2d
Submitter: Jenkins
Branch: master

commit 6c6bfc35f779774141274579b948280518f57e2d
Author: Bartłomiej Piotrowski <email address hidden>
Date: Mon Jan 26 11:06:08 2015 +0100

    Use haproxy's http-server-close by default

    At load, haproxy begins accumulating a large number of connections to
    backend services stuck in TIME_WAIT state. Eventually, this leads to
    stalled requests to OpenStack API services. http-server-close solves
    that problem.

    Change-Id: Ie405fe2d384d9d8375f60ad2c3c9885f41fe6d49
    Closes-Bug: 1413104

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Mike Scherbakov (mihgen) wrote :

Don't we want to backport this to older branches?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.0)

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/152059

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.1)

Fix proposed to branch: stable/5.1
Review: https://review.openstack.org/152060

Revision history for this message
Bartłomiej Piotrowski (bpiotrowski) wrote :

We do, I completely forgot about it.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.0)

Reviewed: https://review.openstack.org/152059
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=f41f594850fb8ced89a2a29649cd478492d2b29b
Submitter: Jenkins
Branch: stable/6.0

commit f41f594850fb8ced89a2a29649cd478492d2b29b
Author: Bartłomiej Piotrowski <email address hidden>
Date: Mon Jan 26 11:06:08 2015 +0100

    Use haproxy's http-server-close by default

    At load, haproxy begins accumulating a large number of connections to
    backend services stuck in TIME_WAIT state. Eventually, this leads to
    stalled requests to OpenStack API services. http-server-close solves
    that problem.

    Change-Id: Ie405fe2d384d9d8375f60ad2c3c9885f41fe6d49
    Closes-Bug: 1413104
    (cherry picked from commit 6c6bfc35f779774141274579b948280518f57e2d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/152060
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=a8417ef811919dbe7abee90eeb67f4ca7ede2935
Submitter: Jenkins
Branch: stable/5.1

commit a8417ef811919dbe7abee90eeb67f4ca7ede2935
Author: Bartłomiej Piotrowski <email address hidden>
Date: Mon Jan 26 11:06:08 2015 +0100

    Use haproxy's http-server-close by default

    At load, haproxy begins accumulating a large number of connections to
    backend services stuck in TIME_WAIT state. Eventually, this leads to
    stalled requests to OpenStack API services. http-server-close solves
    that problem.

    Change-Id: Ie405fe2d384d9d8375f60ad2c3c9885f41fe6d49
    Closes-Bug: 1413104
    (cherry picked from commit 6c6bfc35f779774141274579b948280518f57e2d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.0)

Fix proposed to branch: stable/5.0
Review: https://review.openstack.org/156483

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.0)

Reviewed: https://review.openstack.org/156483
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=cbaae5b5fd0a6c7edc562f6dcc9d48ca568bb600
Submitter: Jenkins
Branch: stable/5.0

commit cbaae5b5fd0a6c7edc562f6dcc9d48ca568bb600
Author: Bartłomiej Piotrowski <email address hidden>
Date: Mon Jan 26 11:06:08 2015 +0100

    Use haproxy's http-server-close by default

    At load, haproxy begins accumulating a large number of connections to
    backend services stuck in TIME_WAIT state. Eventually, this leads to
    stalled requests to OpenStack API services. http-server-close solves
    that problem.

    Change-Id: Ie405fe2d384d9d8375f60ad2c3c9885f41fe6d49
    Closes-Bug: 1413104
    (cherry picked from commit 6c6bfc35f779774141274579b948280518f57e2d)
    (cherry picked from commit a8417ef811919dbe7abee90eeb67f4ca7ede2935)

Revision history for this message
Dina Belova (dbelova) wrote :

Still needs to be verified on the scale lab.

tags: added: on-verification
Revision history for this message
Sergey Novikov (snovikov) wrote :

Verified on fuel-6.1-445-2015-05-20_22-10-04.iso.

Steps to verify:
    1. Deploy cluster (HA, Neutron+VLAN, 3 controllers + 2 compute)
    2. Run OSTF tests
    3. Check the haproxy config file: option 'http-server-close' is set
    4. Simulate load on haproxy (a large number of requests to an OpenStack service in multiple threads)
    5. Check haproxy status (status is shown on <public_vip_ip>:10000 by default)

tags: removed: on-verification