FFE: HAproxy dropping connections (RST) during config reload / support seamless reload

Bug #1712925 reported by Dave Chiluk on 2017-08-24
This bug affects 1 person
Affects: haproxy (Ubuntu)
Importance: Undecided
Assigned to: Dave Chiluk

Bug Description

Reloading haproxy causes TCP resets for active connections. This can be a serious issue for clouds that rely on haproxy for load balancing and, as a result, reload it frequently.

Full related blog post is here.
https://www.haproxy.com/blog/truly-seamless-reloads-with-haproxy-no-more-hacks/

FFE Justification
- Description - The patchset fixes the issue by adding the -x option to haproxy. This option tells the new haproxy which unix stats socket the old haproxy is listening on. The old haproxy then passes its listening sockets to the new haproxy over this socket (simplified explanation). The changes are largely isolated to new functions that implement this functionality.

- Rationale - The change is largely isolated to the new option, but for those running clouds it could be very important. Clouds that are "doing it right" and treating instances as cattle are constantly tearing down and rebuilding instances, which has the side effect of constantly reloading haproxy. For example, at Indeed, haproxy gets restarted roughly every second on a few of our clouds. My tests show a connection reset rate of about 18 resets per 50k connections; the haproxy team reports 11 TCP resets per 2k connections. Either way it's greater than 0, and it depends on how many connections you receive and how fast you are restarting haproxy. I've chosen not to enable this by default in the systemd unit files, as enabling it requires that the stats socket in the haproxy config match the one passed with the -x option. However, for those who are seeing this problem, only having to make the config and unit file changes should be a better user experience than hand-building packages.
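To put the two reset figures quoted above on a common scale, here is a minimal sketch that expresses each as a percentage of connections. The helper name reset_rate is made up for illustration. The two measurements come from different setups and reload frequencies, so the percentages are not directly comparable.

```shell
#!/bin/sh
# reset_rate RESETS CONNECTIONS: print resets as a percentage of connections.
# (hypothetical helper, not part of haproxy or the patchset)
reset_rate() {
    awk -v resets="$1" -v conns="$2" \
        'BEGIN { printf "%.3f%%\n", resets / conns * 100 }'
}

reset_rate 18 50000   # reporter's measurement: prints 0.036%
reset_rate 11 2000    # figure attributed to the haproxy team: prints 0.550%
```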

Configuration
1. Add the following lines to the global section of your haproxy.cfg
" # turn on stats unix socket
        stats socket /var/lib/haproxy/stats expose-fd listeners
        stats bind-process 1
"
2. Add "HAPROXY_STATS_SOCKET=/var/lib/haproxy/stats" to the Environment= setting in /lib/systemd/system/haproxy.service
3. $ sudo systemctl daemon-reload
4. $ sudo systemctl reload haproxy
  It's important to use reload (not restart), because the reload is carried out by haproxy-systemd-wrapper.
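As an alternative to editing the packaged unit file in step 2 (which a package upgrade can overwrite), the variable can be set from a systemd drop-in. The sketch below is not from the patchset: the helper names write_dropin and check_socket_match and the file name stats-socket.conf are made up for illustration, and it assumes the wrapper picks up HAPROXY_STATS_SOCKET from the service environment as step 2 implies. The second helper checks the requirement that the socket path in haproxy.cfg matches the one handed to the unit.

```shell
#!/bin/sh
# write_dropin [DIR] [SOCKET]: write a drop-in that sets HAPROXY_STATS_SOCKET.
# Defaults follow the paths used in the steps above; names are hypothetical.
write_dropin() {
    dropin_dir="${1:-/etc/systemd/system/haproxy.service.d}"
    socket_path="${2:-/var/lib/haproxy/stats}"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment=HAPROXY_STATS_SOCKET=%s\n' "$socket_path" \
        > "$dropin_dir/stats-socket.conf"
}

# check_socket_match CFG SOCKET: succeed only if the first "stats socket"
# line in CFG names the same path as SOCKET.
check_socket_match() {
    cfg_path=$(awk '$1 == "stats" && $2 == "socket" { print $3; exit }' "$1")
    [ -n "$cfg_path" ] && [ "$cfg_path" = "$2" ]
}
```

Run as root, this would be used as `write_dropin && check_socket_match /etc/haproxy/haproxy.cfg /var/lib/haproxy/stats`, followed by the daemon-reload and reload steps.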

Testing
1. Configure as above.
2. Put haproxy in a reload loop.
 $ while true ; do sudo systemctl reload haproxy ; sleep 3 ; done
3. Run ApacheBench (ab) against it.
 $ ab -r -c 20 -n 100000 http://192.168.1.226:8070/
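Steps 2 and 3 need two terminals as written. A minimal sketch that combines them is shown here: it starts the reload loop in the background, runs the load command, then stops the loop. The function name run_under_reloads and the RELOAD_CMD / RELOAD_INTERVAL variables are hypothetical; the defaults mirror the commands above. It assumes sudo does not prompt for a password, since the loop runs in the background.

```shell
#!/bin/sh
# Hypothetical knobs; defaults mirror the manual steps above.
RELOAD_CMD="${RELOAD_CMD:-sudo systemctl reload haproxy}"
RELOAD_INTERVAL="${RELOAD_INTERVAL:-3}"

# run_under_reloads CMD [ARGS...]: run CMD while haproxy is reloaded in a loop.
run_under_reloads() {
    (
        # Background reload loop, same shape as step 2.
        while true; do
            $RELOAD_CMD
            sleep "$RELOAD_INTERVAL"
        done
    ) &
    loop_pid=$!

    # Run the load generator in the foreground and remember its exit status.
    status=0
    "$@" || status=$?

    # Stop the reload loop once the load run has finished.
    kill "$loop_pid" 2>/dev/null || true
    wait "$loop_pid" 2>/dev/null || true
    return "$status"
}
```

With the functions loaded, the test above becomes `run_under_reloads ab -r -c 20 -n 100000 http://192.168.1.226:8070/`.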

Results:
With these patches:
Complete requests: 100000
Failed requests: 0

Without these patches:
Complete requests: 100000
Failed requests: 81
   (Connect: 0, Receive: 27, Length: 27, Exceptions: 27)
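When repeating the test, the failure rate can be pulled out of the ab report rather than read by eye. This is a small sketch, assuming the report contains "Complete requests:" and "Failed requests:" lines as quoted above; the helper name failure_rate is made up for illustration.

```shell
#!/bin/sh
# failure_rate: read an ab report on stdin and print failed/complete as a rate.
# (hypothetical helper; relies only on the two report lines quoted above)
failure_rate() {
    awk -F': *' '
        /^Complete requests:/ { complete = $2 }
        /^Failed requests:/   { failed = $2 }
        END {
            if (complete > 0)
                printf "%d/%d failed (%.3f%%)\n", failed, complete, failed / complete * 100
        }
    '
}
```

For the unpatched run above, `ab ... | failure_rate` would print `81/100000 failed (0.081%)`.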

Dave Chiluk (chiluk) wrote :

I have created a ppa with this current patch to assist with testing.

https://launchpad.net/~chiluk/+archive/ubuntu/lp1712925

This patchset is still in the early stages, so use/test this with extreme caution.

Dave Chiluk (chiluk) on 2017-08-24
description: updated

Hi Dave,
is this intended for 18.04 or for 17.10?
For 17.10 we are now in Feature Freeze, so if you want to get this into 17.10 I'd recommend making this an FFE bug.

Dave Chiluk (chiluk) wrote :

The original intention was to get it into 17.10, but I didn't have enough time yesterday to get it tested and cleaned up. I really think we should consider this for FFE, as it is a bug fix. That said, it is also an interface change, and will probably require changes to the systemd unit files as well (I still have to work that bit out).

summary: - HAproxy dropping connections (RST) during config reload / support
+ FFE: HAproxy dropping connections (RST) during config reload / support
seamless reload
Dave Chiluk (chiluk) wrote :

Another thing just occurred to me. We probably want the kinks in this worked out in the 17.10 development and release cycle, as 18.04 will hopefully get haproxy 1.8 which will include these changes officially.

Dave Chiluk (chiluk) on 2017-09-13
description: updated
Steve Langasek (vorlon) wrote :

FFe granted.

Changed in haproxy (Ubuntu):
status: New → Triaged
Dave Chiluk (chiluk) wrote :

Added configuration and testing results to top page. Upload commencing shortly.

description: updated
Dave Chiluk (chiluk) on 2017-09-14
Changed in haproxy (Ubuntu):
status: Triaged → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package haproxy - 1.7.9-1ubuntu1

---------------
haproxy (1.7.9-1ubuntu1) artful; urgency=medium

  * Backport of -x option from upstream haproxy to enable seamless
    reloading of haproxy without dropping connections. This is enabled
    by adding
    " stats socket <stats file> expose-fd listeners
      stats bind-process 1 "
    to the global section of your haproxy config, and
    setting HAPROXY_STATS_SOCKET in the haproxy.service unit file.
    (LP: #1712925)

 -- Dave Chiluk <email address hidden> Thu, 14 Sep 2017 12:32:36 -0500

Changed in haproxy (Ubuntu):
status: Fix Committed → Fix Released