Improve HAProxy health checks by using HTTP instead of TCP

Bug #2103559 reported by Jacolex
This bug affects 1 person
Affects: kolla
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hello,

I've been facing intermittent issues with backend services behind HAProxy, especially with Keystone. From time to time, Keystone throws a 503 error due to WSGI issues like "After multiple attempts as listener backlog limit was exceeded or the socket does not exist." I don't want to focus on that specific Keystone problem here, but rather on how HAProxy handles health checks for HTTP services like Keystone.

By default, HAProxy checks Keystone using a TCP connection. This means it only verifies that the port accepts connections; it doesn't detect whether the service itself is healthy. For example, when Keystone returns a 503, HAProxy continues to route traffic to the failing instance because the TCP check against that port still succeeds. This leads to misrouted traffic and service instability.
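To illustrate the difference between the two kinds of check, here is a minimal, self-contained Python sketch. It is not how HAProxy implements its checks. It starts a throwaway local server that always answers 503 (a stand-in for an unhealthy Keystone) and probes it twice, once at the TCP level and once at the HTTP level. The names `AlwaysUnavailable`, `tcp_check`, `http_check`, and `run_demo` are made up for this example, and the "2xx or 3xx counts as healthy" rule is an assumption meant to mirror HAProxy's documented default for `option httpchk`.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysUnavailable(BaseHTTPRequestHandler):
    """Hypothetical unhealthy backend: accepts connections, answers 503."""

    def do_GET(self):
        self.send_response(503)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging to keep the demo quiet


def tcp_check(host, port, timeout=2.0):
    """Layer-4 style check: passes as soon as the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/", timeout=2.0):
    """Layer-7 style check: passes only on a 2xx or 3xx response (assumed rule)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        return 200 <= status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def run_demo():
    """Start the fake backend on a free local port and run both checks."""
    server = HTTPServer(("127.0.0.1", 0), AlwaysUnavailable)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return tcp_check("127.0.0.1", port), http_check("127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    tcp_ok, http_ok = run_demo()
    print("TCP check passed:", tcp_ok)    # True: the port is open
    print("HTTP check passed:", http_ok)  # False: the backend answers 503
```

The TCP probe reports the backend as healthy while the HTTP probe does not, which is the gap described above.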

To address this, I applied a temporary fix by modifying the Keystone HAProxy default vars in Kolla-Ansible to use HTTP-level health checks instead of TCP. Here’s the change I made:

keystone_services:
  keystone:
    haproxy:
      keystone_internal:
        backend_http_extra:
          - "option httpchk"
          - "http-check send meth GET"
          - "{{ 'balance source' if enable_keystone_federation | bool else '' }}"

After making this change, HAProxy properly detects when Keystone returns a 503 and marks the backend as down, which prevents traffic from being routed to the failing instance. This has significantly improved service stability.

I'm wondering whether it would make sense to switch the default HAProxy health checks for Keystone (and possibly other HTTP services) from TCP to HTTP in Kolla-Ansible. HTTP checks provide more accurate failure detection and could improve overall availability and reliability. What are your thoughts on this?

Jacolex (jacolex)
tags: added: haproxy
tags: added: keystone