Improve HAProxy health checks by using HTTP instead of TCP
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kolla | New | Undecided | Unassigned |
Bug Description
Hello,
I've been seeing intermittent failures in backend services behind HAProxy, especially Keystone. From time to time, Keystone returns a 503 error because of WSGI issues such as "After multiple attempts as listener backlog limit was exceeded or the socket does not exist." I don't want to focus on that specific Keystone problem here, but rather on how HAProxy handles health checks for HTTP services like Keystone.
By default, HAProxy checks Keystone using a TCP connection. This only verifies that the port accepts connections; it does not detect whether the service behind it is healthy. For example, when Keystone returns a 503, HAProxy continues to route traffic to the failing instance because the TCP connection still succeeds. As a result, requests keep landing on an unhealthy backend, which causes service instability.
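The difference between the two check types can be reproduced outside HAProxy. Below is a minimal Python sketch, not taken from Kolla-Ansible or HAProxy: `Always503Handler`, `tcp_check`, `http_check` and `run_demo` are hypothetical names, and the local test server only stands in for a Keystone instance that accepts connections but answers every request with 503. The HTTP check treats 2xx and 3xx responses as healthy, which is how HAProxy's HTTP checks behave when no `http-check expect` rule is configured.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Always503Handler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an unhealthy backend: every GET returns 503."""

    def do_GET(self):
        self.send_response(503)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def tcp_check(host, port, timeout=2.0):
    """Layer-4 style check: passes if a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/", timeout=2.0):
    """Layer-7 style check: passes only on a 2xx or 3xx response."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        return 200 <= status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def run_demo():
    """Start the failing backend locally and run both checks against it."""
    server = HTTPServer(("127.0.0.1", 0), Always503Handler)  # port 0: any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return tcp_check("127.0.0.1", port), http_check("127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    tcp_ok, http_ok = run_demo()
    print(f"TCP check passed: {tcp_ok}")
    print(f"HTTP check passed: {http_ok}")
```

Against this always-503 backend the TCP check passes while the HTTP check fails, which matches the behaviour described above: the port is open, so a TCP-only check keeps the backend in rotation.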
To address this, I applied a temporary fix by modifying the Keystone HAProxy default vars in Kolla-Ansible to use HTTP-level health checks instead of TCP. Here’s the change I made:
keystone_services:
keystone:
haproxy:
keystone_
- "option httpchk"
- "http-check send meth GET"
- "{{ 'balance source' if enable_
After making this change, HAProxy properly detects when Keystone returns a 503 and marks the backend as down, which prevents traffic from being routed to the failing instance. This has significantly improved service stability.
I'm wondering whether it would make sense to switch the default HAProxy health checks for Keystone (and possibly other HTTP services) from TCP to HTTP in Kolla-Ansible. HTTP checks provide more accurate failure detection and could improve overall availability and reliability. What are your thoughts on this?
Tags added: haproxy, keystone