Comment 6 for bug 1880610

Jose Guedez (jfguedez) wrote:

We had another incident where having L7 checks would have been helpful. Since the balancing scheme is "leastconn" [0], a bad backend that is not completely dead (i.e. it still accepts network connections but never replies until a timeout is reached) will lead to new connections to the load balancer being directed to the "broken" backend.
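
For illustration, the rendered haproxy backend today has only an L4 check, along the lines of the following sketch (the server names, addresses and gnocchi-api port are placeholders; only the balance/check directives matter):

    backend gnocchi_api
        balance leastconn
        # A bare "check" is an L4 (TCP connect) check: the server stays UP
        # as long as it accepts connections, even if it never answers requests.
        server gnocchi-0 10.0.0.10:8041 check
        server gnocchi-1 10.0.0.11:8041 check
        server gnocchi-2 10.0.0.12:8041 check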

We recently saw this in production clouds using the gnocchi charm, but the behaviour should be similar for other services (principal charms). In this particular case requests were continuously routed to the unresponsive backend even though the others were healthy (only 1 out of 3 backends was unhealthy). Since the unhealthy gnocchi backend service was still accepting network connections, haproxy kept sending it requests. Pausing the bad backend restored service to clients while the problem with the unhealthy backend was resolved.

Having an L7 check would make haproxy realise quickly that a backend is unhealthy, and requests would then be routed to the other 2 healthy backends. This would be similar to how readiness checks in Kubernetes (typically L7) prevent this situation from happening.
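
A rough sketch of what that could look like (the /healthcheck path is a placeholder, the service's real status endpoint would have to be used, and the check timings are only examples):

    backend gnocchi_api
        balance leastconn
        # L7 check: haproxy periodically sends an HTTP request and marks the
        # server DOWN unless it replies with the expected status in time.
        option httpchk GET /healthcheck
        http-check expect status 200
        server gnocchi-0 10.0.0.10:8041 check inter 2000 rise 2 fall 3
        server gnocchi-1 10.0.0.11:8041 check inter 2000 rise 2 fall 3
        server gnocchi-2 10.0.0.12:8041 check inter 2000 rise 2 fall 3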

[0] https://github.com/juju/charm-helpers/blob/master/charmhelpers/contrib/openstack/templates/haproxy.cfg#L79