Comment 19 for bug 856764

Vish Ishaya (vishvananda) wrote :

The following fix works for failover, but doesn't solve all of the problems in HA mode. For that, Kevin's patch above is needed.

When a connection to a socket is cut off completely, the receiving side doesn't know that the connection has dropped, so it can end up with a half-open connection. The general solution for this on Linux is to turn on TCP keepalives (the SO_KEEPALIVE socket option). Kombu will enable keepalives if the version number is high enough (>1.0 iirc), but rabbit needs to be specially configured to send keepalives on the connections that it creates.
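To make the mechanism concrete, here is a minimal Python sketch of turning on keepalives for a single socket. It is not how Kombu or rabbit do it internally; the helper name enable_keepalive and its default values are made up for illustration, and the per-socket TCP_KEEP* options are only applied on platforms where Python exposes them (Linux does).

```python
import socket


def enable_keepalive(sock, idle=5, interval=1, probes=5):
    """Turn on TCP keepalives for one socket and, where supported, tune them.

    Hypothetical helper for illustration; the defaults are assumptions.
    """
    # SO_KEEPALIVE is the switch that makes the kernel send keepalive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the system-wide timers. Not every platform
    # exposes all three, so skip the ones that are missing.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

A socket set up this way overrides the system-wide timers for itself only, which is the same effect an LD_PRELOAD shim aims for when the application can't be changed.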

So solving the HA issue generally involves a rabbit config with a section like the following:

[
 {rabbit, [{tcp_listen_options, [binary,
                                {packet, raw},
                                {reuseaddr, true},
                                {backlog, 128},
                                {nodelay, true},
                                {exit_on_close, false},
                                {keepalive, true}]}
          ]}
].

Then you should also shorten the keepalive sysctl settings, or it will still take ~2 hrs to terminate the connections (the default tcp_keepalive_time is 7200 seconds):

echo "5" > /proc/sys/net/ipv4/tcp_keepalive_time
echo "5" > /proc/sys/net/ipv4/tcp_keepalive_probes
echo "1" > /proc/sys/net/ipv4/tcp_keepalive_intvl

Obviously this should be done in a sysctl config file instead of at the command line. Note that if you only want to shorten the rabbit keepalives but keep everything else at the defaults, you can use an LD_PRELOAD library to do so. For example, you could use:

https://github.com/meebey/force_bind/blob/master/README
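To put rough numbers on the sysctl values above, the sketch below estimates how long a dead peer goes unnoticed (idle time before the first probe, plus one interval per unanswered probe) and prints the same three settings in sysctl config-file syntax. The function names are hypothetical, and 7200/9/75 are the usual Linux defaults rather than anything measured on a particular box.

```python
def detection_seconds(keepalive_time, probes, intvl):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    return keepalive_time + probes * intvl


def sysctl_lines(keepalive_time, probes, intvl):
    """Render the three settings as lines for a sysctl config file."""
    return [
        "net.ipv4.tcp_keepalive_time = %d" % keepalive_time,
        "net.ipv4.tcp_keepalive_probes = %d" % probes,
        "net.ipv4.tcp_keepalive_intvl = %d" % intvl,
    ]


if __name__ == "__main__":
    print(detection_seconds(7200, 9, 75))  # usual Linux defaults -> 7875
    print(detection_seconds(5, 5, 1))      # values from this comment -> 10
    print("\n".join(sysctl_lines(5, 5, 1)))
```

With the defaults that works out to 7875 seconds, a bit over two hours; with the values from this comment it drops to about 10 seconds.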