Comment 0 for bug 1385234

Revision history for this message
Giulio Fidente (gfidente) wrote : OVS tunneling between multiple neutron nodes breaks if amqp is restarted

At completion of a deployment with multiple controllers, by observing the gre tunnels created in OVS by the neutron ovs-agent, one will find that some neutron nodes may miss the tunnels in between them.

This is due to ovs-agents getting disconnected from the rabbit cluster without them noticing and as a result, being unable to receive updates from other nodes or publish updates.

The disconnection may happen following a reconfig of a rabbit node, the VIP moving over a different node, or even _during_ deployment due to rabbit cluster configuration.

Use of some aggressive (low) kernel keepalive probes interval seems to improve the reliability but a more appropriate fix seems to be support for heartbeat in oslo.messaging