Comment 2 for bug 1926531

Arjun Baindur (abaindur) wrote :

Liu, this is not for an HA router. Also, it is not for centralized FIPs.

1. This is a compute node where l3-agent is in dvr_snat mode. We have multiple such nodes with l3-agent in dvr_snat mode for regular failover.

2. Router is a regular DVR router, not HA. We have no centralized FIPs.

3. There are VMs on the same node with and without floating IPs.

So to reproduce, have 2 or more nodes with l3-agent in dvr_snat mode. These should also be compute nodes, so nova-compute etc. runs on the same nodes.

Create a DVR but non-HA router, so that one snat namespace gets scheduled to one of the 2+ nodes. Create a VM and some floating IPs on each node, so that the qrouter namespace, the fip namespace, and the rfp/fpr link are created on all nodes.

At this point, snat has been scheduled to one of these dvr_snat nodes as well.

Now, restart l3-agent on one of the OTHER nodes.

On init you will see the snat namespace get created on the restarted node, then deleted again in the code paths I listed before. The deletion code triggers deletion of the gateway, which ends up deleting the rfp/fpr link between the qrouter and fip namespaces.

Prior to the fix, the snat namespace was not created and then deleted on dvr_snat nodes that did not host the snat router.
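
To check a node before and after the l3-agent restart, something like the following could be used. This is only a rough sketch, not taken from the agent code: the helper names (namespace_names, summarize, has_rfp_device, read_live) are made up for illustration, and it assumes the namespaces are named with the qrouter-, snat- and fip- prefixes and that the router side of the link is a device whose name starts with rfp-. It only parses the text output of `ip netns list` and `ip -o link show`.

```python
import subprocess


def namespace_names(netns_output):
    """Return namespace names from `ip netns list` text output.

    Each line is either "<name>" or "<name> (id: N)", so the first
    whitespace-separated token is the namespace name.
    """
    names = []
    for line in netns_output.splitlines():
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def summarize(netns_output):
    """Group namespaces by the (assumed) qrouter-/snat-/fip- prefixes."""
    names = namespace_names(netns_output)
    return {
        prefix: sorted(n for n in names if n.startswith(prefix + "-"))
        for prefix in ("qrouter", "snat", "fip")
    }


def has_rfp_device(link_output):
    """Return True if `ip -o link show` text output lists an rfp- device.

    With -o, each device is one line of the form "<index>: <name>: <flags> ...".
    """
    for line in link_output.splitlines():
        fields = line.split(":", 2)
        if len(fields) >= 2 and fields[1].strip().startswith("rfp-"):
            return True
    return False


def read_live(qrouter_ns):
    """Hypothetical helper: collect both outputs on a real node (needs root)."""
    netns = subprocess.run(
        ["ip", "netns", "list"],
        capture_output=True, text=True, check=True,
    ).stdout
    links = subprocess.run(
        ["ip", "netns", "exec", qrouter_ns, "ip", "-o", "link", "show"],
        capture_output=True, text=True, check=True,
    ).stdout
    return summarize(netns), has_rfp_device(links)
```

On a dvr_snat node that does not host the snat router, the "snat" list from summarize() should stay empty across the restart, and has_rfp_device() should still be true for the qrouter namespace afterwards. With the bug, the rfp device is gone after the restart.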