In one of our cloud environments, the FIP attached to the Octavia load balancer IP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace on the node where the Amphora VM is running. As a result, return packets are not forwarded from qrouter to snat on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is to have a PERMANENT ARP entry for the SNAT IP in the qrouter namespace on all compute nodes:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. Create a network, subnet and router, and attach the network to the router.
2. Verify that the qrouter namespace on every compute node has the ARP entry for the SNAT IP.
3. If the ARP entries exist, delete the network, subnet and router.
4. Repeat steps 1, 2 and 3 until a missing ARP entry is observed.
I am able to reproduce the missing ARP entry sometimes in the 3rd loop and sometimes in the 6th loop.
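The check in step 2 could look roughly like the sketch below. This is not the attached script; it is a minimal illustration that assumes it is run as root on a compute node, that the router namespace is named qrouter-<router id>, and that the neighbour table is read with `ip -s neigh show`. The function names has_permanent_neigh() and check_router_namespace() are hypothetical.

```python
import subprocess


def has_permanent_neigh(neigh_output: str, ip: str) -> bool:
    """Return True if `ip neigh` output holds a PERMANENT entry for `ip`."""
    for line in neigh_output.splitlines():
        fields = line.split()
        # The first field is the neighbour IP, the last one is the state.
        if fields and fields[0] == ip and fields[-1] == "PERMANENT":
            return True
    return False


def check_router_namespace(router_id: str, snat_ip: str) -> bool:
    """Read the neighbour table of the qrouter namespace and check it.

    Assumption: the namespace is called qrouter-<router_id> and the
    caller has the privileges needed for `ip netns exec`.
    """
    result = subprocess.run(
        ["ip", "netns", "exec", f"qrouter-{router_id}",
         "ip", "-s", "neigh", "show"],
        check=True, capture_output=True, text=True,
    )
    return has_permanent_neigh(result.stdout, snat_ip)
```

Running check_router_namespace() on every compute node after step 1 and stopping the loop as soon as it returns False gives the node and iteration where the entry went missing.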
I observed that the ARP entries for the SNAT IP are updated at the following places [1] [2], but in the non-working cases the results of get_snat_interfaces() and get_ports_by_subnet() do not contain the SNAT IP.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317