Update permanent ARP entries for allowed_address_pair IPs in DVR Routers
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Won't Fix | High | Unassigned | |
Bug Description
We have a long term issue with Allowed_
The ARP entry for the allowed_
Since DVR updates the ARP table through the control plane and does not allow any ARP traffic to leave the node (to prevent the router IP/MAC from polluting the network), this has always been an issue.
A recent patch in master https:/
This patch helped in updating the ARP entry dynamically from the GARP message. But the entry has to be temporary (NUD state 'reachable'). Only when it is set to 'reachable' were we able to update it on the fly from the GARP message, without using any external tools.
But the problem is this: suppose we have VMs residing in two different subnets (Subnet A and Subnet B), and a VM in Subnet B, on a different, isolated node, tries to ping the VRRP IP in Subnet A. The packet from the VM comes to the router namespace, where the ARP entry for the VRRP IP is available as reachable. While it is reachable, the VM is able to send a couple of pings, but within 15 seconds the pings time out.
The reason is that the router in turn tries to verify whether the IP/MAC combination for the VRRP IP is still valid, since the entry in the ARP table is "REACHABLE" and not "PERMANENT".
When it tries to re-ARP for the IP, the ARP requests are blocked by the DVR flow rules in br-tun, so the ARP times out and the ARP entry in the router namespace becomes incomplete.
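To make the difference between the two entry types concrete, here is a minimal sketch that builds the `ip neigh replace` command for either NUD state. The helper name `neigh_replace_cmd` and the namespace, device, and address values in the usage note are hypothetical placeholders; this is not neutron code, and it only builds the argument list without running it.

```python
from typing import List

# Only the two states discussed in this bug are accepted here.
ALLOWED_NUD_STATES = ("permanent", "reachable")


def neigh_replace_cmd(namespace: str, ip: str, mac: str, device: str,
                      nud_state: str = "permanent") -> List[str]:
    """Return the argv for setting a neighbour entry inside a namespace.

    A 'reachable' entry can be refreshed by the kernel (and will be
    re-validated by ARP); a 'permanent' entry is never re-validated.
    """
    if nud_state not in ALLOWED_NUD_STATES:
        raise ValueError("unsupported NUD state: %s" % nud_state)
    return [
        "ip", "netns", "exec", namespace,
        "ip", "neigh", "replace", ip,
        "lladdr", mac,
        "dev", device,
        "nud", nud_state,
    ]
```

For example, `neigh_replace_cmd("qrouter-demo", "10.0.0.100", "fa:16:3e:00:00:01", "qr-demo")` yields the command that would pin the entry as permanent, while passing `nud_state="reachable"` yields the temporary variant that the router later tries to re-validate.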
Option A:
One way to address this situation is to run a GARP sniffer tool/utility in the router namespace that sniffs GARP packets with a specific IP as a filter. If that IP is seen in a GARP message, the tool/utility should in turn reset the ARP entry for the VRRP IP as permanent. This is very performance intensive, so I am not sure it would be helpful. We should probably make it configurable, so that people can use it if required.
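As a rough illustration of the filtering step in Option A, the sketch below checks whether a raw ARP payload is a gratuitous ARP for a watched IP and, if so, returns the announcing MAC. It assumes the caller has already captured the frame and stripped the Ethernet header; the capture mechanism itself is not shown. The function name `garp_sender_mac` is hypothetical, and "gratuitous" is taken here to mean an Ethernet/IPv4 ARP request or reply whose sender and target IP are equal.

```python
import socket
import struct
from typing import Optional

# Ethernet/IPv4 ARP payload: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FORMAT)


def garp_sender_mac(arp_payload: bytes, watched_ip: str) -> Optional[str]:
    """Return the sender MAC if the payload is a GARP for watched_ip."""
    if len(arp_payload) < ARP_LEN:
        return None
    (htype, ptype, hlen, plen, oper,
     sha, spa, _tha, tpa) = struct.unpack(ARP_FORMAT, arp_payload[:ARP_LEN])
    # Only Ethernet (1) carrying IPv4 (0x0800) is handled.
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    # Accept both ARP request (1) and ARP reply (2) announcements.
    if oper not in (1, 2):
        return None
    # A gratuitous ARP announces its own address: sender IP == target IP.
    if spa != tpa:
        return None
    if socket.inet_ntoa(spa) != watched_ip:
        return None
    return ":".join("%02x" % b for b in sha)
```

A caller could feed each captured ARP payload to this function and, when it returns a MAC, reset the corresponding neighbour entry. The per-packet cost of doing this in every router namespace is the performance concern raised above.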
Option B:
The other option: instead of running it on all nodes and in all router namespaces, we could run it only in the router namespace on the network node, or on the network node host, and have it notify neutron that an IP/MAC has changed. Neutron would then tell all the hosts to do an ARP update for the given IP/MAC. (Just an idea; not sure how simple it is compared to the former.)
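The flow in Option B can be sketched in memory as follows: a single observer records the last known IP/MAC binding, reports only real changes, and builds one update per host. Everything here (`BindingTracker`, `build_updates`, the message shape, the host names) is hypothetical and stands in for whatever notification path neutron would actually use; no neutron RPC API is shown.

```python
from typing import Dict, List, Tuple


class BindingTracker:
    """Remember the last MAC seen for each IP and flag changes."""

    def __init__(self) -> None:
        self._bindings: Dict[str, str] = {}

    def observe(self, ip: str, mac: str) -> bool:
        """Record a sighting; return True only if the binding changed."""
        mac = mac.lower()
        if self._bindings.get(ip) == mac:
            return False  # Same binding as before: nothing to send.
        self._bindings[ip] = mac
        return True


def build_updates(hosts: List[str], ip: str,
                  mac: str) -> List[Tuple[str, Dict[str, str]]]:
    """Build one (host, message) pair per host for an ARP update."""
    return [(host, {"ip": ip, "mac": mac.lower()}) for host in hosts]
```

With this shape, repeated GARPs for an unchanged binding produce no traffic towards the hosts; only a failover that moves the VRRP IP to a new MAC triggers a fan-out.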
Any ideas or thoughts would be helpful.
tags: | added: rfe |
tags: | added: rfe-triaged removed: rfe |
Changed in neutron: | |
importance: | Undecided → Wishlist |
Changed in neutron: | |
status: | New → Confirmed |
Changed in neutron: | |
importance: | Wishlist → Critical |
importance: | Critical → High |
tags: | removed: rfe-triaged |
summary: |
- RFE: Update permanent ARP entries for allowed_address_pair IPs in DVR Routers
+ Update permanent ARP entries for allowed_address_pair IPs in DVR Routers |
Changed in neutron: | |
assignee: | nobody → Swaminathan Vasudevan (swaminathan-vasudevan) |
Changed in neutron: | |
status: | Confirmed → In Progress |
tags: | added: neutron-proactive-backport-potential |
Changed in neutron: | |
assignee: | Swaminathan Vasudevan (swaminathan-vasudevan) → Brian Haley (brian-haley) |
Changed in neutron: | |
assignee: | Brian Haley (brian-haley) → Slawek Kaplonski (slaweq) |
Changed in neutron: | |
status: | New → In Progress |
I'm not a big fan of running a process that snoops on traffic; it's most likely going to cause a performance issue.
Could this be done the way the keepalived_state_change code works? It uses "ip monitor" to watch for events and trigger actions, and could be modified to look for "neigh" events.
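A minimal sketch of what looking for "neigh" events could involve is below: it parses one line of `ip monitor neigh` output into a small record. It assumes the usual line shape (`<ip> dev <device> [lladdr <mac>] <STATE>`, with a `Deleted` prefix for removals); the names `NeighEvent` and `parse_neigh_event` are hypothetical, and what action to take on an event is left open.

```python
from typing import NamedTuple, Optional

NUD_STATES = {
    "PERMANENT", "NOARP", "REACHABLE", "STALE", "DELAY",
    "PROBE", "FAILED", "INCOMPLETE", "NONE",
}


class NeighEvent(NamedTuple):
    deleted: bool
    ip: str
    device: Optional[str]
    mac: Optional[str]
    state: Optional[str]


def parse_neigh_event(line: str) -> Optional[NeighEvent]:
    """Parse one `ip monitor neigh` line; return None for blank lines."""
    tokens = line.split()
    deleted = bool(tokens) and tokens[0] == "Deleted"
    if deleted:
        tokens = tokens[1:]
    if not tokens:
        return None
    ip = tokens[0]
    device = mac = state = None
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok == "dev" and i + 1 < len(tokens):
            device = tokens[i + 1]
            i += 2
        elif tok == "lladdr" and i + 1 < len(tokens):
            mac = tokens[i + 1]
            i += 2
        else:
            if tok in NUD_STATES:
                state = tok
            i += 1  # Unknown tokens (e.g. "router") are skipped.
    return NeighEvent(deleted, ip, device, mac, state)
```

A watcher could read lines from an `ip monitor neigh` subprocess inside the router namespace, pass each to `parse_neigh_event`, and act only on events whose `ip` is one of the allowed_address_pair IPs. That avoids inspecting every packet.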