[dvr] bound port permanent arp entries never deleted
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ubuntu Cloud Archive | Fix Released | High | Unassigned | | |
Train | Fix Released | High | Unassigned | | |
Ussuri | Fix Released | High | Unassigned | | |
Victoria | Fix Released | High | Unassigned | | |
neutron | Fix Released | High | LIU Yulong | | |
neutron (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Focal | Fix Released | High | Unassigned | | |
Groovy | Fix Released | High | Unassigned | | |
Hirsute | Fix Released | Undecided | Unassigned | | |
Bug Description
[Impact]
See the original bug description, but in short: commit b3a42cddc5 removed all the arp management code in favour of using the arp_responder, but missed the fact that DVR floating ips don't use the arp_responder. As a result it was possible to end up with permanent arp entries in qrouter namespaces. If you then created a new port with the same IP as a previous port that still had an arp entry, a floating ip associated with the new port would be unreachable until that arp entry was manually deleted. This patch adds the removed code back in.
[Test Plan]
* deploy OpenStack Train/Ussuri/
* create port P1 with address A1 and create vm on node C1 with this port
* associate floating ip with P1 and ping it
* observe REACHABLE or PERMANENT arp entry for A1 in qrouter arp cache
* delete vm and port
* ensure arp entry for A1 in qrouter arp cache is deleted
* create port P2 with address A1 and create vm on node C1 with this port
* associate floating ip with P2 and ping it
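The arp cache checks in the steps above can be scripted. The following is a minimal sketch, assuming the usual `qrouter-<router id>` namespace naming and the plain-text output of `ip neigh show`; the helper names (`parse_ip_neigh`, `qrouter_neigh`, `entry_for`) and the sample addresses are hypothetical and not part of neutron.

```python
import subprocess
from typing import NamedTuple, Optional


class NeighEntry(NamedTuple):
    ip: str
    dev: str
    mac: Optional[str]  # None when the entry has no lladdr (e.g. FAILED)
    state: str


def parse_ip_neigh(output: str) -> list[NeighEntry]:
    """Parse the text printed by `ip neigh show`, one entry per line."""
    entries = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 4 or "dev" not in tokens:
            continue  # skip blank or unexpected lines
        dev = tokens[tokens.index("dev") + 1]
        mac = tokens[tokens.index("lladdr") + 1] if "lladdr" in tokens else None
        # The neighbour state (REACHABLE, STALE, PERMANENT, ...) is the last token.
        entries.append(NeighEntry(tokens[0], dev, mac, tokens[-1]))
    return entries


def qrouter_neigh(router_id: str) -> list[NeighEntry]:
    """Read the arp cache of a qrouter namespace (run as root on the compute node)."""
    result = subprocess.run(
        ["ip", "netns", "exec", f"qrouter-{router_id}", "ip", "neigh", "show"],
        check=True, capture_output=True, text=True,
    )
    return parse_ip_neigh(result.stdout)


def entry_for(entries: list[NeighEntry], ip: str) -> Optional[NeighEntry]:
    """Return the cache entry for `ip`, or None if there is none."""
    return next((e for e in entries if e.ip == ip), None)


# Hypothetical sample output; A1 is assumed to be 10.0.0.5 here.
SAMPLE = """\
10.0.0.5 dev qr-1a2b3c4d-5e lladdr fa:16:3e:aa:bb:cc PERMANENT
10.0.0.9 dev qr-1a2b3c4d-5e lladdr fa:16:3e:11:22:33 REACHABLE
10.0.0.7 dev qr-1a2b3c4d-5e FAILED
"""
```

After deleting the vm and port, `entry_for(qrouter_neigh(router_id), "10.0.0.5")` should return `None` if the fix is working; on an affected deployment it would still return the PERMANENT entry.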
[Where problems could occur]
No problems are anticipated from re-introducing this code. It uses RPC notifications and will therefore incur some extra amqp load, but this is not expected to be a problem, and it was not considered one when the code existed prior to removal.
-------
With OpenStack Ussuri using dvr-snat I do the following:
* create port P1 with address A1 and create vm on node C1 with this port
* associate floating ip with P1 and ping it
* observe REACHABLE arp entry for A1 in qrouter arp cache
* so far so good
* restart the neutron-l3-agent
* observe REACHABLE arp entry for A1 is now PERMANENT
* delete vm and port
* create port P2 with address A1 and create vm on node C1 with this port
* vm is unreachable since arp cache contains PERMANENT entry for old port P1 mac/ip combo
If I don't restart the l3-agent, then once I have deleted the port its arp entry goes REACHABLE -> STALE and will either be replaced or time out as expected. But once it is set to PERMANENT it will never disappear, which means any future use of that ip address (by a port with a different mac) will not work until that entry is manually deleted.
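As a workaround aid, the stale entries described above can be identified by comparing the PERMANENT entries in the qrouter arp cache against the macs of the ports that currently own each ip. This is only a sketch under stated assumptions: the function names, the tuple layout, and the sample data are hypothetical, and the generated commands use the standard `ip netns exec` and `ip neigh del` forms.

```python
def stale_permanent_entries(neigh, ports):
    """Return (ip, dev) pairs whose PERMANENT entry no longer matches a port.

    neigh: iterable of (ip, dev, mac, state) tuples from the qrouter arp cache.
    ports: dict mapping ip -> mac of the port that currently owns that ip.
    """
    stale = []
    for ip, dev, mac, state in neigh:
        if state != "PERMANENT":
            continue  # REACHABLE/STALE entries age out on their own
        current = ports.get(ip)  # None if no port owns this ip any more
        if current is None or mac is None or current.lower() != mac.lower():
            stale.append((ip, dev))
    return stale


def delete_commands(router_id, stale):
    """Build the commands that remove each stale entry (does not run them)."""
    return [
        ["ip", "netns", "exec", f"qrouter-{router_id}",
         "ip", "neigh", "del", ip, "dev", dev]
        for ip, dev in stale
    ]


# Hypothetical data: the cache still holds P1's mac for A1 (10.0.0.5),
# while the new port P2 owns A1 with a different mac.
neigh = [
    ("10.0.0.5", "qr-1a2b3c4d-5e", "fa:16:3e:aa:bb:cc", "PERMANENT"),
    ("10.0.0.9", "qr-1a2b3c4d-5e", "fa:16:3e:11:22:33", "REACHABLE"),
]
ports = {
    "10.0.0.5": "fa:16:3e:dd:ee:ff",
    "10.0.0.9": "fa:16:3e:11:22:33",
}
stale = stale_permanent_entries(neigh, ports)
commands = delete_commands("ROUTER_ID", stale)
```

Only the PERMANENT entry for 10.0.0.5 is flagged, because its mac no longer matches any current port. PERMANENT entries that still match a live port are left alone.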
tags: | added: l3-dvr-backlog |
Changed in neutron: | |
status: | New → In Progress |
assignee: | LIU Yulong (dragon889) → Edward Hope-Morley (hopem) |
Changed in neutron: | |
assignee: | Edward Hope-Morley (hopem) → LIU Yulong (dragon889) |
Changed in neutron (Ubuntu Focal): | |
status: | New → Triaged |
Changed in neutron (Ubuntu Hirsute): | |
status: | New → Triaged |
Changed in neutron (Ubuntu Groovy): | |
importance: | Undecided → High |
Changed in neutron (Ubuntu Focal): | |
importance: | Undecided → High |
Changed in neutron (Ubuntu Groovy): | |
status: | New → Triaged |
Changed in cloud-archive: | |
status: | Triaged → Fix Committed |
description: | updated |
Changed in neutron: | |
status: | In Progress → Fix Released |
description: | updated |
tags: | added: verification-done-focal removed: verification-needed-focal |
tags: | added: verification-ussuri-needed removed: verification-ussuri-done |
tags: | added: verification-done removed: verification-needed |
Confirmed, I can reproduce the PERMANENT arp behavior even when the agent is not restarted.