bgpvpn router fallback broken by change in neutron openvswitch firewall
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
BaGPipe | Fix Released | High | Thomas Morin |
networking-bgpvpn | Fix Released | High | Thomas Morin |
neutron | Fix Released | High | Nguyen Phuong An |
Bug Description
This issue impacts current master, stable/rocky and stable/queens.
The first symptom is that we have seen failures of many tests from legacy-
Background:
The networking-bagpipe code for BGPVPN has a "router fallback" mechanism: when a network is both connected to a Router and associated with a BGPVPN, the traffic sent by a VM to its gateway is redirected to br-mpls to attempt BGPVPN route matching, and is sent to the neutron netns router only as a fallback, if no VPN route was matched in br-mpls.
For this mechanism to work, a rule is in place in table 91 to override the NORMAL action (which would result in flood/learn) for the traffic destined to the gateway MAC address, with a higher-priority rule that sends the traffic to br-tun instead (br-tun is where the redirection to br-mpls takes place):
cookie=
cookie=
(above, fa:16:3e:c5:89:72 is the gateway MAC address for the network with vlan_id 24)
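As an illustration only (this is not a reconstruction of the truncated rules above), the shape of such an override rule can be sketched by building an ovs-ofctl style flow string. The match fields (dl_vlan, dl_dst), the priority value 20 and the output port number 2 are assumptions made for the example; the real rule installed by networking-bagpipe may match and act differently.

```python
def gateway_override_flow(vlan_id, gateway_mac, out_port, priority=20, table=91):
    """Build an ovs-ofctl style flow spec sending gateway-bound traffic to out_port.

    All match fields and the default priority are assumptions for illustration.
    """
    match = f"table={table},priority={priority},dl_vlan={vlan_id},dl_dst={gateway_mac}"
    return f"{match},actions=output:{out_port}"


# out_port=2 stands in for the patch port towards br-tun (hypothetical number).
flow = gateway_override_flow(24, "fa:16:3e:c5:89:72", out_port=2)
print(flow)
```

The point of the sketch is only that the override needs a priority above the NORMAL rule of the same table, so that gateway-bound traffic is matched first.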
Analysis of the issue:
Change [1] modified some rules that were resubmitting to table 91 to instead use a NORMAL action, so that only the first packets of a connection (from a conntrack standpoint) reach table 91.
This prevents the redirection of traffic to br-tun and br-mpls.
The tricky thing is that the issue does not always occur: when there is no entry in the MAC learning table (ovs-appctl fdb/show br-int) for the gateway MAC, the traffic is flooded and eventually reaches br-tun and br-mpls. This explains why some tests, but not all, fail.
(Note also that the tests where no Router is used in the destination network do not seem to fail.)
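To tell whether a given run is in the "flooding" case or the "learned" case, one can check the output of ovs-appctl fdb/show br-int for the gateway MAC. The following is a minimal sketch that parses such output; the sample table is made up for the example, and the helper name mac_is_learned is hypothetical.

```python
def mac_is_learned(fdb_output, mac, vlan=None):
    """Return True if `mac` appears in `ovs-appctl fdb/show` style output.

    Expects whitespace-separated columns: port, VLAN, MAC, Age.
    """
    mac = mac.lower()
    for line in fdb_output.splitlines():
        fields = line.split()
        # Skip the header and any line that is not a 4-column entry.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        entry_vlan, entry_mac = int(fields[1]), fields[2].lower()
        if entry_mac == mac and (vlan is None or entry_vlan == vlan):
            return True
    return False


# Made-up sample output, for illustration only.
SAMPLE_FDB = """\
 port  VLAN  MAC                Age
    1    24  fa:16:3e:c5:89:72    3
    5    24  fa:16:3e:aa:bb:cc   12
"""

print(mac_is_learned(SAMPLE_FDB, "fa:16:3e:c5:89:72", vlan=24))
```

When the function returns False for the gateway MAC, the traffic would be flooded and the redirection can still happen, which matches the intermittent test results described above.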
[1] https:/
Changed in neutron: | |
assignee: | nobody → Nguyen Phuong An (annp) |
description: | updated |
tags: | added: ovs-fw |
Changed in neutron: | |
status: | New → Confirmed |
importance: | Undecided → High |
Changed in bgpvpn: | |
status: | New → Confirmed |
importance: | Undecided → High |
Changed in neutron: | |
assignee: | Nguyen Phuong An (annp) → Thomas Morin (tmmorin-orange) |
Changed in networking-bagpipe: | |
status: | Confirmed → In Progress |
Changed in neutron: | |
assignee: | Thomas Morin (tmmorin-orange) → Nguyen Phuong An (annp) |
tags: | added: neutron-proactive-backport-potential |
Changed in neutron: | |
status: | In Progress → Fix Released |
Changed in bgpvpn: | |
status: | Confirmed → Fix Released |
I think that the following fix would work: in the places where resubmit(,91) was replaced by NORMAL, we could do a resubmit(,99) (table 99 would be a new table). In this table 99, we would put the NORMAL action.
Then networking-bagpipe BGPVPN "router fallback" code would put its override rule in table 99 instead of table 91.
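The reasoning behind this proposal can be checked with a toy model of the flow tables. The sketch below is not OVS or neutron code: the table named "fw" stands in for the firewall rules whose action was changed, only those rules are modelled (first packets of a connection are not), and the priorities are arbitrary.

```python
GATEWAY_MAC = "fa:16:3e:c5:89:72"
NORMAL = ("NORMAL",)
TO_BR_TUN = ("output", "br-tun")


def build_tables(fw_action, override_table):
    """Toy tables: 'fw' applies fw_action; tables 91 and 99 default to NORMAL."""
    tables = {
        "fw": [{"priority": 0, "match": lambda pkt: True, "action": fw_action}],
        91: [{"priority": 0, "match": lambda pkt: True, "action": NORMAL}],
        99: [{"priority": 0, "match": lambda pkt: True, "action": NORMAL}],
    }
    # The "router fallback" override: higher priority than the NORMAL rule.
    tables[override_table].append({
        "priority": 10,
        "match": lambda pkt: pkt["dl_dst"] == GATEWAY_MAC,
        "action": TO_BR_TUN,
    })
    return tables


def final_action(tables, pkt, table_id="fw"):
    """Follow resubmit actions until a terminal action is reached."""
    while True:
        matching = [r for r in tables[table_id] if r["match"](pkt)]
        action = max(matching, key=lambda r: r["priority"])["action"]
        if action[0] != "resubmit":
            return action
        table_id = action[1]


pkt = {"dl_dst": GATEWAY_MAC}
before = final_action(build_tables(("resubmit", 91), 91), pkt)     # before change [1]
regressed = final_action(build_tables(NORMAL, 91), pkt)            # after change [1]
proposed = final_action(build_tables(("resubmit", 99), 99), pkt)   # proposed fix
print(before, regressed, proposed)
```

In this model the override is reached before change [1] and with the proposed table 99, but is bypassed when the firewall rule applies NORMAL directly.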