[RFE] The EVPN driver does not advertise floating IPs or router (SNAT) addresses
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ovn-bgp-agent | Fix Released | Wishlist | Unassigned |
Bug Description
The EVPN driver does not advertise the router's external gateway address nor floating IPs into EVPN. Tenant traffic that undergoes NAT (either router-based SNAT or floating-IP based SNAT+DNAT) will therefore not work, because the VRF in the external network does not have a return route to the IP in question.
The test setup is simple and consists of the following:
- An external network for allocation of router IPs and floating IPs. This is a VLAN-type provider network, but br-ex is not connected to any physical network device. Subnet prefixes are 87.238.54.0/23 and 2a02:c0:1:99::/64
- A Geneve-based tenant network inside OVN. Subnet prefixes are 10.42.42.0/24 and 2001:db8:42:42::/64
- A VM connected to the tenant network with IPs 10.42.42.123 and 2001:db8:42:42::344
- A floating IP in the external network associated with the VM: 87.238.54.105
- An OpenStack router connecting the external network with the tenant network.
- VRF 4041, which corresponds to the "internet" VRF in the upstream network.
The router/gateway chassis and the VM are co-located on the same compute node.
The router's dual-stack port on the external network and two single-stack ports on the internal network (one ipv4, one ipv6) have all had the "neutron_
The routing that ovn-bgp-agent created in the VRF is as follows:
[tore@node31-
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF vrf-4041:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 2d18h41m
C>* 10.42.42.123/32 is directly connected, lo-4041, 2d18h40m
K>* 87.238.54.89/32 [0/0] is directly connected, vlan-4041, 2d18h41m
B>* 100.64.0.0/29 [20/0] via 87.238.63.33, br-4041 onlink, weight 1, 00:01:31
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF vrf-4041:
K>* ::/0 [255/8192] unreachable (ICMP unreachable) (vrf default), 2d18h41m
C>* 2001:db8:
K>* 2a02:c0:
C * fe80::/64 is directly connected, vlan-4041, 2d18h41m
C * fe80::/64 is directly connected, br-4041, 2d18h41m
C>* fe80::/64 is directly connected, lo-4041, 2d18h41m
The IP addresses assigned to the VM (10.42.42.123 and 2001:db8:
However, the router's IP addresses (87.238.54.89/32 and 2a02:c0:
[tore@node31-
11: lo-4041: <BROADCAST,
link/ether 2e:c2:53:a6:ca:5f brd ff:ff:ff:ff:ff:ff
inet 10.42.42.123/32 scope global lo-4041
valid_lft forever preferred_lft forever
inet6 2001:db8:
valid_lft forever preferred_lft forever
They get advertised into EVPN due to the redistribute connected stanza in the VRF template added to the FRR configuration.
The router addresses, on the other hand, are instead being added as routes:
[tore@node31-
10.42.42.0/24 via 87.238.54.89
87.238.54.89 scope link
[tore@node31-
2001:db8:42:42::/64 via 2a02:c0:1:99::2c4 metric 1024 pref medium
2a02:c0:1:99::2c4 metric 1024 pref medium
This means that they are kernel routes (as opposed to connected routes, cf. the C> and K> prefixes in the FRR route output above), and they are therefore not being advertised.
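To make the connected/kernel distinction concrete, here is a minimal sketch that groups the IPv4 routes from the FRR output above by their route code. The regular expression and the `routes_by_code` helper are my own illustration, not part of FRR or ovn-bgp-agent. With only `redistribute connected` in the VRF template, only the prefixes under `C` are candidates for advertisement.

```python
import ipaddress
import re
from collections import defaultdict

# A route line starts with a one-letter code, up to two flag characters
# ('>' selected, '*' FIB route), then the prefix.
ROUTE_LINE = re.compile(r"^(?P<code>[A-Za-z])(?P<flags>[>* ]{0,2})\s+(?P<prefix>\S+)")

# IPv4 routes copied from the vrf-4041 output in this report.
SAMPLE = """\
VRF vrf-4041:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 2d18h41m
C>* 10.42.42.123/32 is directly connected, lo-4041, 2d18h40m
K>* 87.238.54.89/32 [0/0] is directly connected, vlan-4041, 2d18h41m
B>* 100.64.0.0/29 [20/0] via 87.238.63.33, br-4041 onlink, weight 1, 00:01:31
"""


def routes_by_code(text):
    """Group route prefixes by their FRR route code (C, K, B, ...)."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = ROUTE_LINE.match(line)
        if not match:
            continue
        try:
            ipaddress.ip_network(match["prefix"], strict=False)
        except ValueError:
            continue  # legend or header line, not a route
        grouped[match["code"]].append(match["prefix"])
    return dict(grouped)


routes = routes_by_code(SAMPLE)
print("connected:", routes.get("C", []))
print("kernel:", routes.get("K", []))
```

The router address 87.238.54.89/32 lands in the kernel group, which matches the observation that it is not advertised.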
Furthermore, the floating IP address (87.238.54.105) is nowhere to be seen. It's not added as an address on lo-4041 (like the VM address is), nor as a route to vlan-4041 (like the router address is). So this will obviously not work either.
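As a rough illustration of what this means for the floating IP, the sketch below does a longest-prefix match against the IPv4 routes shown in the vrf-4041 output above. The table contents are copied from that output; the `lookup` helper is hypothetical and only models route selection, not what the kernel or FRR actually executes.

```python
import ipaddress

# IPv4 routes in vrf-4041, copied from the FRR output in this report.
VRF_4041_IPV4 = {
    "0.0.0.0/0": "unreachable",
    "10.42.42.123/32": "connected, lo-4041",
    "87.238.54.89/32": "connected, vlan-4041",
    "100.64.0.0/29": "via 87.238.63.33, br-4041",
}


def lookup(address, table):
    """Return (prefix, action) of the most specific matching route, or None."""
    addr = ipaddress.ip_address(address)
    matching = [
        net for net in map(ipaddress.ip_network, table) if addr in net
    ]
    if not matching:
        return None
    best = max(matching, key=lambda net: net.prefixlen)
    return str(best), table[str(best)]


print(lookup("87.238.54.105", VRF_4041_IPV4))  # floating IP
print(lookup("10.42.42.123", VRF_4041_IPV4))   # VM address
```

The floating IP only matches the unreachable default route, whereas the VM address matches its own /32 on lo-4041.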
Now, before you object that it makes no sense to both advertise the internal tenant-net IPv4 address of the VM while at the same time using SNAT and floating IPs, I kind of agree. However, removing the neutron_bgpvpn annotations from the router's IPv4-only tenant network port makes no difference - the floating IP address and the router's external/SNAT address are still not being advertised. It does however remove the advertisement of the VM's IPv4 address (10.42.42.123), as it is removed from lo-4041, as expected. The route 10.42.42.0/24 via 87.238.54.89 on vlan-4041 also vanishes.
I note that this behaviour differs significantly from that of the BGP driver. With the BGP driver, when using the exact same setup, both the floating IP and the router IPs (both IPv4 and IPv6) are added to the bgp-nic interface (which is then imported into the underlay/default VRF thanks to import vrf bgp-vrf and advertised in BGP there).
Ideally, the EVPN driver would be capable of doing everything the BGP driver does, while additionally being VRF-aware, so that the addresses in question could be advertised into a VRF instead of into the underlay/default VRF. The BGP driver does of course work very well if the operator does not use EVPN and VRFs (in that case, the default VRF is the internet VRF and there is no underlay), but when the operator uses EVPN, the default VRF is usually the underlay VRF, which most likely does not have any internet/external access whatsoever, so advertising SNAT/floating IPs there makes no sense at all.
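One way to picture the requested behaviour, purely as a sketch: treat the floating IP and the router's external addresses like the VM address and add them as host addresses on the VRF's loopback device, so that they show up as connected routes and get picked up by `redistribute connected`. The helper below only builds the `ip address add` command lines and does not run them. The function name is hypothetical, and using lo-4041 as the target device is my assumption by analogy with what the BGP driver does with bgp-nic; it is not how ovn-bgp-agent is implemented.

```python
import ipaddress


def host_address_commands(addresses, device):
    """Build 'ip address add' argv lists that add each IP as a host address."""
    commands = []
    for raw in addresses:
        addr = ipaddress.ip_address(raw)
        # max_prefixlen is 32 for IPv4 and 128 for IPv6, i.e. a host route.
        cidr = f"{addr}/{addr.max_prefixlen}"
        commands.append(["ip", "address", "add", cidr, "dev", device])
    return commands


# Addresses taken from this report: floating IP, router IPv4, router IPv6.
nat_addresses = ["87.238.54.105", "87.238.54.89", "2a02:c0:1:99::2c4"]

for argv in host_address_commands(nat_addresses, "lo-4041"):
    print(" ".join(argv))
```

Whether the router addresses should move from vlan-4041 routes to loopback addresses, or whether the FRR template should instead redistribute the relevant kernel routes, is a design decision for the driver and is left open here.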
Changed in ovn-bgp-agent: | |
importance: | Undecided → Wishlist |
Is someone already working on that?
We have started to do some refactoring to make the EVPN driver capable of the features that at the moment only the BGP driver has.
If there is ongoing effort we could stop that, otherwise I am more than happy to create a WIP change once we have a POC ready.
Or is the overall intention more to realize this in an upcoming reimplementation of the EVPN driver using the OVN NB DB?