Exposing IPv6 IPs on the provider networks not working

Bug #2020410 reported by Luis Tomas Bolivar
This bug affects 2 people
Affects: ovn-bgp-agent
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

Exposing IPv6 IPs on the provider networks does not work unless an OVN router is plugged into that network and responds to the NS requests.

Changed in ovn-bgp-agent:
status: New → Confirmed
Revision history for this message
Maximilian Sesterhenn (msnatepg) wrote :

Hello Tomas,

I will answer here as I think it fits better.

Your observations seem legit; however, I'm unable to test this in my scenario as my code is not yet ready to work with routers properly.
However, your log output shows that the default route of your public network does not point at br-ex, but at the router.
That would explain the behavior.

To me that's a limitation of the proxy_ndp implementation in the Linux kernel, perhaps it's a feature too :P

Yesterday, I suggested adding the gateway to networking-bgpvpn so that it could be added to the proxy_ndp configuration. However, I realized that while this would work for the default route, we still have to route all traffic, including traffic for the public network itself (other instances on the same network).
This is something that is difficult to achieve with proxy_ndp.
We really need some catch-all logic here.

As proxy_ndp does not provide that functionality, I got the idea to implement an ICMPv6 NS/NA responder using OVS flows.
ODL has a very similar spec [1].
I wasn't able to test this yesterday, because my OVS deployment seems to be too old to support all fields that would be necessary for a complete NA packet.
As I'm not that familiar with OVS and OpenFlow I've sent a message to ovs-discuss, maybe someone there is able to help me.

[1] https://docs.opendaylight.org/projects/netvirt/en/latest/specs/fluorine/ovs_based_na_responder_for_gw.html

Changed in ovn-bgp-agent:
importance: Undecided → High
Revision history for this message
Maximilian Sesterhenn (msnatepg) wrote :

Hello all,

While evaluating the new EVPN functionality in the NB BGP driver, I stumbled across some feature limitations.
To be honest, I totally forgot about this issue, but it seems that it's still a problem.

As OVN / OVS is actually breaking out an L2 circuit to br-ex, we have to reply to ARP and NDP with our own MAC to be able to route the traffic.
This is not new and expected.

For IPv4, proxy_arp is enabled and answers to all ARP requests, so that we can route all the traffic.
In addition, the driver adds the IP of the GW (that was grabbed from DHCP_Options) to the local interface.
While I assume that this is not strictly necessary because proxy_arp is enabled anyway, it allows for external reachability of the GW IP as well.
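A rough sketch of what this amounts to on the node (assuming br-ex is the exposing interface; the address is made up, and the driver wires this up internally rather than via these exact commands):

# answer ARP requests on br-ex with the local MAC so the traffic gets routed here
sysctl -w net.ipv4.conf.br-ex.proxy_arp=1
# additionally expose the gateway IP (taken from DHCP_Options) on the node itself
ip addr add 203.0.113.1/32 dev br-ex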

For IPv6, the situation is a bit more difficult.
The driver configures FRR to send Router Advertisements (RAs), which instruct the VM to add an IP from an LLA prefix to its local interface.
Using that, the instance would then use the LLA IP of FRR as its default GW and not the default GW that is configured in OpenStack.
In contrast to IPv4, this allows only for reachability of the default GW and not other hosts in the same subnet which reside outside of OVN.
In addition, this behavior is neither really transparent to the user nor consistent with IPv4.
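For context, the RA part boils down to something like the following vtysh snippet (interface name and interval are just examples; the template the driver actually renders may differ):

vtysh -c 'configure terminal' -c 'interface br-ex' -c 'no ipv6 nd suppress-ra' -c 'ipv6 nd ra-interval 10'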

In my setup, I observed different behaviors of different images.
Some, like Rocky 9, seem to honor the gateway configuration from the OpenStack metadata endpoint and install the wrong default route.
Some others, like Ubuntu 22.04, seem to ignore that and just use what's advertised as part of the FRR RA announcement.

There could be a couple of solutions to these problems:

The easiest way would be to add the IP of the public GW to the local interface, just like in IPv4.
While I think it's not strictly necessary in IPv4 due to proxy_arp, it would be necessary here.
This would ensure communication with the default GW, but not with hosts external to OVN in the same VNI.
Unfortunately, the agent has no knowledge of the public GW as this information is grabbed from the DHCP_Options table.
There, the gateway information exists for IPv4, but not for IPv6.

proxy_ndp could also be a solution and is already enabled, but it does not reply to any requests until each address that you want an NA answer for is added by doing:
ip -6 neigh add proxy <ip> dev <interface>
This could be done for the default GW if DHCP_Options carried that default GW, but to route all traffic we would have to do this for each local destination in e.g. a /64.
That's not realistic.
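For reference, the per-address approach looks like this (interface and address are examples only):

# proxy_ndp only answers for addresses that were explicitly added
sysctl -w net.ipv6.conf.br-ex.proxy_ndp=1
ip -6 neigh add proxy 2001:db8:ffff::1 dev br-ex
# ...and one more entry for every destination we ever need to reach, which clearly does not scale to a /64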

My idea in the past was to replace all the proxy_arp and proxy_ndp functionality by OVS flows.
Unfortunately, the kernel datapath in OVS is missing a critical feature to set the nd_options_type field to allow for crafting a valid ICMPv6 NA packet:
https://<email address hidden>/msg46880.html
https://review.opendev.org/c/openstack/ovn-bgp-agent/+/884169
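Just to illustrate the gap: matching the NS packets is easy in OpenFlow (the flow below is only a placeholder that keeps forwarding them normally), but turning them into NAs purely in flows would additionally require rewriting the ICMPv6 type to 136, swapping the MAC/IPv6 addresses, and changing the ND option from source to target link-layer address. That last rewrite is exactly what needs nd_options_type, which the kernel datapath cannot set:

ovs-ofctl add-flow br-ex "priority=100,icmp6,icmp_type=135,actions=NORMAL"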

I could imagine a solution that would combine OVS and proxy_ndp.
First, we would use an OVS flow in br-ex to change all ICMPv6 NS requests to a common, known IP.
Like in IPv4, we want to answer all of them with our own MAC anyway, so it doesn't really matter.
T...

Read more...
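To make the combined idea a bit more concrete: if an OVS flow rewrote every NS target to one well-known address, the proxy_ndp side would shrink to a single entry (fe80::cafe and br-ex are placeholders; the OVS rewrite flow itself is the untested part):

ip -6 neigh add proxy fe80::cafe dev br-ex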

Revision history for this message
Luis Tomas (luis5tb) wrote :

Did you try adding a router to the provider network (at least as a workaround until this gets figured out)?

Do you have dhcpv6 options?
With this: "In contrast to IPv4, this allows only for reachability of the default GW and not other hosts in the same subnet which reside outside of OVN.", you mean you don't have reachability to IPs on the provider network range, but that are unknown to OpenStack/OVN, right?

I've been in discussions with another person about a more "BGP/EVPN native" way of implementing both L2 and L3 VNIs. I'll ping him for his input here.

Revision history for this message
Maximilian Sesterhenn (msnatepg) wrote :

Yes, we have multiple routers running in that provider network.

What do you mean by DHCPv6 options? IPv6 RA Mode and IPv6 Address Mode are both set to dhcpv6-stateful in the OpenStack subnet object.

Exactly, IPs on the provider subnet/VNI, but outside of OVN. The instance sends NS requests, but no one is answering them.

I'm curious about your discussions with other folks. Maybe you can share some ideas?

Revision history for this message
Tore Anderson (toreanderson) wrote :

Hi, ltomasbo asked me to share some thoughts here.

So the way I see it, problems such as these are a result of doing things in a rather non-standard way to begin with, rather than implementing EVPN support in a more normal way.

The fundamental problem is that ovn-bgp-agent does not implement L2VNI support (https://bugs.launchpad.net/ovn-bgp-agent/+bug/2017890). I'll try to explain:

Because of the lack of L2VNI support, there is no L2 connectivity between nodes on the same provider net residing on different hypervisors. However, the nodes (be they routers/cr-lrps or VMs) residing on the provider networks certainly expect L2 connectivity to be there - to them, it's just a regular VLAN, after all.

So instead, ovn-bgp-agent ends up having to resort to various dirty tricks and hacks (like ARP/ND proxy), all in order to mask the lack of L2 connectivity and make it work somehow. Unfortunately, as often happens when relying on hacks like these, some functionality ends up not being catered for correctly, requiring yet more hacks and tricks, and so on. Even if you make it all work somehow, the result gets really complicated and hard to debug.

So what I've proposed is a more fundamental rethink of how it all fits together. In a nutshell: get rid of all the hacks and replace them with regular L2VNIs. This restores L2 connectivity on provider networks between hypervisors (and also between hypervisors and devices external to OpenStack), allowing ARP/ND to work normally. That means no more need for proxy ARP/ND, ip rules, static host routes, or whatever other black magic ovn-evpn-agent has needed to do before.
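A minimal sketch of what a regular L2VNI per provider network could look like on a hypervisor with plain Linux tooling (names, VNI and VTEP address are purely illustrative, not what any agent actually configures):

# one bridge plus a VXLAN termination per provider network; EVPN type-2/3 routes are handled by FRR
ip link add br-prov type bridge
ip link add vxlan100 type vxlan id 100 local 192.0.2.10 dstport 4789 nolearning
ip link set vxlan100 master br-prov
ip link set vxlan100 up
ip link set br-prov up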

I made a demo/lab for luis5tb that shows how it could be done, described in more detail here: https://drive.redpill-linpro.com/s/xs3WpLQmPTNAMMa

If you're interested in taking a look at the lab, msnatepg, just send me an SSH pubkey and I'll add it to the authorized_keys files on the nodes.

Revision history for this message
Maximilian Sesterhenn (msnatepg) wrote :

Hi Tore,

First, thanks for your extensive investigation regarding the EVPN implementation in ovn-bgp-agent.
Also, a big thanks for the opportunity to get access to your lab.

I've worked through your lab and made a drawing to get a better understanding.
If that's helpful to others, I can share it with you so that you can integrate it into your lab description.

As far as I understand, you propose the following changes:

- Implement support for L2VNIs

- Remove proxy_arp and proxy_ndp from the L3 implementation.
- Instead, configure an L2VNI even in L3 mode to allow for connectivity between hosts on the same network.
- In addition, add the Gateway IP as an Anycast GW to an IRB interface on the hosts; this will allow for connectivity between networks and to the Internet.
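Continuing the illustrative L2VNI sketch from above, the anycast gateway IRB is simply the bridge (or a VLAN sub-interface on it) carrying the same gateway IP and MAC on every hypervisor (addresses and MAC are examples only):

ip link set br-prov address 00:00:5e:00:52:64
ip addr add 198.51.100.1/24 dev br-prov
ip addr add 2001:db8:100::1/64 dev br-prov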

While only L2 routes will be used for connectivity within a network, this still allows for L3 routes where we can route natively.
The current L3 patch is missing the L2VNI, but it already configures the GW IP onto the host for IPv4.

This feature is limited to networks with active DHCP, as the GW IP is grabbed from the DHCP_Options table of the OVN NB DB.
For IPv6, DHCP_Options does not contain a GW IP in all cases; that's why FRR is configured to send Router Advertisements.

Tests showed that this is not working as expected in all cases, as FRR is announcing an LLA GW IP while OpenStack is still providing the configured GW from Neutron.
Some images seem to honor the RA from FRR, while others seem to honor the configuration from Neutron.
I've tested with Rocky and Ubuntu and got different results.
Images that install the GW from Neutron end up with broken connectivity, since this GW IP is not configured on the host and proxy_ndp would also need explicit configuration to answer NS requests.

To avoid this, I updated my networking-bgpvpn patch [1] to also include IPAM information from Neutron in the external_ids of the Logical_Switch table, regardless of the IP version and regardless of whether DHCP is enabled.
Once integrated into the L3 implementation, this would allow us to configure the GW IPs for both IP versions on the host and disable the FRR RAs and proxy_arp / proxy_ndp.
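If it helps, the result can be inspected directly in the NB DB (the exact external_ids keys are defined by the patch and not repeated here):

ovn-nbctl --columns=name,external_ids list Logical_Switch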

Connectivity between hosts on the same network (the L2VNI) either in L2 or L3 mode is not part of this and is still an open topic.

Let me know if something of my understanding is wrong.

Maximilian

[1]: https://review.opendev.org/c/openstack/networking-bgpvpn/+/883060

Revision history for this message
Tore Anderson (toreanderson) wrote :

Hello Maximilian,

Please, do share the lab drawing!

Your understanding is correct. With an L2VNI in place, there is no longer any need for hacks such as proxy_arp/ndp.

Note that the anycast gateway IRB should be optional, perhaps enabled only if an "l3vni" annotation or some such exists in the OVN DB. It could very well be that some physical firewall appliance or similar device, connected to a physical switch external to OpenStack (reached via the L2VNI), owns the gateway IP; if so, the IRB should not be created on the hypervisor with the anycast IP/MAC.

Ideally this feature should not be limited to networks with DHCP enabled.

I note that the subnet objects in the OpenStack database contain all the necessary information for configuring the IRB, in particular the "cidr" and "gateway_ip" fields, as well as the "provider:segmentation_id" and "mtu" fields on the parent network object.

Also, all the necessary information for emitting correct ICMPv6 RAs (IFF the anycast gateway IRB is active) is present, in particular the "dns_nameservers", "ipv6_address_mode" and "ipv6_ra_mode" fields.
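For example (object names are illustrative), all of it is readily available via the API:

openstack subnet show public-v6 -c cidr -c gateway_ip -c dns_nameservers -c ipv6_address_mode -c ipv6_ra_mode
openstack network show public -c mtu -c "provider:segmentation_id"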

I do not know enough about the OVN databases to determine if all this information is also available to ovn-bgp-agent. I guess that's what your updated patch to networking-bgpvpn does, maybe?
