DVR with static routes may cause routed traffic to be dropped

Bug #1794569 reported by Peter Slovak
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Neutron version: 10.0.7
Network scenario: Openvswitch with DVR
Openvswitch version: 2.6.1
OpenStack installation version: Ocata
Operating system: Ubuntu 16.04.5 LTS
Kernel: 4.4.0-135 x86_64

Symptoms:
Instances whose default gateway is a DVR interface (10.10.255.1 in our case) occasionally lose connectivity to non-local networks: any packet that has to pass through the local virtual router is dropped. Sometimes this behavior lasts for a few milliseconds, sometimes for tens of seconds. Since floating-IP traffic is a subset of those cases, north-south connectivity breaks too.

Steps to reproduce:
- Use DVR routing mode
- Configure at least one static route in the virtual router, whose next hop is NOT an address managed by Neutron (e.g. a physical interface on a VPN gateway; in our case 10.2.0.0/24 with next-hop 10.10.0.254)
- Have an instance plugged into a Flat or VLAN network, use the virtual router as the default gateway
- Try to reach a host inside the statically-routed network from within the instance

Possible explanation:
Distributed routers get their ARP caches populated by neutron-l3-agent at its startup. The agent takes all the ports in a given subnet and fills in their IP-to-MAC mappings inside the qrouter- namespace, as permanent entries (meaning they won't expire from the cache). However, if Neutron doesn't manage an IP (as is the case with our static route's next-hop 10.10.0.254), a permanent record isn't created, naturally.

So when we try to reach a host in the statically-routed network (e.g. 10.2.0.10) from inside the instance, the packet goes to the default gateway (10.10.255.1). After it arrives at the qrouter- namespace, it matches the static route pointing to 10.10.0.254 as next-hop. However, qrouter- doesn't have the next hop's MAC address, so it sends out an ARP request with the source MAC of the distributed router's qr- interface.

And that's the problem. Since ARP requests are broadcasts, they reach pretty much every hypervisor attached to the same VLAN. Combined with the fact that qr- interfaces in a given qrouter- namespace have the same MAC address on every host, this leads to a disaster: every integration bridge receives that ARP request on the port that connects it to the Flat/VLAN network and learns that the qr- interface's MAC address is reachable there, not on the local qr- port attached to br-int. From this moment on, packets from instances that need to pass via qrouter- are forwarded to the Flat/VLAN network interface, circumventing the qrouter- namespace. This is especially problematic with traffic that needs to be SNAT-ed on its way out.
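The learning behavior described above can be illustrated with a toy model. This is only a sketch of generic MAC learning, not Open vSwitch code; the class, port names and MAC addresses are made up for the example.

```python
class LearningBridge:
    """Toy model of a bridge that learns source MACs (like a NORMAL action)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # MAC -> port where it was last seen as a source

    def receive(self, in_port, src_mac, dst_mac):
        """Learn src_mac on in_port, then return the set of output ports."""
        self.fdb[src_mac] = in_port
        if dst_mac in self.fdb:
            out = self.fdb[dst_mac]
            return {out} if out != in_port else set()
        return self.ports - {in_port}  # unknown or broadcast destination: flood


ROUTER_MAC = "fa:16:3e:00:00:01"  # hypothetical qr- MAC, identical on every host
VM_MAC = "fa:16:3e:00:00:aa"      # hypothetical instance MAC
BROADCAST = "ff:ff:ff:ff:ff:ff"

br_int = LearningBridge(["qr-port", "vm-port", "phys-port"])

# The local router namespace sends first, so the MAC is learned on qr-port.
br_int.receive("qr-port", ROUTER_MAC, VM_MAC)
before = br_int.receive("vm-port", VM_MAC, ROUTER_MAC)

# An ARP request from the same router on another hypervisor arrives from the wire.
br_int.receive("phys-port", ROUTER_MAC, BROADCAST)
after = br_int.receive("vm-port", VM_MAC, ROUTER_MAC)

print("before remote ARP:", before)
print("after remote ARP: ", after)
```

In this model the instance's traffic to the gateway goes to the local qr- port until the remote ARP request is seen, and to the physical port afterwards, which matches the symptom.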

Workarounds:
- The workaround that we used is creating stub Neutron ports for the next-hop addresses, with the correct MACs. After restarting the neutron-l3-agents, these ports were populated into the qrouter- ARP cache as permanent entries.
- Another option is to set the static route in the instances' routing tables instead of in the virtual router. This way it's the instance that performs ARP discovery, not the qrouter- namespace.
- Another workaround might consist of using ebtables/arptables on hypervisors to block incoming ARP requests from qrouters.

Possible long-term solution:
Maybe it would help if ancillary bridges (those connecting Flat/VLAN network interfaces to br-int) contained an OVS flow that drops ARP requests arriving from the physical interface with the source MAC address of a qr- interface. Since their IPs and MACs are well defined (their device_owner is "network:router_interface_distributed"), it shouldn't be a problem to set these flows up. However, I'm not sure what the shortcomings of this approach are.
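A rough sketch of what generating such flows could look like. This is not Neutron agent code: the function names, the OpenFlow port number and the priority are assumptions, and the port dicts only mimic the mac_address and device_owner fields of Neutron ports. The output is a flow specification in ovs-ofctl syntax; nothing is installed.

```python
DVR_OWNER = "network:router_interface_distributed"


def arp_drop_flow(phys_ofport, router_mac, priority=100):
    """Flow spec: drop ARP requests from router_mac entering on the physical port."""
    return (
        f"priority={priority},in_port={phys_ofport},"
        f"arp,arp_op=1,dl_src={router_mac},actions=drop"
    )


def flows_for_ports(ports, phys_ofport):
    """One drop flow per distinct MAC of a distributed router interface."""
    macs = sorted({p["mac_address"] for p in ports if p["device_owner"] == DVR_OWNER})
    return [arp_drop_flow(phys_ofport, mac) for mac in macs]


# Hypothetical port list; only the first entry is a distributed router interface.
example_ports = [
    {"mac_address": "fa:16:3e:00:00:01", "device_owner": DVR_OWNER},
    {"mac_address": "fa:16:3e:00:00:aa", "device_owner": "compute:nova"},
]

example_flows = flows_for_ports(example_ports, phys_ofport=1)
for flow in example_flows:
    print(flow)
```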

description: updated
Nate Johnston (nate-johnston) wrote :

Marking this 'invalid' since, as you suggest, Neutron 9.4.1 (Newton) reached end of life 10/25/2017, and is no longer supported upstream. If you believe this to still be an issue in master then please comment again and I will change the status appropriately.

Changed in neutron:
status: New → Invalid
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

This behavior may also apply to the master branch.
We do have an OVS flow rule in the br-tun bridge, but not in br-int.

You have mentioned that the ARP request is sent out and reaches all hypervisors (I am not sure how it gets propagated). Is it just because the network is vlan/flat that it does not hit the br-tun rule?

tags: added: l3-dvr-backlog
Peter Slovak (slovak-peto) wrote :

Yes Swaminathan, the ARP request originating from qrouter- gets propagated to all hypervisors precisely because the router's qr- interface is in a Flat or VLAN network. The network infrastructure then takes care of the propagation.

I believe br-tun and tunnel networks are safe because a) packets are processed and decapsulated by flows (not NORMAL OVS rules, which cause the MAC learning) and b) many deployments use arp_responder, which prevents the ARP request from being broadcast into the tunnel. But these are just my assumptions; I haven't tested this scenario.

Also, I believe we'll be able to test the bug in a currently supported release in a couple of months, after we upgrade. If someone could run this test in a lab environment until then, that would be great.

Peter Slovak (slovak-peto) wrote :

Sorry for the late comment, but this issue is still present in Ocata. Unfortunately I can't confirm or deny its presence in later versions just yet, but if it had been fixed, I don't see why the patch wouldn't have been backported to Ocata. So I'm assuming it's present in later versions too, unless it has been fixed silently.

Anyway, could you please reopen this, Nate?

I think in the meantime, we'll have to resort to an ebtables rule hotfix that simply drops ARP packets with the virtual router gateway's source MAC address coming from "outside" the host. It turns out that a rogue ARP response may come from the other DVRs even when not using static routes in routers - but that's normal, since every once in a while, a host with its default GW set to the router's interface will ask for its MAC.

tags: added: arp fip floatingip vlan
Changed in neutron:
status: Invalid → New
description: updated
Peter Slovak (slovak-peto) wrote :

We hotfixed the issue by adding a flow to the particular flat network bridge (e.g. br-netX, not br-int). The flow basically drops all traffic coming from the physical interface that has the source MAC address of the distributed router's gateway interface.

However, be aware that this flow may only be added on compute nodes, not on controllers (or, specifically, network nodes; we use controllers as net nodes). This is because network nodes perform SNAT for north/south traffic, so any incoming traffic from the DVR's gateway MAC address is legitimate there.
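For illustration, the hotfix and its compute-only restriction could be expressed roughly like this. It is a sketch, not the exact flow we deployed: the role names, the OpenFlow port number, the priority and the MAC are placeholders, and the script only builds the ovs-ofctl argument list without running it.

```python
def hotfix_flow(node_role, phys_ofport, gateway_mac, priority=100):
    """Return a drop-flow spec for compute nodes, or None for any other role.

    Network nodes are skipped because traffic from the DVR gateway MAC is
    legitimate there (they perform SNAT for north/south traffic).
    """
    if node_role != "compute":
        return None
    return f"priority={priority},in_port={phys_ofport},dl_src={gateway_mac},actions=drop"


def add_flow_argv(bridge, flow):
    """Argument list for 'ovs-ofctl add-flow'; the caller decides whether to run it."""
    return ["ovs-ofctl", "add-flow", bridge, flow]


GATEWAY_MAC = "fa:16:3e:00:00:01"  # placeholder for the router gateway interface MAC

for role in ("compute", "network"):
    flow = hotfix_flow(role, phys_ofport=1, gateway_mac=GATEWAY_MAC)
    if flow is None:
        print(f"{role}: no flow added")
    else:
        print(f"{role}: {' '.join(add_flow_argv('br-netX', flow))}")
```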
