Add option to disable vrf leak to default

Bug #2056799 reported by Jay Jahns
This bug affects 1 person
Affects: ovn-bgp-agent
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: -

Bug Description

By default, ovn_bgp_driver leaks the VRF to the default routing table. When a separate transit network is configured, this can cause asymmetric routing: incoming packets are received on one interface, but outgoing packets are sent out a different interface.

Example:
- eth0: management, default route
- eth1: external interface, connected to br-ex
- a transit network was configured on vlan 10, which is trunked to eth1
- VM floating IPs and tenant networks need to be isolated from management and advertised out the transit network

This could be made configurable through the conf and implemented in the Jinja template, so that the leak can be turned off when the networks need true isolation. It would also allow the transit network to be the default route out, and remove the need to set up policy-based routing, which doesn't appear to work correctly in the FRR implementation.
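
As a rough sketch of the idea (not the agent's actual code), the toggle could gate the stanza that imports the VRF into the default BGP instance. The option name `leak_vrf_to_default`, the function name, and the example AS and VRF values are hypothetical. Plain Python string building stands in for the Jinja template to keep the sketch dependency-free, and the FRR stanzas are approximations rather than copies of the agent's template.

```python
def render_frr_vrf_config(bgp_as: int, vrf_name: str,
                          leak_vrf_to_default: bool = True) -> str:
    """Render an approximate FRR config for a BGP VRF instance.

    `leak_vrf_to_default` is a hypothetical option: when False, the
    stanza that imports the VRF into the default instance is omitted.
    """
    lines = [
        # Per-VRF BGP instance that picks up the exposed addresses.
        f"router bgp {bgp_as} vrf {vrf_name}",
        " address-family ipv4 unicast",
        "  redistribute connected",
        " exit-address-family",
        "!",
    ]
    if leak_vrf_to_default:
        # Leak: import the VRF's routes into the default BGP instance.
        lines += [
            f"router bgp {bgp_as}",
            " address-family ipv4 unicast",
            f"  import vrf {vrf_name}",
            " exit-address-family",
            "!",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; substitute the AS and VRF name in use.
    print(render_frr_vrf_config(64999, "bgp-vrf", leak_vrf_to_default=False))
```

With the flag off, only the per-VRF instance is rendered, so no `import vrf` line is emitted.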

Luis Tomas Bolivar (ltomasbo) wrote :

I'm not sure I completely understand this. That would require learning routes on the computes, right? Or having some default routes to the CIDRs that you want to use on eth1?

How do you make the transit network the default route out? The VRF leaking is about sending the routes on the NIC/peer that was configured in the FRR template, not about creating default routes on the local host.

I'm not sure if what you are looking for is the EVPN support that is being added here instead? https://review.opendev.org/c/openstack/ovn-bgp-agent/+/906505

Dmitriy Rabotyagov (noonedeadpunk) wrote :

Well, I think I had the same use case, kind of.

So I did the following:

1. Created a new routing table:
echo "20 ext" >> /etc/iproute2/rt_tables

2. Added a default route to the table, where 198.51.100.1 is basically a leaf:
ip r add default via 198.51.100.1 dev eth1 scope link table ext

3. Created IP rule to forward public networks to this table, where 203.0.113.0/24 is a public network:
ip rule add from 203.0.113.0/24 table ext

Jay Jahns (jayjahns) wrote :

That's a policy based route, which works.

However, adding a default route to a table and a rule is not persistent across reboots.

Also - configuring PBR through FRR does not yield the same results.

In kolla environments, there is, by default, a management/api/overlay interface and an external interface.

If I want my default route to be out of the external interface, there are two ways to do it - place my BGP configuration entirely in the VRF and not leak the VRF, or use policy-based routing.

We'd like to avoid policy routes if at all possible.

Jay Jahns (jayjahns) wrote :

I think the better way to describe this item is enabling VRF isolation for the floating IPs and/or tenant networks that we do not want reaching the kernel routing table of the compute/control node.

Leaking the VRF directly into the default routing table poses a significant security risk: VMs with floating IPs can access the IP of the compute node directly.

Example:

2 NICs
- eth0 (management/api/tunnel)
- eth1 (bgp peering interface)

Assuming that the default route is eth0 for everything on the compute node, without a policy-based route, traffic outbound from VM floating IPs would go out the eth0 interface. This happens irrespective of any BGP peering / default-originate configuration.

The compute node can then access the floating IP directly, while the VM can access the compute node directly.

With a policy-based route, we ensure that traffic outbound from the floating IP can go out eth1. However, the VM can still reach the compute node (and everything on that network) directly.

This is a substantial concern, and we need a way to prevent it from occurring.
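
One way to see why the policy-based route alone does not isolate the compute node is a toy model of the Linux rule lookup order. This is only a sketch under stated assumptions: the 192.0.2.0/24 management addresses are made up for illustration, the rule priorities mirror the usual defaults (the local table is consulted first at priority 0), and the model ignores OVN/OVS entirely.

```python
import ipaddress


def net(cidr):
    return ipaddress.ip_network(cidr)


# Assumed addressing: compute node is 192.0.2.10 on eth0 (management),
# 203.0.113.0/24 holds the floating IPs, eth1 faces the BGP peer.
TABLES = {
    "local": [(net("192.0.2.10/32"), "local delivery")],
    "ext": [(net("0.0.0.0/0"), "eth1")],
    "main": [(net("192.0.2.0/24"), "eth0"), (net("0.0.0.0/0"), "eth0")],
}

# (priority, source prefix, table); lower priority is evaluated first.
RULES = [
    (0, net("0.0.0.0/0"), "local"),
    (32765, net("203.0.113.0/24"), "ext"),  # the added policy rule
    (32766, net("0.0.0.0/0"), "main"),
]


def lookup(src, dst):
    """Return (table, output) for the first rule whose table has a match."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for _prio, src_prefix, table in sorted(RULES, key=lambda r: r[0]):
        if s not in src_prefix:
            continue
        matches = [(p, out) for p, out in TABLES[table] if d in p]
        if matches:
            # Longest-prefix match within the selected table.
            _prefix, out = max(matches, key=lambda m: m[0].prefixlen)
            return table, out
    return None


if __name__ == "__main__":
    # Floating IP to an outside host: steered out eth1 by the policy rule.
    print(lookup("203.0.113.10", "198.51.100.200"))
    # Floating IP to the compute node's own address: the local table wins.
    print(lookup("203.0.113.10", "192.0.2.10"))
```

In this model the policy rule only changes the egress path for forwarded traffic. Packets addressed to the node itself match the local table before the policy rule is ever consulted.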

Luis Tomas Bolivar (ltomasbo) wrote :

OK, I think I understand your concern now. Currently the BGP driver (with the default exposing mechanism) is based on kernel routing, so it requires routes in the kernel and VRF leaking to work. That said, I think there are two solutions for this:
- We have the ovn driver, which uses OVN routing instead of kernel routing, so in that case eth0 can be completely isolated from the BGP NICs.
- I have never tried this, but you could leak the VRF into another VRF and make all your advertisements in a specific VRF (instead of the default one). It may require some (I hope minimal) changes to the code base, but it would be a nice addition.

Jay Jahns (jayjahns) wrote :

FRR will allow leaking into a different VRF, but that doesn't appear to do what is needed, since br-ex relies on the kernel routing table to forward the traffic all the way through.

I think management traffic on the node would have to be pinned in a similar fashion to how Cumulus does VRFs, and I don't know whether that will work well with kolla-ansible.
