[OVN] A route inferred from a subnet's default gateway is not added to ovn-nb if segment_id is not None for a subnet

Bug #2003842 reported by Dmitrii Shcherbakov
This bug affects 1 person
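+---------+---------+------------+-------------+-----------+
| Affects | Status  | Importance | Assigned to | Milestone |
+---------+---------+------------+-------------+-----------+
| neutron | Invalid | Medium     | Unassigned  |           |
+---------+---------+------------+-------------+-----------+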

Bug Description

Context:

* Neutron is configured to use OVN
* An external provider network with one segment is created
* A subnet with a default gateway IP set is associated with this segment explicitly (segment_id != None)
* A router's gateway port is set to use the provider network (external_gateway_info is set with a network_id passed)

Result: OVN NB does not contain a default route and instance traffic is blackholed.

--
Detailed description:

Setting the external gateway info for the first time as follows

$ openstack router set --external-gateway pubnet r1

does not result in OVN getting a default route with the next-hop set to the subnet's gateway IP:

$ sudo ovn-nbctl list logical_router_static_route ; echo $?
0

Doing it twice in a row does (the default route appears in the table after the second command):

$ openstack router set --external-gateway pubnet r1 && openstack router set --external-gateway pubnet r1

$ sudo ovn-nbctl list logical_router_static_route
_uuid : df7c6020-83e7-446c-8f5c-31db96eb2dd3
bfd : []
external_ids : {"neutron:is_ext_gw"="true", "neutron:subnet_id"="abdae752-034c-4845-b6b3-92bf40cf24a6"}
ip_prefix : "0.0.0.0/0"
nexthop : "10.1.1.1"
options : {}
output_port : []
policy : []
route_table : ""

The inferred route is normally installed by this portion of code:
https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1264-L1279

Based on the result from _get_gw_info:
https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1197-L1204

`_get_gw_info` returns an empty list since `external_fixed_ips` is an empty list:

self._l3_plugin.get_router(context, 'd51ec4b0-c847-41e0-b43d-5dbf8ddcca32')
{'id': 'd51ec4b0-c847-41e0-b43d-5dbf8ddcca32', 'name': 'r1', 'tenant_id': 'dbfcc6c6a50f481685fda546abd00cd3', 'admin_state_up': True, 'status': 'ACTIVE', 'external_gateway_info': {'network_id': 'eef0120b-d01f-4cf7-9d1a-65f1da1eb67c', 'external_fixed_ips': [], 'enable_snat': True}, 'gw_port_id': '2da99728-b04e-4a4f-ac6f-d0930de8264a', 'description': '', 'availability_zones': [], 'distributed': False, 'ha': False, 'ha_vr_id': 0, 'availability_zone_hints': [], 'routes': [], 'tags': [], 'created_at': '2023-01-20T09:45:55Z', 'updated_at': '2023-01-24T12:44:14Z', 'revision_number': 35, 'project_id': 'dbfcc6c6a50f481685fda546abd00cd3'}
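
To make the dependency on `external_fixed_ips` concrete, here is a minimal sketch of how a next-hop can be inferred from a router's gateway info. It is not Neutron's `_get_gw_info`; the helper name `infer_gateway_next_hops`, the plain-dict subnet lookup and the fixed IP `10.1.1.10` are assumptions made for illustration. The IDs and the gateway IP `10.1.1.1` are taken from the output in this report.

```python
from typing import Any, Dict, List


def infer_gateway_next_hops(router: Dict[str, Any],
                            subnets: Dict[str, Dict[str, Any]]) -> List[str]:
    """Hypothetical helper (not Neutron code): collect the gateway IP of
    every subnet on which the router's gateway port has a fixed IP."""
    gw_info = router.get("external_gateway_info") or {}
    next_hops = []
    for fixed_ip in gw_info.get("external_fixed_ips", []):
        subnet = subnets.get(fixed_ip["subnet_id"], {})
        gateway_ip = subnet.get("gateway_ip")
        if gateway_ip:
            next_hops.append(gateway_ip)
    return next_hops


# Subnet ID and gateway IP as seen in the ovn-nbctl output of this report.
SUBNETS = {"abdae752-034c-4845-b6b3-92bf40cf24a6": {"gateway_ip": "10.1.1.1"}}

# Gateway port still unbound: no fixed IPs, as in the get_router() output.
unbound_router = {
    "external_gateway_info": {
        "network_id": "eef0120b-d01f-4cf7-9d1a-65f1da1eb67c",
        "external_fixed_ips": [],
        "enable_snat": True,
    }
}

# Same router once an IP has been allocated (10.1.1.10 is an assumed value).
bound_router = {
    "external_gateway_info": {
        "network_id": "eef0120b-d01f-4cf7-9d1a-65f1da1eb67c",
        "external_fixed_ips": [
            {"subnet_id": "abdae752-034c-4845-b6b3-92bf40cf24a6",
             "ip_address": "10.1.1.10"},
        ],
        "enable_snat": True,
    }
}

print(infer_gateway_next_hops(unbound_router, SUBNETS))  # []
print(infer_gateway_next_hops(bound_router, SUBNETS))    # ['10.1.1.1']
```

With an empty `external_fixed_ips` list the loop never runs, so no next-hop is found and no default route can be derived from the subnet's gateway IP.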

Meanwhile, the `external_fixed_ips` field is empty because of the deferred IPAM logic triggered by the presence of `segment_id != None` for the subnet on the external network. Based on this logic, the port is unbound and does not get an IP allocation until a port update & port binding:

https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/objects/subnet.py#L341-L343 (subnets attached to segments are excluded if a host isn't known)
https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/objects/subnet.py#L481-L486 (ipam_exceptions.DeferIpam is raised)
https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/db/db_base_plugin_v2.py#L1472-L1478 (DeferIpam is caught and the port is left without fixed IPs; the port output below reports its ip_allocation as deferred)
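
The sequence behind those three links can be sketched roughly as follows. This is a simplified stand-in and not the Neutron implementation: the `DeferIpam` class here is a local placeholder, `candidate_subnets` and `create_port` are hypothetical names, and the host-to-segment mapping is reduced to "any host makes all subnets eligible", which is an assumption made to keep the example short.

```python
class DeferIpam(Exception):
    """Local placeholder for ipam_exceptions.DeferIpam."""


def candidate_subnets(subnets, host):
    """Hypothetical filter: subnets attached to a segment are excluded
    while the port's host is unknown."""
    if host:
        # Simplification (assumption): a real deployment would only keep
        # subnets whose segment is mapped to this host.
        return list(subnets)
    unsegmented = [s for s in subnets if s.get("segment_id") is None]
    if subnets and not unsegmented:
        # Every subnet sits on a segment, so allocation has to wait.
        raise DeferIpam()
    return unsegmented


def create_port(subnets, host=None):
    """Hypothetical port creation: allocate now or defer the allocation."""
    try:
        chosen = candidate_subnets(subnets, host)
        fixed_ips = [{"subnet_id": s["id"]} for s in chosen[:1]]
        ip_allocation = "immediate"
    except DeferIpam:
        fixed_ips = []
        ip_allocation = "deferred"
    return {
        "binding_host_id": host or "",
        "fixed_ips": fixed_ips,
        "ip_allocation": ip_allocation,
    }


# The segment ID is a made-up value; the subnet ID comes from this report.
external_subnets = [
    {"id": "abdae752-034c-4845-b6b3-92bf40cf24a6", "segment_id": "seg-1"},
]

unbound = create_port(external_subnets)               # router gateway port
bound = create_port(external_subnets, host="node-1")  # "node-1" is assumed

print(unbound["ip_allocation"], unbound["fixed_ips"])
print(bound["ip_allocation"], bound["fixed_ips"])
```

In this sketch the unbound gateway port ends up with no fixed IPs and a deferred allocation, which matches the port state shown in this report.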

Port state right after it is created in the unbound state (the code that adds the default route looks for fixed IPs while the gateway port is still unbound and has none):

openstack port list --router r1
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| ID | Name | MAC Address | Fixed IP Addresses | Status |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| 2da99728-b04e-4a4f-ac6f-d0930de8264a | | fa:16:3e:eb:cf:76 | | DOWN |
| 97d604f2-addb-46b8-9eaf-745257dddb2f | | fa:16:3e:c8:73:8b | ip_address='192.168.0.1', subnet_id='89227e7b-d2b0-4953-afe7-2b471736f85a' | ACTIVE |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+

openstack port show 2da99728-b04e-4a4f-ac6f-d0930de8264a
+-------------------------+--------------------------------------+
| Field | Value |
+-------------------------+--------------------------------------+
| admin_state_up | UP |
| allowed_address_pairs | |
| binding_host_id | |
| binding_profile | |
| binding_vif_details | |
| binding_vif_type | unbound |
| binding_vnic_type | normal |
| created_at | 2023-01-24T12:42:44Z |
| data_plane_status | None |
| description | |
| device_id | d51ec4b0-c847-41e0-b43d-5dbf8ddcca32 |
| device_owner | network:router_gateway |
| device_profile | None |
| dns_assignment | None |
| dns_domain | None |
| dns_name | None |
| extra_dhcp_opts | |
| fixed_ips | |
| id | 2da99728-b04e-4a4f-ac6f-d0930de8264a |
| ip_allocation | deferred |
| mac_address | fa:16:3e:eb:cf:76 |
| name | |
| network_id | eef0120b-d01f-4cf7-9d1a-65f1da1eb67c |
| numa_affinity_policy | None |
| port_security_enabled | False |
| project_id | |
| propagate_uplink_status | None |
| qos_network_policy_id | None |
| qos_policy_id | None |
| resource_request | None |
| revision_number | 1 |
| security_group_ids | |
| status | DOWN |
| tags | |
| trunk_details | None |
| updated_at | 2023-01-24T12:42:44Z |
+-------------------------+--------------------------------------+

Tested on Yoga, references are for the master branch.

Tags: ovn
tags: added: ovn
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Dmitrii:

This is not an OVN error but an error during the router GW port creation. When the router GW port is first created, the port host is not assigned yet because the port has not been bound. Because of [1], the IPAM module doesn't return any valid subnet and thus the GW port has no fixed IPs on any subnet. This is what you have already described in the bug description.

This is because the routed provider networks functionality, which allows multiple segments to be attached to a network, is supposed to leverage the L3 routing capability of the underlying network. That means the routing is done outside Neutron. What Neutron needs to configure is the host segments according to the physical deployment.

In other words, a Neutron router can't be connected as a GW router on a routed provider network. In fact [3] applies only to ML2/OVS. This functionality, as you have seen, doesn't work with ML2/OVN.

Regards.

[1]https://github.com/openstack/neutron/blob/21927e79075ce0f3e521e56fca0bed8f1de61066/neutron/objects/subnet.py#L332-L343
[2]https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html
[3]https://review.opendev.org/c/openstack/neutron/+/791178

Changed in neutron:
status: Confirmed → Invalid
Dmitrii Shcherbakov (dmitriis) wrote :

Hi Rodolfo,

Yes, I understand the unbound port logic with routed provider networks is there to defer IP allocation until the moment when the segment (and its subnet) is known.

Do you think this LP is "invalid" because it's part of a new feature rather than a bug?

I have seen a spec merged for floating IPs on routed provider networks which implies attaching routers to routed provider networks:

https://review.opendev.org/c/openstack/neutron-specs/+/486450/17/specs/victoria/routed-networks-floating-ips.rst
https://review.opendev.org/c/openstack/neutron/+/669395/
https://review.opendev.org/q/topic:bug%252F1667329

So this would be a scenario for an L3 leaf & spine topology, but it requires more advanced integration & dynamic routing between the fabric and OpenStack.

I understand if functionality like this would make more sense to add as part of new feature development (potentially separately for ML2/OVS and ML2/OVN) for the referenced spec.

Where I was going with this is the following: OVN has built-in reachability checks [**] for the next-hops of routes added to a router (the Logical_Router_Static_Route NB table). Even if static routes are added to a router, OVN determines which of them actually get logical flows, so traffic won't be blackholed because of unreachable next-hops of extra routes or ECMP default routes with next-hops on different segments.

OVN also includes the `ecmp_symmetric_reply` route option (not yet used in Neutron), which could be enabled to avoid asymmetric replies in the presence of ECMP routes (for the non-distributed routing case only, since it relies on conntrack):

https://github.com/ovn-org/ovn/blob/v22.12.0/ovn-nb.xml#L3312-L3319
https://github.com/ovn-org/ovn/commit/4fdca656857d4a5caeec35ae813888cb9e403e5e

But before any of this can be used, the fundamental problem is that the inferred default route is not added, which I wanted to record in the form of this bug.

=====
[**]

When OVN adds logical flows for routes (whether ECMP or not)

https://github.com/ovn-org/ovn/blob/1207ae69f358c515f01a1c4451864cad6ca23406/northd/northd.c#L10377-L10389 build_static_route_flow
https://github.com/ovn-org/ovn/blob/1207ae69f358c515f01a1c4451864cad6ca23406/northd/northd.c#L10228-L10275 build_ecmp_route_flow

it does an overlap and reachability check - if a next-hop of a route is not reachable via a port (based on its assigned IP and direct connectivity of a subnet), logical flows for it are not added.

https://github.com/ovn-org/ovn/blob/1207ae69f358c515f01a1c4451864cad6ca23406/northd/northd.c#L9997-L10016 (find_static_route_outport)

This is similar to selective addition of routes to the routing table of an L3 namespace with ML2/OVS.
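
As a rough illustration of the reachability idea described above (not a port of northd's `find_static_route_outport`), the check can be thought of as "does any router port have an address whose directly connected network contains the next-hop". The function names, port names and the address `10.1.1.5/24` are assumptions; only `10.1.1.1` and `192.168.0.1` appear in this report.

```python
import ipaddress


def find_outport(next_hop, router_ports):
    """Hypothetical check: return the first port whose directly connected
    network contains next_hop, or None if no port can reach it."""
    hop = ipaddress.ip_address(next_hop)
    for name, cidrs in router_ports.items():
        for cidr in cidrs:
            network = ipaddress.ip_interface(cidr).network
            if hop.version == network.version and hop in network:
                return name
    return None


def installable_routes(routes, router_ports):
    """Keep only the routes whose next-hop is reachable via some port."""
    return [r for r in routes if find_outport(r["nexthop"], router_ports)]


routes = [
    {"ip_prefix": "0.0.0.0/0", "nexthop": "10.1.1.1"},
    {"ip_prefix": "0.0.0.0/0", "nexthop": "10.2.2.1"},
]

# Port names are made up; a gateway port without addresses reaches nothing.
addressed_ports = {"lrp-gw": ["10.1.1.5/24"], "lrp-int": ["192.168.0.1/24"]}
unaddressed_ports = {"lrp-gw": [], "lrp-int": ["192.168.0.1/24"]}

print(installable_routes(routes, addressed_ports))
print(installable_routes(routes, unaddressed_ports))
```

In this sketch only the route via 10.1.1.1 survives when the gateway port has an address on 10.1.1.0/24, and none survive while the gateway port has no addresses.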
