[NB Driver] FIP is not announced from OVN gateway when DVR is disabled

Bug #2056477 reported by Dmitriy Rabotyagov
This bug affects 2 people
Affects         Status   Importance   Assigned to      Milestone
ovn-bgp-agent   New      Critical     Jakub Libosvar

Bug Description

In a scenario where enable_distributed_floating_ip is set to False (the default behaviour), ovn-bgp-agent does not attempt to announce the FIP from the gateway node, although it is expected to.

This happens because the watcher reads the `requested-chassis` field while filtering events. The LogicalSwitchPortFIPCreateEvent class asks for the chassis id:

https://opendev.org/openstack/ovn-bgp-agent/src/commit/c00139d5598324b037944f3560323d9de010a2ec/ovn_bgp_agent/drivers/openstack/watchers/nb_bgp_watcher.py#L176

This returns the UUID of the compute node rather than the gateway chassis:
https://opendev.org/openstack/ovn-bgp-agent/src/commit/c00139d5598324b037944f3560323d9de010a2ec/ovn_bgp_agent/drivers/openstack/watchers/base_watcher.py#L114

On the other hand, to be fair, `ovn-nbctl list Logical_Switch_Port b9c80bef-7962-4db3-8b31-c0c53c75f9c5` does not contain any information that would map such a FIP to a gateway. The port is only mapped to the compute node/VM:

# ovn-nbctl list Logical_Switch_Port b9c80bef-7962-4db3-8b31-c0c53c75f9c5
_uuid : b9c80bef-7962-4db3-8b31-c0c53c75f9c5
addresses : ["fa:16:3e:bc:d3:f0 172.16.0.227"]
dhcpv4_options : 1c71a1ff-3d3a-4e5c-91e1-9caa9d5c0c2a
dhcpv6_options : []
dynamic_addresses : []
enabled : true
external_ids : {"neutron:cidrs"="172.16.0.227/24", "neutron:device_id"="4d8eb566-76c0-4c50-8355-ae847cccf0bb", "neutron:device_owner"="compute:az1", "neutron:host_id"=os-compute02-az1, "neutron:network_name"=neutron-50854f6d-184e-4f19-a5cf-fbc2d51720f4, "neutron:port_capabilities"="", "neutron:port_fip"="ip.add.re.ss", "neutron:port_name"="", "neutron:project_id"="9def30ac22ec46fb82a6bb531d50023b", "neutron:revision_number"="4", "neutron:security_group_ids"="93f165b2-3969-4833-8771-ac1185c3dcfe", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
ha_chassis_group : []
mirror_rules : []
name : "0bc9af90-8e0b-4941-9720-794f5d41e440"
options : {requested-chassis=os-compute02-az1}
parent_name : []
port_security : ["fa:16:3e:bc:d3:f0 172.16.0.227"]
tag : []
tag_request : []
type : ""
up : true
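
To illustrate why the gateway node never picks up this event, here is a minimal sketch of a chassis-based filter. It is not the agent's actual code: the `event_matches` helper, the dict layout standing in for the Logical_Switch_Port row, and the gateway name `os-gateway01` are all assumptions made for this example. Only the `requested-chassis` and `neutron:port_fip` values come from the record above.

```python
# Hypothetical stand-in for the Logical_Switch_Port row shown above,
# reduced to the two columns that matter for the filter.
lsp = {
    "external_ids": {"neutron:port_fip": "ip.add.re.ss"},
    "options": {"requested-chassis": "os-compute02-az1"},
}


def event_matches(lsp_row, local_chassis):
    """Simplified filter (assumption, not the real watcher logic).

    Accept the FIP event only when the port carries a FIP and its
    requested-chassis equals the chassis this agent runs on.
    """
    if not lsp_row["external_ids"].get("neutron:port_fip"):
        return False
    return lsp_row["options"].get("requested-chassis") == local_chassis


# The compute node hosting the VM matches the event.
print(event_matches(lsp, "os-compute02-az1"))
# A gateway node (hypothetical name) never matches, so with DVR
# disabled nobody announces the FIP from the gateway.
print(event_matches(lsp, "os-gateway01"))
```

With a filter of this shape, the only agent that can react is the one on the compute node, which is the wrong place to announce a centralized FIP.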

description: updated
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Ok, I actually tried to fix this myself, or at least to get the much-needed chassis info for the FIP, and pretty much failed. This was on top of stable/bobcat in terms of Neutron, with OVN 23.09.

So I went on and checked the NAT table. From NAT.external_ids we can take neutron:router_name.
But the Logical_Router itself is not bound to any chassis; only a Logical_Router_Port is. So would we then need to iterate over the ports somehow, to find the one that is bound and is the gateway for the network in question?

But even then, the binding is a UUID, while self.agent.chassis is a name (FQDN). And the Chassis table in NB is just empty for me. So to get a mapping from a chassis UUID to its name, I'd need to go to the SB DB. But this is the NB agent, so it really should not go to the SB DB.
And I got pretty much cornered at this point.

Not to mention that there's no obvious way to get the required NAT id. We could probably watch for NAT events instead, in which case that would not be an issue. But it really feels like more data should be populated in OVN for such a flow to work properly.

summary: - FIP is not announced from OVN gateway when DVR is disabled
+ [NB Driver] FIP is not announced from OVN gateway when DVR is disabled
Changed in ovn-bgp-agent:
importance: Undecided → Critical
Luis Tomas Bolivar (ltomasbo) wrote :

Right now the check is on the LSP: it looks at the FIP information added to the external_ids of the LSP of the port that gets the FIP.

I think we need to add an extra check here:
- Check the NAT table entry associated with that logical port
- If it has external_mac set (meaning DVR is enabled), then continue as we do now
- If it has no external_mac (meaning no DVR), then we need to check the gateway_port of the NAT entry
- With that ID, retrieve the associated Logical Router Port and get the hosting-chassis from its status section
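
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the agent's implementation: the `fip_chassis` function is hypothetical, NB rows are modelled as plain dicts, optional columns are represented as `None`, and the UUID, MAC and gateway name in the sample data are made up. The column and key names (`external_mac`, `gateway_port`, `status`, `hosting-chassis`, `requested-chassis`) are the ones mentioned in this report.

```python
def fip_chassis(nat, lsp, lrps_by_uuid):
    """Return the chassis expected to announce the FIP, or None.

    nat          -- dict standing in for a NAT row (assumed layout)
    lsp          -- dict standing in for the Logical_Switch_Port row
    lrps_by_uuid -- dict mapping Logical_Router_Port UUID -> row dict
    """
    # external_mac set means DVR: keep the current behaviour and use
    # the chassis the VM port is requested on.
    if nat.get("external_mac"):
        return lsp.get("options", {}).get("requested-chassis")

    # No external_mac means a centralized FIP: follow gateway_port to
    # the Logical_Router_Port and read hosting-chassis from its status.
    gw_port_uuid = nat.get("gateway_port")
    if gw_port_uuid is None:
        return None
    lrp = lrps_by_uuid.get(gw_port_uuid)
    if lrp is None:
        return None
    return lrp.get("status", {}).get("hosting-chassis")


# Sample data; every identifier except the compute hostname is made up.
lsp = {"options": {"requested-chassis": "os-compute02-az1"}}
lrps = {"lrp-uuid-1": {"status": {"hosting-chassis": "os-gateway01"}}}
centralized_nat = {"external_mac": None, "gateway_port": "lrp-uuid-1"}
dvr_nat = {"external_mac": "fa:16:3e:00:00:01", "gateway_port": "lrp-uuid-1"}

print(fip_chassis(centralized_nat, lsp, lrps))
print(fip_chassis(dvr_nat, lsp, lrps))
```

Whatever identifier hosting-chassis holds would still have to be compared against the agent's own chassis, which is the UUID-versus-name concern raised in the earlier comment.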

Changed in ovn-bgp-agent:
assignee: nobody → Jakub Libosvar (libosvar)