[OVN] Neutron Service Restart Disrupts Octavia OVN Load Balancer Floating IP

Bug #2042938 reported by Bartosz Bezak
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Luis Tomas

Bug Description

Description:
Following a Neutron service restart, an Octavia OVN load balancer with a floating IP stops processing traffic (DVR FIP is enabled).

Affected Versions:
- Neutron: Stable/Yoga (commit 03ed5578f7)
- Octavia: Stable/Yoga (commit 83955125)
- OVN: Version 22.12 (package ovn22.12-22.12.0-34.el9s.x86_64)

Details:
It looks like the 'external_mac' attribute for the load balancer's floating IP changes when Neutron is restarted.

It starts working again after detaching the FIP from the LB VIP and attaching it again.

Here is a NAT entry before the neutron server restart:

ovn-nbctl find NAT type=dnat_and_snat external_ip=128.130.194.85
_uuid : 438dff1d-3245-491c-a534-0a33e395ab2c
allowed_ext_ips : []
exempted_ext_ips : []
external_ids : {"neutron:fip_external_mac"="fa:16:3e:cb:96:e7", "neutron:fip_id"="36fb8531-3a33-453e-be46-64599976aa0e", "neutron:fip_network_id"="1f115e41-cc7f-4bb5-8493-995d035277dc", "neutron:fip_port_id"="9ef678e5-add1-47f5-ac7c-a23f4ee753bb", "neutron:revision_number"="50", "neutron:router_name"=neutron-96bd0fd6-bc77-4b17-a5b4-e79c20b448ce}
external_ip : "128.130.194.85"
external_mac : []
external_port_range : ""
gateway_port : []
logical_ip : "192.168.100.124"
logical_port : "9ef678e5-add1-47f5-ac7c-a23f4ee753bb"
options : {}
type : dnat_and_snat

After the neutron server restart:

ovn-nbctl find NAT type=dnat_and_snat external_ip=128.130.194.8
_uuid : 028a9017-544b-49e6-82cb-86cb1bdfab54
allowed_ext_ips : []
exempted_ext_ips : []
external_ids : {"neutron:fip_external_mac"="fa:16:3e:f1:08:95", "neutron:fip_id"="3bbd5870-54ba-4941-a85a-29691d8b7aae", "neutron:fip_network_id"="1f115e41-cc7f-4bb5-8493-995d035277dc", "neutron:fip_port_id"="f2791464-91b6-4ace-8255-5e4aaf9785e0", "neutron:revision_number"="30", "neutron:router_name"=neutron-6d32d3de-38cc-4175-9764-5bdd1efbc20e}
external_ip : "128.130.194.8"
external_mac : "fa:16:3e:f1:08:95"
external_port_range : ""
gateway_port : []
logical_ip : "192.168.100.213"
logical_port : "f2791464-91b6-4ace-8255-5e4aaf9785e0"
options : {}
type : dnat_and_snat
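
For illustration, the check being made by eye above (is `external_mac` empty or set?) can be scripted. This is a minimal sketch, not part of Neutron or OVN: the helper names `parse_nat_record` and `is_distributed` are hypothetical, and it assumes a single record in the `column : value` layout printed by `ovn-nbctl find NAT` above.

```python
def parse_nat_record(text):
    """Parse one `ovn-nbctl find NAT` record into a dict of raw strings."""
    record = {}
    for line in text.splitlines():
        # Columns are printed as "name : value". Split on the first " : "
        # only, because values such as MAC addresses contain colons.
        name, sep, value = line.partition(" : ")
        if sep:
            record[name.strip()] = value.strip()
    return record


def is_distributed(record):
    """True when external_mac is set, i.e. not the empty set "[]"."""
    mac = record.get("external_mac", "[]")
    return mac not in ("[]", "")


# Sample values copied from the two outputs above.
before = 'external_ip : "128.130.194.85"\nexternal_mac : []\n'
after = 'external_ip : "128.130.194.8"\nexternal_mac : "fa:16:3e:f1:08:95"\n'
print(is_distributed(parse_nat_record(before)))  # False
print(is_distributed(parse_nat_record(after)))   # True
```

An entry with `external_mac` set is handled in a distributed way, which is the state reported as broken for the load balancer FIP.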

Here are the debug log entries for this FIP when neutron restarts:

2023-11-07 12:46:26.165 28 DEBUG ovsdbapp.backend.ovs_idl.event [req-4ac48a1f-12b8-4d33-a05a-5ca856fe1895 - - - - -] Matched CREATE: LogicalSwitchPortCreateDownEvent(events=('create',), table='Logical_Switch_Port', conditions=(('up', '=', False),), old_conditions=None), priority=20 to row=Logical_Switch_Port(port_security=['fa:16:3e:54:5e:8c 192.168.3.236'], addresses=[], type=, dhcpv4_options=[<ovs.db.idl.Row object at 0x7f1a36a172b0>], name=a803471b-0254-4f66-911e-03a143c874b8, up=[False], options={'requested-chassis': ''}, ha_chassis_group=[], external_ids={'neutron:cidrs': '192.168.3.236/24', 'neutron:device_id': '', 'neutron:device_owner': '', 'neutron:network_name': 'neutron-eb3ed2db-3c79-4e78-92ac-5653f9abb094', 'neutron:port_fip': '128.130.194.113', 'neutron:port_name': 'ovn-lb-vip-c60d029a-7625-4a96-837d-46ba95f56507', 'neutron:project_id': 'ea03412a87bc426eae8b7badd15aa7a8', 'neutron:revision_number': '1', 'neutron:security_group_ids': 'f558ec6a-fcb7-4297-8737-11a1507099b0', 'neutron:subnet_pool_addr_scope4': '', 'neutron:subnet_pool_addr_scope6': ''}, dynamic_addresses=[], tag=[], parent_name=[], mirror_rules=[], tag_request=[], enabled=[True], dhcpv6_options=[]) old= matches /var/lib/kolla/venv/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/event.py:43

2023-11-07 12:46:26.209 28 DEBUG ovsdbapp.backend.ovs_idl.event [req-4ac48a1f-12b8-4d33-a05a-5ca856fe1895 - - - - -] Matched CREATE: FIPAddDeleteEvent(events=('create', 'delete'), table='NAT', conditions=(('type', '=', 'dnat_and_snat'),), old_conditions=None), priority=20 to row=NAT(external_ids={'neutron:fip_external_mac': 'fa:16:3e:17:d1:2c', 'neutron:fip_id': '840d24da-368d-4851-9b19-131c8f882f81', 'neutron:fip_network_id': '1f115e41-cc7f-4bb5-8493-995d035277dc', 'neutron:fip_port_id': 'a803471b-0254-4f66-911e-03a143c874b8', 'neutron:revision_number': '8', 'neutron:router_name': 'neutron-83f83d78-b4a5-4fe0-8e32-352f6d941583'}, external_ip=128.130.194.113, allowed_ext_ips=[], external_port_range=, exempted_ext_ips=[], logical_ip=192.168.3.236, type=dnat_and_snat, external_mac=['fa:16:3e:17:d1:2c'], options={}, logical_port=['a803471b-0254-4f66-911e-03a143c874b8'], gateway_port=[]) old= matches /var/lib/kolla/venv/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/event.py:43

2023-11-07 12:46:26.225 31 DEBUG ovsdbapp.backend.ovs_idl.event [req-985a5a2e-6838-45c1-9879-aa8bc0732d00 - - - - -] Matched CREATE: FIPAddDeleteEvent(events=('create', 'delete'), table='NAT', conditions=(('type', '=', 'dnat_and_snat'),), old_conditions=None), priority=20 to row=NAT(external_ids={'neutron:fip_external_mac': 'fa:16:3e:17:d1:2c', 'neutron:fip_id': '840d24da-368d-4851-9b19-131c8f882f81', 'neutron:fip_network_id': '1f115e41-cc7f-4bb5-8493-995d035277dc', 'neutron:fip_port_id': 'a803471b-0254-4f66-911e-03a143c874b8', 'neutron:revision_number': '8', 'neutron:router_name': 'neutron-83f83d78-b4a5-4fe0-8e32-352f6d941583'}, external_ip=128.130.194.113, allowed_ext_ips=[], external_port_range=, exempted_ext_ips=[], logical_ip=192.168.3.236, type=dnat_and_snat, external_mac=['fa:16:3e:17:d1:2c'], options={}, logical_port=['a803471b-0254-4f66-911e-03a143c874b8'], gateway_port=[]) old= matches /var/lib/kolla/venv/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/event.py:43

2023-11-07 12:46:42.564 31 DEBUG oslo_concurrency.processutils [req-e1654d04-1e37-44e2-a10b-16fc937ba532 - - - - -] Running cmd (subprocess): ovsdb-client transact tcp:10.20.1.11:6642,tcp:10.20.1.12:6642,tcp:10.20.1.13:6642 --timeout 180 ["OVN_Southbound", {"op": "delete", "table": "MAC_Binding", "where": [["ip", "==", "128.130.194.113"]]}] execute /var/lib/kolla/venv/lib/python3.9/site-packages/oslo_concurrency/processutils.py:384

2023-11-07 12:46:42.575 31 DEBUG oslo_concurrency.processutils [req-e1654d04-1e37-44e2-a10b-16fc937ba532 - - - - -] CMD "ovsdb-client transact tcp:10.20.1.11:6642,tcp:10.20.1.12:6642,tcp:10.20.1.13:6642 --timeout 180 ["OVN_Southbound", {"op": "delete", "table": "MAC_Binding", "where": [["ip", "==", "128.130.194.113"]]}]" returned: 0 in 0.011s execute /var/lib/kolla/venv/lib/python3.9/site-packages/oslo_concurrency/processutils.py:422

2023-11-07 12:46:49.623 28 DEBUG oslo_concurrency.processutils [req-c8da7baa-f23d-4a4d-aa5f-307ac4957ee6 - - - - -] Running cmd (subprocess): ovsdb-client transact tcp:10.20.1.11:6642,tcp:10.20.1.12:6642,tcp:10.20.1.13:6642 --timeout 180 ["OVN_Southbound", {"op": "delete", "table": "MAC_Binding", "where": [["ip", "==", "128.130.194.113"]]}] execute /var/lib/kolla/venv/lib/python3.9/site-packages/oslo_concurrency/processutils.py:384

2023-11-07 12:46:49.633 28 DEBUG oslo_concurrency.processutils [req-c8da7baa-f23d-4a4d-aa5f-307ac4957ee6 - - - - -] CMD "ovsdb-client transact tcp:10.20.1.11:6642,tcp:10.20.1.12:6642,tcp:10.20.1.13:6642 --timeout 180 ["OVN_Southbound", {"op": "delete", "table": "MAC_Binding", "where": [["ip", "==", "128.130.194.113"]]}]" returned: 0 in 0.010s execute /var/lib/kolla/venv/lib/python3.9/site-packages/oslo_concurrency/processutils.py:422

Bartosz Bezak (bbezak)
description: updated
Matt Crees (mattcrees) wrote :

We've also found this to affect loadbalancers with the Amphora provider.
Again, using DVR FIP.

Bartosz Bezak (bbezak) wrote :

https://review.opendev.org/c/openstack/neutron/+/887625 - maybe this change altered behaviour for unbound ports (vip LB fip?)

Matt Crees (mattcrees) wrote :

I've tested and found that reverting the patch Bartosz has linked does resolve this issue.

Lucas Alvares Gomes (lucasagomes) wrote :

@Matt @Bartosz,

Thanks for the investigation. Are you working on it? Are you going to propose a revert of that patch?

Changed in neutron:
importance: Undecided → High
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/900668

Luis Tomas (luis5tb) wrote :

The pasted outputs look wrong: the IPs/FIPs before and after the restart are different, and so is the router in the external_ids.

Bartosz Bezak (bbezak) wrote :

You're right, Luis. I made a copy-paste mistake. Treat those outputs as an example only; the problem does exist.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/900647

Changed in neutron:
status: Confirmed → In Progress
Luis Tomas (luis5tb) wrote :

I've created this instead as a solution for this issue: https://review.opendev.org/c/openstack/neutron/+/900647

It would be great if you could try it or review it and confirm that it works for you.

Changed in neutron:
assignee: nobody → Luis Tomas (luis5tb)
tags: added: ovn ovn-octavia-provider
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by "Bartosz Bezak <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/900668
Reason: https://review.opendev.org/c/openstack/neutron/+/900647

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900647
Committed: https://opendev.org/openstack/neutron/commit/f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a
Submitter: "Zuul (22348)"
Branch: master

commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
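
The rule described in this commit message can be sketched as a small decision function. This is only an illustration under stated assumptions, not the actual Neutron code: the function name `desired_external_mac` and its parameters are hypothetical, and the `distributed_fip_enabled` flag stands in for the "DVR FIP is enabled" setting mentioned in the bug description.

```python
def desired_external_mac(fip_mac, port_is_up, distributed_fip_enabled):
    """Return the MAC to write to NAT.external_mac, or None to leave it unset.

    Hypothetical helper: leaving the column unset keeps the floating IP
    centralized, which is what the commit message says OVN LB FIPs need.
    """
    if not distributed_fip_enabled:
        # Distributed FIPs are off, so no entry gets a MAC.
        return None
    if not port_is_up:
        # The commit message attributes the bug to the port being down,
        # so the MAC is only re-added when the port is UP.
        return None
    return fip_mac


# A down port (such as the load balancer VIP port in the logs) stays centralized.
print(desired_external_mac("fa:16:3e:17:d1:2c", False, True))  # None
# A port that is UP keeps its distributed FIP.
print(desired_external_mac("fa:16:3e:17:d1:2c", True, True))   # fa:16:3e:17:d1:2c
```

The design point is that the port state, not just the global distributed-FIP setting, decides whether the MAC is written during a restart.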

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/900881

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/900882

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/900883

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/900885

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/900886

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900882
Committed: https://opendev.org/openstack/neutron/commit/fb99b7bfac17790816db9bcd073cea90b897cebc
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit fb99b7bfac17790816db9bcd073cea90b897cebc
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900883
Committed: https://opendev.org/openstack/neutron/commit/15d5db0d6e0a563511f13bc6b014bf07c7c8db38
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 15d5db0d6e0a563511f13bc6b014bf07c7c8db38
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)

tags: added: in-stable-zed
tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900885
Committed: https://opendev.org/openstack/neutron/commit/bef1f7a4861a1d19e3a9003ffed303dd47b70707
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit bef1f7a4861a1d19e3a9003ffed303dd47b70707
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/900982

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900881
Committed: https://opendev.org/openstack/neutron/commit/c03d76a41db2e4dcf0beb829af01762307ecadaa
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit c03d76a41db2e4dcf0beb829af01762307ecadaa
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900982
Committed: https://opendev.org/openstack/neutron/commit/86aa8a3a1031cbc6476399308a8f6344e9383652
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 86aa8a3a1031cbc6476399308a8f6344e9383652
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)
    (cherry picked from commit bef1f7a4861a1d19e3a9003ffed303dd47b70707)

tags: added: in-stable-xena
tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900886
Committed: https://opendev.org/openstack/neutron/commit/88245f8db18ed5ff72787f3e2e4d146e113361e4
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 88245f8db18ed5ff72787f3e2e4d146e113361e4
Author: Luis Tomas Bolivar <email address hidden>
Date: Mon Nov 13 16:42:51 2023 +0100

    Ensure ovn loadbalancer FIPs are centralized upon neutron restarts

    When neutron server restarts the mac address for NAT entries related
    to ovn-lb FIPs gets re-added, distributing the traffic that should
    be centralized and therefore breaking the connectivity. This happens
    due to the port being down. This patch is ensuring the MAC entry
    is only being readded in case the port is UP

    Closes-Bug: #2042938
    Change-Id: I6203009750a4e589eeb808f842cb522d61476179
    (cherry picked from commit f2a3020cf0a46dbd896c5f7b4b4f6643d32a6b4a)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.1.0

This issue was fixed in the openstack/neutron 22.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.1.0

This issue was fixed in the openstack/neutron 23.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.5.0

This issue was fixed in the openstack/neutron 20.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.2.0

This issue was fixed in the openstack/neutron 21.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 24.0.0.0b1

This issue was fixed in the openstack/neutron 24.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom

This issue was fixed in the openstack/neutron wallaby-eom release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron xena-eom

This issue was fixed in the openstack/neutron xena-eom release.
