Wrong ip placement after live migration of instance

Bug #2049902 reported by Michel Nederlof
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| ovn-bgp-agent | Fix Released | Medium | Unassigned | |

Bug Description

When using the NB_ovn_bgp_driver, the placement of a floating IP after live migration is not determined correctly.

During investigation we found that the method responsible for determining the primary chassis looks primarily at the neutron:host_id field in external_ids, whereas the requested-chassis field in the options column should take precedence when deciding the placement of an LSP (logical switch port).

https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337623.html
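
The intended precedence can be sketched as follows. This is a minimal illustration, not the agent's actual code: the function name `pick_chassis` and the plain-dict inputs are hypothetical, and treating the first entry of a comma-separated requested-chassis value as the primary chassis is an assumption based on how OVN describes that option.

```python
from typing import Mapping, Optional


def pick_chassis(options: Mapping[str, str],
                 external_ids: Mapping[str, str]) -> Optional[str]:
    """Return the chassis an LSP should be placed on, or None if unknown.

    Hypothetical helper: options.requested-chassis wins over
    external_ids.neutron:host_id whenever it is set.
    """
    requested = options.get("requested-chassis", "")
    # Assumption: the value may be a comma-separated list and the first
    # entry is the primary chassis.
    candidates = [name.strip() for name in requested.split(",") if name.strip()]
    if candidates:
        return candidates[0]
    # Fall back to the value maintained by neutron.
    return external_ids.get("neutron:host_id") or None


# requested-chassis is preferred even when neutron:host_id disagrees.
print(pick_chassis({"requested-chassis": "compute2"},
                   {"neutron:host_id": "compute1"}))  # compute2
```

With an empty options column the helper falls back to neutron:host_id, which matches the LSP state before any migration has taken place.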

I'll add some examples of the live migration sequence in the comments.

Michel Nederlof (mnederlof) wrote :

Original LSP (before live migration):

```
_uuid : 0eb19cff-b062-4eee-96b2-8d5119fb3ecf
addresses : ["fa:16:3e:a9:a6:75 2001:db8:1200::148 198.51.100.220"]
dhcpv4_options : b2afab41-f805-4445-bd55-df9271355f40
dhcpv6_options : 5119edfb-ea51-4c4e-81e9-66b56c4f7388
dynamic_addresses : []
enabled : true
external_ids : {"neutron:cidrs"="2001:db8:1200::148/64 198.51.100.220/22", "neutron:device_id"="6a6db2eb-bcc6-4737-9e33-396780e17b96", "neutron:device_owner"="compute:DC1", "neutron:host_id"=compute1, "neutron:network_name"=neutron-64903804-9685-4fda-8014-03b9fa439923, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"=d26351fd860647b48930e3be286ff825, "neutron:revision_number"="25", "neutron:security_group_ids"="5ce5434f-2798-4c24-b35e-adcc3e6c4821", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
ha_chassis_group : []
mirror_rules : []
name : "b0d84030-d4b9-48d3-8915-29f518b52bb9"
options : {}
parent_name : []
port_security : ["fa:16:3e:a9:a6:75 2001:db8:1200::148 198.51.100.220"]
tag : []
tag_request : []
type : ""
up : true
```

Michel Nederlof (mnederlof) wrote :

Then, during live migration, this field changes:

options : {activation-strategy=rarp, requested-chassis="compute1,compute2"}

Afterwards, the following fields are updated:

external_ids : {"neutron:cidrs"="2001:db8:1200::148/64 198.51.100.220/22", "neutron:device_id"="6a6db2eb-bcc6-4737-9e33-396780e17b96", "neutron:device_owner"="compute:DC1", "neutron:host_id"=compute2, "neutron:network_name"=neutron-64903804-9685-4fda-8014-03b9fa439923, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"=d26351fd860647b48930e3be286ff825, "neutron:revision_number"="34", "neutron:security_group_ids"="5ce5434f-2798-4c24-b35e-adcc3e6c4821", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
options : {requested-chassis=compute2}
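
The three states above can be replayed in a short sketch to see which chassis would be selected at each stage. Everything here is illustrative: `split_requested_chassis` and `replay` are hypothetical names, the snapshots only carry the two fields that matter, and reading the first entry of requested-chassis as the primary chassis is an assumption.

```python
def split_requested_chassis(value):
    """Split a requested-chassis value into (primary, additional chassis)."""
    names = [n.strip() for n in value.split(",") if n.strip()]
    if not names:
        return None, []
    return names[0], names[1:]


# (stage, options column, external_ids neutron:host_id) as shown in the comments.
SNAPSHOTS = [
    ("before", {}, "compute1"),
    ("during", {"activation-strategy": "rarp",
                "requested-chassis": "compute1,compute2"}, "compute1"),
    ("after", {"requested-chassis": "compute2"}, "compute2"),
]


def replay(snapshots):
    """Return (stage, selected chassis, additional chassis) per snapshot."""
    result = []
    for stage, options, host_id in snapshots:
        primary, additional = split_requested_chassis(
            options.get("requested-chassis", ""))
        # Prefer requested-chassis; use neutron:host_id only as a fallback.
        result.append((stage, primary or host_id, additional))
    return result


for row in replay(SNAPSHOTS):
    print(row)
```

In this particular sequence both fields agree at every stage; the difference only shows up when neutron:host_id lags behind or is wrong, which is the case the requested-chassis preference is meant to cover.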

Changed in ovn-bgp-agent:
status: New → In Progress
Changed in ovn-bgp-agent:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-bgp-agent (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/906111
Committed: https://opendev.org/openstack/ovn-bgp-agent/commit/6e0d5766501ad28dabc84941f84a2edbc976d0c8
Submitter: "Zuul (22348)"
Branch: master

commit 6e0d5766501ad28dabc84941f84a2edbc976d0c8
Author: Michel Nederlof <email address hidden>
Date: Fri Jan 19 14:51:24 2024 +0100

    Fix event handling for LSP and prefer the options.requested-chassis info

    Since the requested-chassis supersedes the placement in external_ids
    (which is managed by neutron), we should preferably use that instead of the
    value set by neutron (which _could_ lag or be wrong in specific scenarios).

    Also update logic for FIP handling to make migrations more efficient.

    Closes-Bug: #2049902

    Change-Id: I7f73a1ba7956f22e58fdde383775e88bf72cba14

Changed in ovn-bgp-agent:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-bgp-agent (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/910305

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-bgp-agent (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/910305
Committed: https://opendev.org/openstack/ovn-bgp-agent/commit/16cfb301d1de00262067d73ee606401134aea2e9
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 16cfb301d1de00262067d73ee606401134aea2e9
Author: Michel Nederlof <email address hidden>
Date: Fri Jan 19 14:51:24 2024 +0100

    Fix event handling for LSP and prefer the options.requested-chassis info

    Since the requested-chassis supersedes the placement in external_ids
    (which is managed by neutron), we should preferably use that instead of the
    value set by neutron (which _could_ lag or be wrong in specific scenarios).

    Also update logic for FIP handling to make migrations more efficient.

    Closes-Bug: #2049902

    Change-Id: I7f73a1ba7956f22e58fdde383775e88bf72cba14
    (cherry picked from commit 6e0d5766501ad28dabc84941f84a2edbc976d0c8)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-bgp-agent 2.0.0.0rc1

This issue was fixed in the openstack/ovn-bgp-agent 2.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ovn-bgp-agent (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913654

OpenStack Infra (hudson-openstack) wrote : Related fix merged to ovn-bgp-agent (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913654
Committed: https://opendev.org/openstack/ovn-bgp-agent/commit/1bacff1dff825d267c390752fa21592c2e6f6588
Submitter: "Zuul (22348)"
Branch: master

commit 1bacff1dff825d267c390752fa21592c2e6f6588
Author: Michel Nederlof <email address hidden>
Date: Tue Mar 19 09:25:14 2024 +0000

    Fix placement of lsp when external_ids not in sync

    When options.requested-chassis is not in sync with
    external_ids.neutron:host_id it would pick both hosts, causing duplicate
    announcements from more than 1 host.

    This has been fixed in change 910305, but was left unchanged for the
    sync method, causing issues when the sync interval was re-evaluating all
    LSPs on the node.

    The code for determining the chassis of a port has been moved from the
    base_watcher to driver_utils so the logic for the event is the same as the
    logic when fetching the records from the northbound database.

    Related-bug: #2049902
    Change-Id: I545d6b41fd308eb56e5295657260718dc14868f7
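
The duplicate-announcement symptom described in this commit message can be reproduced with a small sketch. It is an illustrative reconstruction, not the code before or after the fix: `port_chassis`, `claims_either`, `claims_single` and `announcers` are hypothetical names, and the ports are modelled as plain dicts.

```python
def port_chassis(options, external_ids):
    """Single source of truth: requested-chassis first, then neutron:host_id."""
    requested = options.get("requested-chassis", "")
    names = [n.strip() for n in requested.split(",") if n.strip()]
    if names:
        return names[0]
    return external_ids.get("neutron:host_id")


def claims_either(local, options, external_ids):
    """Problematic check: a chassis claims the port if either field names it."""
    return local in (options.get("requested-chassis"),
                     external_ids.get("neutron:host_id"))


def claims_single(local, options, external_ids):
    """Check built on the shared helper: exactly one chassis can match."""
    return port_chassis(options, external_ids) == local


def announcers(check, chassis_names, options, external_ids):
    """Return the chassis that would announce the port under a given check."""
    return [c for c in chassis_names if check(c, options, external_ids)]


# Out-of-sync state: OVN already points at compute2, neutron still says compute1.
options = {"requested-chassis": "compute2"}
external_ids = {"neutron:host_id": "compute1"}
nodes = ["compute1", "compute2"]

print(announcers(claims_either, nodes, options, external_ids))  # both hosts
print(announcers(claims_single, nodes, options, external_ids))  # compute2 only
```

Calling one helper from both the event handler and the periodic sync is what keeps the two code paths from disagreeing about which host owns the port.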

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ovn-bgp-agent (stable/2024.1)

Related fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913508

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ovn-bgp-agent (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913509

OpenStack Infra (hudson-openstack) wrote : Related fix merged to ovn-bgp-agent (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913509
Committed: https://opendev.org/openstack/ovn-bgp-agent/commit/c9b50da5badea46281dad6b7aaf61c41394cf422
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit c9b50da5badea46281dad6b7aaf61c41394cf422
Author: Michel Nederlof <email address hidden>
Date: Tue Mar 19 09:25:14 2024 +0000

    Fix placement of lsp when external_ids not in sync

    When options.requested-chassis is not in sync with
    external_ids.neutron:host_id it would pick both hosts, causing duplicate
    announcements from more than 1 host.

    This has been fixed in change 910305, but was left unchanged for the
    sync method, causing issues when the sync interval was re-evaluating all
    LSPs on the node.

    The code for determining the chassis of a port has been moved from the
    base_watcher to driver_utils so the logic for the event is the same as the
    logic when fetching the records from the northbound database.

    Related-bug: #2049902
    Change-Id: I545d6b41fd308eb56e5295657260718dc14868f7
    (cherry picked from commit 1bacff1dff825d267c390752fa21592c2e6f6588)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to ovn-bgp-agent (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/913508
Committed: https://opendev.org/openstack/ovn-bgp-agent/commit/047d261cb7ea8f87cf33fdbce5a245106fcca335
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 047d261cb7ea8f87cf33fdbce5a245106fcca335
Author: Michel Nederlof <email address hidden>
Date: Tue Mar 19 09:25:14 2024 +0000

    Fix placement of lsp when external_ids not in sync

    When options.requested-chassis is not in sync with
    external_ids.neutron:host_id it would pick both hosts, causing duplicate
    announcements from more than 1 host.

    This has been fixed in change 910305, but was left unchanged for the
    sync method, causing issues when the sync interval was re-evaluating all
    LSPs on the node.

    The code for determining the chassis of a port has been moved from the
    base_watcher to driver_utils so the logic for the event is the same as the
    logic when fetching the records from the northbound database.

    Related-bug: #2049902
    Change-Id: I545d6b41fd308eb56e5295657260718dc14868f7
    (cherry picked from commit 1bacff1dff825d267c390752fa21592c2e6f6588)
