ovn-octavia-provider: incorrect router association in NB when network is linked to more than 1 router

Bug #1949059 reported by Gabriel Barazer
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Slawek Kaplonski
Milestone: (none)

Bug Description

LB does not work when creating an Octavia LB with OVN provider in a network containing multiple router interfaces.

When creating a load balancer using the OVN provider, the OVN Octavia driver agent creates a Load_Balancer entry and adds several external IDs, one of them being the Logical Router ID (lr_ref), so that the OVN LB works properly with an associated floating IP. This lr_ref association is repeated multiple times during the lifecycle of an LB: at create time (helper.py:lb_create) and when adding a listener, pool or pool member.
During LB creation, the driver agent uses helper.py:_find_lr_of_ls to find the LR associated with every LS linked to the LB, and adds the LR IDs to the lr_ref external ID field in OVN. The linked LSes are the vip_network and the LSes containing the pool members.

The problem is that _find_lr_of_ls incorrectly assumes there is only one router port: it returns the LR associated with the FIRST port of type 'router' in the whole LS. This causes unexpected behavior, because the first port can be the router port of the default gateway router, but it can just as well be any other router port with an interface in that LS.
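
The first-match behavior described above can be illustrated with a minimal sketch. This is not the actual helper.py code: the Port class, its fields and the function name are hypothetical stand-ins for OVN Logical_Switch_Port rows, used only to show why the result depends on port ordering.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Port:
    """Hypothetical stand-in for a port on a Logical Switch."""
    name: str
    type: str                     # 'router' for router ports, '' otherwise
    router: Optional[str] = None  # name of the LR behind a router port
    ip: Optional[str] = None


def find_lr_first_match(ports: List[Port]) -> Optional[str]:
    """Return the LR of the first router port found (the flawed approach)."""
    for port in ports:
        if port.type == "router":
            return port.router
    return None


# ls1 has two router ports; lr1 (192.168.1.1) is the default gateway,
# but lr2's port happens to come first in the list.
ls1_ports = [
    Port("vm1", "", ip="192.168.1.10"),
    Port("lrp-lr2", "router", router="lr2", ip="192.168.1.2"),
    Port("lrp-lr1", "router", router="lr1", ip="192.168.1.1"),
]

chosen = find_lr_first_match(ls1_ports)
print(chosen)  # the non-gateway router is picked because its port is listed first
```

With the ports listed in the opposite order the same function returns lr1, which is why the symptom can appear or disappear depending on the order in which router interfaces are listed.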

The use case for having multiple router interfaces in one LS is to be able to route between other internal networks without NAT, using static routes installed in the neutron default router.
This is a standard way of routing between internal networks in neutron and it works perfectly with OVN.
For a quick example, say we have an LS "ls1" containing the 192.168.1.0/24 network, attached to its router "lr1". We want all the traffic to go to the default gateway (192.168.1.1) and be SNATted by going through the public interface of lr1. For traffic going to 192.168.2.0/24, we want it to go to router "lr2". For this to happen, we add an interface from lr2 (192.168.1.2) to the ls1 network, and add a static route in lr1 (route: 192.168.2.0/24 next-hop 192.168.1.2). The network ls1 now has two routers attached.
We cannot do the same by directly interconnecting multiple routers (neutron does not support router-to-router links), and we cannot use a dedicated interconnect neutron network either, because flows routed by lr1 to lr2 through a dedicated network cannot then be SNATted by lr2. This behavior is outside the present topic; it only demonstrates that there is no workaround for routing traffic the way neutron allows without two routers each having an interface in the initial LS.
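
To make the ls1 example concrete, here is a small sketch of how lr1 would pick a next hop under longest-prefix matching. The route table, the labels and the next_hop function are illustrative assumptions built from the addresses in the example; they do not reflect how neutron or OVN store routes internally.

```python
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical view of lr1's routes, using the addresses from the example.
ROUTES_LR1: List[Tuple[str, str]] = [
    ("0.0.0.0/0", "external gateway (SNAT)"),
    ("192.168.1.0/24", "directly connected (ls1)"),
    ("192.168.2.0/24", "192.168.1.2 (lr2)"),
]


def next_hop(dest: str, routes: List[Tuple[str, str]]) -> Optional[str]:
    """Pick the route with the longest matching prefix for dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


for dest in ("192.168.2.7", "192.168.1.10", "8.8.8.8"):
    print(dest, "->", next_hop(dest, ROUTES_LR1))
```

Traffic for 192.168.2.0/24 leaves through lr2's interface in ls1, while everything else follows the default route, so both routers legitimately own a port in the same LS.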

The correct logic for helper.py:_find_lr_of_ls would be to first gather all ports with type=router, then check whether each port's IP matches the default gateway specified in the neutron subnet definition. If it matches, return the associated LR. Another way to make sure that the LB works through all the routers attached to an LS would be to add the lr_refs of all routers found in the LS (and have an event listening for router ports added to the LS after LB creation).
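
Both proposals can be sketched as follows. This is a rough illustration under assumptions, not the code that was eventually merged: RouterPort, find_lr_by_gateway and find_all_lrs are hypothetical names, and the gateway IP is passed in directly rather than looked up from the neutron subnet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RouterPort:
    """Hypothetical stand-in for a port on a Logical Switch."""
    name: str
    type: str                     # 'router' for router ports, '' otherwise
    router: Optional[str] = None  # name of the LR behind a router port
    ip: Optional[str] = None


def find_lr_by_gateway(ports: List[RouterPort],
                       gateway_ip: str) -> Optional[str]:
    """Return the LR whose router port IP equals the subnet gateway IP."""
    for port in ports:
        if port.type == "router" and port.ip == gateway_ip:
            return port.router
    return None


def find_all_lrs(ports: List[RouterPort]) -> List[str]:
    """Alternative: collect every LR that has a router port in the LS."""
    return [p.router for p in ports if p.type == "router" and p.router]


ls1_ports = [
    RouterPort("vm1", "", ip="192.168.1.10"),
    RouterPort("lrp-lr2", "router", router="lr2", ip="192.168.1.2"),
    RouterPort("lrp-lr1", "router", router="lr1", ip="192.168.1.1"),
]

print(find_lr_by_gateway(ls1_ports, "192.168.1.1"))
print(find_all_lrs(ls1_ports))
```

The gateway check gives a single deterministic answer regardless of port order, whereas collecting all routers would also need the event handling mentioned above to stay current when interfaces are added later.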

Tested on latest Xena build.

Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
tags: added: ovn
Brian Haley (brian-haley) wrote :

Looks like a bug and the correct fix. The only problem I see is that the ovn-octavia-provider isn't currently being maintained (sorry, I'm working on other things). Don't know if you're using a distro where you can also file a bug to get some attention on this. Good luck!

Gabriel Barazer (gabriel-h) wrote :

Hi,

You got me worried, as I intend to use the OVN provider driver in production. What do you mean by "isn't currently being maintained"? Is there a risk that the project could die or be removed from OpenStack? Or is this situation temporary?

Regarding the bug, I can probably try and fix it, but I'm pretty new to bug submission and the whole OpenStack governance for publishing patches. Do I simply attach the patch here? Is there a way I can send a pull request? Do I need to do some paperwork?

Thanks!

Brian Haley (brian-haley) wrote :

> You got me worried, as I intend to use the OVN provider driver in production. What do you mean by
> "isn't currently being maintained" ? Is it a risk that the project could die or be removed from
> OpenStack ? Or is this situation temporary ?

I can only answer the first question: there has not been a change since August 27th, when I stepped back from the project, and no one seems to have stepped forward. It's sad.

For submitting patches:

https://www.openstack.org/blog/submit-your-first-openstack-patch-in-three-steps/

I think it's easier than a pull request.

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (master)
Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/816868
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/ee800e165469dec8c492893c5ba64d4be7cc3170
Submitter: "Zuul (22348)"
Branch: master

commit ee800e165469dec8c492893c5ba64d4be7cc3170
Author: Slawek Kaplonski <email address hidden>
Date: Fri Nov 5 18:48:34 2021 +0100

    Check gateway IP while looking for LR plugged to LS

    When LB or member is created, driver looks for the Logical Router which
    is plugged to the Logical Switch (Neutron network). As there can be more
    than one router connected to one network, we should always store in
    Loadbalancer's external_ids ID of the router which is used as default
    gateway for the VIP's subnet.
    This patch changes _find_lr_of_ls() method of the
    ovn_octavia_provider.helper.OvnProviderHelper class so it checks
    subnet's gateway IP while looks for the logical router.

    Closes-bug: #1949059
    Change-Id: I771f39ba190623857af208e23d8228c4bdc0db20

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/xena)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/826032

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/825947

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/825947
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/aecd858ddc9742538e7b033da5634311bd6248bd
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit aecd858ddc9742538e7b033da5634311bd6248bd
Author: Slawek Kaplonski <email address hidden>
Date: Fri Nov 5 18:48:34 2021 +0100

    Check gateway IP while looking for LR plugged to LS

    When LB or member is created, driver looks for the Logical Router which
    is plugged to the Logical Switch (Neutron network). As there can be more
    than one router connected to one network, we should always store in
    Loadbalancer's external_ids ID of the router which is used as default
    gateway for the VIP's subnet.
    This patch changes _find_lr_of_ls() method of the
    ovn_octavia_provider.helper.OvnProviderHelper class so it checks
    subnet's gateway IP while looks for the logical router.

    Conflicts:
        ovn_octavia_provider/tests/unit/test_helper.py

    Closes-bug: #1949059
    Change-Id: I771f39ba190623857af208e23d8228c4bdc0db20
    (cherry picked from commit ee800e165469dec8c492893c5ba64d4be7cc3170)
    (cherry picked from commit 63a6a212a55f27981aba2bab06be00f7560e8bcf)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/826032
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/a34849182fffd73e985d68d880d36a1bb78fa150
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit a34849182fffd73e985d68d880d36a1bb78fa150
Author: Slawek Kaplonski <email address hidden>
Date: Fri Nov 5 18:48:34 2021 +0100

    Check gateway IP while looking for LR plugged to LS

    When LB or member is created, driver looks for the Logical Router which
    is plugged to the Logical Switch (Neutron network). As there can be more
    than one router connected to one network, we should always store in
    Loadbalancer's external_ids ID of the router which is used as default
    gateway for the VIP's subnet.
    This patch changes _find_lr_of_ls() method of the
    ovn_octavia_provider.helper.OvnProviderHelper class so it checks
    subnet's gateway IP while looks for the logical router.

    Conflicts:
        ovn_octavia_provider/tests/unit/test_helper.py

    Closes-bug: #1949059
    Change-Id: I771f39ba190623857af208e23d8228c4bdc0db20
    (cherry picked from commit ee800e165469dec8c492893c5ba64d4be7cc3170)
    (cherry picked from commit 63a6a212a55f27981aba2bab06be00f7560e8bcf)

tags: added: in-stable-wallaby
tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/825946
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/63a6a212a55f27981aba2bab06be00f7560e8bcf
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 63a6a212a55f27981aba2bab06be00f7560e8bcf
Author: Slawek Kaplonski <email address hidden>
Date: Fri Nov 5 18:48:34 2021 +0100

    Check gateway IP while looking for LR plugged to LS

    When LB or member is created, driver looks for the Logical Router which
    is plugged to the Logical Switch (Neutron network). As there can be more
    than one router connected to one network, we should always store in
    Loadbalancer's external_ids ID of the router which is used as default
    gateway for the VIP's subnet.
    This patch changes _find_lr_of_ls() method of the
    ovn_octavia_provider.helper.OvnProviderHelper class so it checks
    subnet's gateway IP while looks for the logical router.

    Closes-bug: #1949059
    Change-Id: I771f39ba190623857af208e23d8228c4bdc0db20
    (cherry picked from commit ee800e165469dec8c492893c5ba64d4be7cc3170)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829536

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829575

OpenStack Infra (hudson-openstack) wrote : Change abandoned on ovn-octavia-provider (stable/ussuri)

Change abandoned by "Luis Tomas Bolivar <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829536
Reason: abandon in favor of https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829575, which has the correct change-id

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829575
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/2e47080c4d51632213ba79fa589a354c9ee04af2
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 2e47080c4d51632213ba79fa589a354c9ee04af2
Author: Luis Tomas Bolivar <email address hidden>
Date: Wed Feb 16 13:59:16 2022 +0100

    Check gateway IP while looking for LR plugged to LS

    When LB or member is created, driver looks for the Logical Router which
    is plugged to the Logical Switch (Neutron network). As there can be more
    than one router connected to one network, we should always store in
    Loadbalancer's external_ids ID of the router which is used as default
    gateway for the VIP's subnet.
    This patch changes _find_lr_of_ls() method of the
    ovn_octavia_provider.helper.OvnProviderHelper class so it checks
    subnet's gateway IP while looks for the logical router.

    Closes-bug: #1949059
    (manually cherry picked from commit ee800e165469dec8c492893c5ba64d4be7cc3170)

    Change-Id: I771f39ba190623857af208e23d8228c4bdc0db20

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 2.0.0.0rc1

This issue was fixed in the openstack/ovn-octavia-provider 2.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.0.1

This issue was fixed in the openstack/ovn-octavia-provider 1.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn train-eol

This issue was fixed in the openstack/networking-ovn train-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.2.0

This issue was fixed in the openstack/ovn-octavia-provider 1.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider ussuri-eol

This issue was fixed in the openstack/ovn-octavia-provider ussuri-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider victoria-eom

This issue was fixed in the openstack/ovn-octavia-provider victoria-eom release.
