[OVN] Router Availability Zones doesn't work with segmented networks

Bug #1939144 reported by Lucas Alvares Gomes
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Lucas Alvares Gomes

Bug Description

Hi,

Looking at the external networks from the edge environment I see that
these fields are None:
| provider:network_type | None |
| provider:physical_network | None |

Instead we have this:

| segments | [{'provider:network_type': 'flat',
'provider:physical_network': 'leaf0', 'provider:segmentation_id':
None}, {'provider:network_type': 'flat', 'provider:physical_network':
'leaf1', 'provider:segmentation_id': None}, {'provider:network_type':
'flat', 'provider:physical_network': 'leaf2',
'provider:segmentation_id': None}] |

When building the list of candidate nodes on which to schedule the
gateway router ports, the ML2/OVN driver checks whether each node has
the network's physical network (physnet) configured, see [0][1]. To do
that, it reads the top-level "provider:network_type" and
"provider:physical_network" fields (see [1]).

Because those fields are None for a segmented network, the physnet
attribute is None as well (see [0]). When it reaches the
get_candidates_for_scheduling() method [2], the list of candidates
comes back empty because no gateway node matches this physnet. This is
also the method where the candidates are filtered based on the AZs.
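
To make the failure concrete, here is a minimal, self-contained sketch. It is not the Neutron code linked in [0][1][2]: the helpers physnet_of() and candidates_for_physnet() and the chassis-to-physnet mapping are hypothetical stand-ins, and the sketch assumes the physnet is only read from the top-level fields for flat or VLAN networks. The network dictionary is shaped like the output above.

```python
# Hypothetical stand-ins for illustration; not the real Neutron/OVN helpers.

def physnet_of(network):
    """Read the physnet from the top-level provider fields only."""
    if network.get("provider:network_type") in ("flat", "vlan"):
        return network.get("provider:physical_network")
    return None


def candidates_for_physnet(chassis_physnets, physnet):
    """Keep only the chassis that are mapped to the given physnet."""
    return [name for name, physnets in chassis_physnets.items()
            if physnet in physnets]


# Shaped like the network output shown in this report.
segmented_network = {
    "provider:network_type": None,
    "provider:physical_network": None,
    "segments": [
        {"provider:network_type": "flat",
         "provider:physical_network": leaf,
         "provider:segmentation_id": None}
        for leaf in ("leaf0", "leaf1", "leaf2")
    ],
}

# Assumed mapping: one gateway chassis per leaf.
chassis_physnets = {"gw-0": ["leaf0"], "gw-1": ["leaf1"], "gw-2": ["leaf2"]}

physnet = physnet_of(segmented_network)
candidates = candidates_for_physnet(chassis_physnets, physnet)
print(physnet, candidates)
```

In this sketch the physnet comes out as None and no chassis matches it, so the candidate list is empty even though every chassis serves one of the segments.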

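The reason it does not fail, and the gw port still gets scheduled to
some gw node, is that once it reaches the scheduler code, an empty list
of candidates makes the scheduler fetch a list of all gw chassis
without any consideration of the physnets [3] and use that as the
candidates. Since the AZ filtering happens in
get_candidates_for_scheduling(), that fallback list is not restricted
by AZ either, which is why the router AZ hints end up being ignored.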
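
The fallback described above can be sketched as follows. This is an illustration only, not the scheduler code at [3]: chassis_to_schedule_on(), the chassis names, and the one-AZ-per-chassis layout are assumptions made for the example.

```python
# Hypothetical sketch of the fallback; names are illustrative only.

def chassis_to_schedule_on(candidates, all_gw_chassis):
    """Use the given candidates, or every gateway chassis if there are none."""
    return list(candidates) if candidates else list(all_gw_chassis)


# Assumed layout: each gateway chassis belongs to one AZ.
chassis_azs = {"gw-0": "az-0", "gw-1": "az-1", "gw-2": "az-2"}
router_az_hints = ["az-0"]

# The empty list is what the candidate selection returns for a None physnet.
pool = chassis_to_schedule_on([], chassis_azs)
outside_hints = [c for c in pool if chassis_azs[c] not in router_az_hints]
print(pool, outside_hints)
```

With an empty candidate list, the pool contains every gateway chassis, including those outside the AZs requested for the router.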

As you can see, the code is messy and a future refactor may be needed.
For this specific problem I would recommend a simpler fix: when the
physnet is None, get_candidates_for_scheduling() would consider all GW
chassis regardless of physnet and then filter those chassis based on
their AZ. Such a fix is small enough to be backportable.
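
A rough sketch of that proposal, under stated assumptions: candidates_with_fix() is a hypothetical function, not the real get_candidates_for_scheduling() signature or the merged patch, and the chassis dictionary layout is invented for the example. Every entry is assumed to be a gateway chassis.

```python
# Hypothetical sketch of the proposed behaviour; not the merged patch.

def candidates_with_fix(chassis, physnet=None, az_hints=None):
    """Select gateway chassis, ignoring the physnet when it is None.

    chassis maps a chassis name to {"physnets": [...], "azs": [...]}.
    """
    if physnet is None:
        # Segmented network case: start from every gateway chassis.
        candidates = list(chassis)
    else:
        candidates = [name for name, info in chassis.items()
                      if physnet in info["physnets"]]
    if az_hints:
        # Keep only the chassis that share at least one AZ with the router.
        wanted = set(az_hints)
        candidates = [name for name in candidates
                      if wanted & set(chassis[name]["azs"])]
    return candidates


gw_chassis = {
    "gw-0": {"physnets": ["leaf0"], "azs": ["az-0"]},
    "gw-1": {"physnets": ["leaf1"], "azs": ["az-1"]},
    "gw-2": {"physnets": ["leaf2"], "azs": ["az-0", "az-2"]},
}

segmented = candidates_with_fix(gw_chassis, physnet=None, az_hints=["az-0"])
flat = candidates_with_fix(gw_chassis, physnet="leaf1", az_hints=["az-1"])
print(segmented, flat)
```

The point of the design is that the AZ filter runs in both branches, so a None physnet no longer produces an empty list that bypasses it.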

[0] https://github.com/openstack/neutron/blob/b7befc98118c270877b42e94f9cb6f7ccad0b072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1370
[1] https://github.com/openstack/neutron/blob/b7befc98118c270877b42e94f9cb6f7ccad0b072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1314-L1317
[2] https://github.com/openstack/neutron/blob/b7befc98118c270877b42e94f9cb6f7ccad0b072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1291-L1296
[3] https://github.com/openstack/neutron/blob/b7befc98118c270877b42e94f9cb6f7ccad0b072/neutron/scheduler/l3_ovn_scheduler.py#L62

Changed in neutron:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
tags: added: ovn
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/803759

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/803920

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/803921

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/803922

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/803759
Committed: https://opendev.org/openstack/neutron/commit/8ac9e2fe6d37ba4608cf141c78017524a5cae1de
Submitter: "Zuul (22348)"
Branch: master

commit 8ac9e2fe6d37ba4608cf141c78017524a5cae1de
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri Aug 6 13:32:08 2021 +0100

    [OVN] Fix Router Availability Zones for segmented networks

    This patch changes the get_candidates_for_scheduling() method to also
    consider all gateway chassis as potential candidates (limited by
    Availability Zones) in case physnet parameter is empty (as for the
    segmented networks case).

    This patch is a simpler/backportable fix for the segmented networks +
    Router AZs use case. In the future we should consider refactoring the
    code responsible for scheduling the gateway router ports, a more detailed
    explanation of what is happening/needed can be found at LP #1939144.

    Change-Id: I8dc5336c6e2acd0b0a2cad0e80eee91280b9f945
    Closes-Bug: #1939144
    Signed-off-by: Lucas Alvares Gomes <email address hidden>

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/803920
Committed: https://opendev.org/openstack/neutron/commit/adabfc8674466971d2b367474869e86dd1f32980
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit adabfc8674466971d2b367474869e86dd1f32980
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri Aug 6 13:32:08 2021 +0100

    [OVN] Fix Router Availability Zones for segmented networks

    (cherry picked from commit 8ac9e2fe6d37ba4608cf141c78017524a5cae1de)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/803921
Committed: https://opendev.org/openstack/neutron/commit/539bfdff2f2232cf89044bc0392ff263be815a42
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 539bfdff2f2232cf89044bc0392ff263be815a42
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri Aug 6 13:32:08 2021 +0100

    [OVN] Fix Router Availability Zones for segmented networks

    (cherry picked from commit 8ac9e2fe6d37ba4608cf141c78017524a5cae1de)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/803922
Committed: https://opendev.org/openstack/neutron/commit/715ecb1a675b5271b1d4939fcfc7a85e0bf5f1e9
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 715ecb1a675b5271b1d4939fcfc7a85e0bf5f1e9
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri Aug 6 13:32:08 2021 +0100

    [OVN] Fix Router Availability Zones for segmented networks

    (cherry picked from commit 8ac9e2fe6d37ba4608cf141c78017524a5cae1de)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.4.1

This issue was fixed in the openstack/neutron 16.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.2.1

This issue was fixed in the openstack/neutron 17.2.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.1.1

This issue was fixed in the openstack/neutron 18.1.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.0.0.0rc1

This issue was fixed in the openstack/neutron 19.0.0.0rc1 release candidate.

tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn train-eol

This issue was fixed in the openstack/networking-ovn train-eol release.
