[OVN] L3 port scheduler, defaults to use all chassis; stop this behaviour

Bug #2019217 reported by Rodolfo Alonso
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Low
Assigned to: Rodolfo Alonso
Milestone: (none)

Bug Description

The OVN L3 port scheduler assigns the chassis for the router ports. It retrieves the list of chassis that are configured as gateway nodes (external_ids:ovn-cms-options=enable-chassis-as-gw). This list can also be filtered by availability zone; the scheduler filters out chassis that do not belong to the requested AZ.

If no chassis ("candidates") are returned [1], the scheduler assumes that "any hypervisor/chassis can host a router gateway port" (quoted from the patch [2] that introduced this functionality).

The scope of this patch is to revert this default behaviour. If no candidates are available, the L3 scheduler will fail and report the issue: the OVN router port won't be scheduled on any chassis, and the error will be written to the logs.

[1] https://opendev.org/openstack/neutron/src/commit/47d4ec4e99d5aae62656c88206eb6a77f70d4a8b/neutron/scheduler/l3_ovn_scheduler.py#L63
[2] https://review.opendev.org/c/openstack/networking-ovn/+/332434
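
For illustration only, the proposed behaviour could be sketched as below. This is not the neutron implementation: the `Chassis` class, `gateway_candidates` and `schedule_gateway` are hypothetical names, and the AZ match (a chassis qualifies if it shares at least one AZ with the request) is an assumption. The sketch only shows the intended rule: filter by the gateway option and by AZ, and when nothing is left, log a warning and schedule nowhere instead of falling back to every chassis.

```python
import logging
from dataclasses import dataclass, field

LOG = logging.getLogger(__name__)

# Option value named in the bug description (external_ids:ovn-cms-options).
GW_OPTION = "enable-chassis-as-gw"


@dataclass
class Chassis:
    """Hypothetical stand-in for a chassis record; not the real OVN row."""
    name: str
    cms_options: list = field(default_factory=list)
    availability_zones: list = field(default_factory=list)


def gateway_candidates(chassis_list, requested_azs=None):
    """Keep only chassis flagged as gateways, optionally filtered by AZ."""
    candidates = [c for c in chassis_list if GW_OPTION in c.cms_options]
    if requested_azs:
        wanted = set(requested_azs)
        # Assumption: sharing at least one AZ with the request is enough.
        candidates = [c for c in candidates
                      if wanted & set(c.availability_zones)]
    return candidates


def schedule_gateway(port_name, chassis_list, requested_azs=None):
    """Return the chassis names usable for the port, or [] if there are none."""
    candidates = gateway_candidates(chassis_list, requested_azs)
    if not candidates:
        # No fallback to "all chassis": report the problem and do not schedule.
        LOG.warning("Gateway port %s not scheduled: no gateway chassis "
                    "candidates available", port_name)
        return []
    return [c.name for c in candidates]
```

With one gateway chassis and one plain compute chassis in the list, only the gateway chassis is returned; if the requested AZ matches no gateway chassis, the result is an empty list and a warning is logged.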

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Low
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/884323

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/884323
Committed: https://opendev.org/openstack/neutron/commit/413044f253b6d434164e8a94dbeccec7b1b79ebe
Submitter: "Zuul (22348)"
Branch: master

commit 413044f253b6d434164e8a94dbeccec7b1b79ebe
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed May 24 07:15:35 2023 +0200

    [OVN] The L3 scheduler does not use all chassis by default

    Any OVN scheduler, inheriting from ``OVNGatewayScheduler``, calls
    ``_schedule_gateway`` to make the decision of in what chassis the
    router gateway should be located. Before this patch, if the list
    of candidates was empty, the scheduler used all available chassis
    as candidate list. This patch is removing this default behaviour.
    In a deployment, only those chassis marked explicitly with
    "ovn-cms-options=enable-chassis-as-gw" can be used as gateway
    chassis.

    After enabling this patch, any existing router gateway port will
    preserve the assigned chassis; any new router gateway will be
    scheduled only on the chassis configured as gateways.

    If a router gateway port cannot find any chassis to be scheduled,
    the "neutron-ovn-invalid-chassis" will be used instead and a
    warning message will be printed in the logs.

    Closes-Bug: #2019217
    Change-Id: If0f843463edfd7edc5c897cc098de31444f9d81b

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.0.0.0b3

This issue was fixed in the openstack/neutron 23.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/908325

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/908325
Committed: https://opendev.org/openstack/neutron/commit/d55c591ecde2f6cc4c2cea64fb21a75b6343cd5a
Submitter: "Zuul (22348)"
Branch: master

commit d55c591ecde2f6cc4c2cea64fb21a75b6343cd5a
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 7 15:45:54 2024 +0000

    [OVN] A LRP in an external tunnelled network has no chassis

    A logical router port in an external tunnelled network won't be
    scheduled in any chassis. A tunnelled network has no physical
    provider network associated thus the logical router port cannot
    be bound to a specific chassis.

    Related-Bug: #2019217
    Change-Id: I140c22899ea3b0240f8c30902fc2dc7055914a18

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/909191

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/909305

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/909191
Committed: https://opendev.org/openstack/neutron/commit/a68369b65a2be2602abee6fc4ddb4d4a637089e7
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit a68369b65a2be2602abee6fc4ddb4d4a637089e7
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 7 15:45:54 2024 +0000

    [OVN] A LRP in an external tunnelled network has no chassis

    A logical router port in an external tunnelled network won't be
    scheduled in any chassis. A tunnelled network has no physical
    provider network associated thus the logical router port cannot
    be bound to a specific chassis.

    Related-Bug: #2019217
    Change-Id: I140c22899ea3b0240f8c30902fc2dc7055914a18
    (cherry picked from commit d55c591ecde2f6cc4c2cea64fb21a75b6343cd5a)

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/910588

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/910588
Committed: https://opendev.org/openstack/neutron/commit/b34911a6c938187c8554b39dfd0591d1cca94e73
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit b34911a6c938187c8554b39dfd0591d1cca94e73
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 7 15:45:54 2024 +0000

    [OVN] A LRP in an external tunnelled network has no chassis

    A logical router port in an external tunnelled network won't be
    scheduled in any chassis. A tunnelled network has no physical
    provider network associated thus the logical router port cannot
    be bound to a specific chassis.

    Conflicts:
        neutron/tests/functional/services/ovn_l3/test_plugin.py

    Related-Bug: #2019217
    Change-Id: I140c22899ea3b0240f8c30902fc2dc7055914a18
    (cherry picked from commit d55c591ecde2f6cc4c2cea64fb21a75b6343cd5a)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/909305
Committed: https://opendev.org/openstack/neutron/commit/fa3223bb9d76c6f5ba1a12dd5a17cec6852be7df
Submitter: "Zuul (22348)"
Branch: master

commit fa3223bb9d76c6f5ba1a12dd5a17cec6852be7df
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Feb 16 19:24:28 2024 +0000

    [OVN] Remove OVN_GATEWAY_INVALID_CHASSIS artifact

    This artifact is no longer used in the "Logical_Router" registers (in
    the "options" field) to mark this "Logical_Router" as unhosted. A
    "Logical_Router" is considered as unhosted if the gateway
    "Logical_Router_Ports" have no "chassis" set.

    This artifact is also used to create a "Gateway_Chassis" register
    pointing to a nonexistent invalid chassis called
    "neutron-ovn-invalid-chassis". Any "Logical_Router_Port" not bound
    to a chassis will have no value in "gateway_chassis" (NOTE1).

    NOTE1: this is valid now with the current two OVN L3 schedulers that
    use "gateway_chassis" to schedule the "Logical_Router_Port" of a
    router. In the future, we can consider using "ha_chassis_group" for
    scheduling.

    Partial-Bug: #2052821
    Related-Bug: #2019217
    Change-Id: I12717936fe2bc188545309bacb8a260981f14c88
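
As a rough sketch of the rule stated in this commit message (a router counts as unhosted when its gateway ports have no chassis set), the check could look like the following. The function name and the dict-based port representation are hypothetical and not taken from the neutron code; only the "gateway_chassis" field name comes from the message above.

```python
def is_router_unhosted(gateway_ports):
    """Return True if none of the gateway ports is bound to a chassis.

    ``gateway_ports`` is a hypothetical list of dicts, each with an optional
    "gateway_chassis" entry holding the chassis assigned to that port.
    Note: under this sketch a router with no gateway ports at all is also
    reported as unhosted; that is an assumption, not confirmed by the source.
    """
    return all(not port.get("gateway_chassis") for port in gateway_ports)
```

Under this sketch, a router whose only gateway port has an empty "gateway_chassis" list is reported as unhosted, while a router with at least one bound gateway port is not.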

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/913574

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/913574
Committed: https://opendev.org/openstack/neutron/commit/0a421118f2c724267c9e70b926b53136f55b575d
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 0a421118f2c724267c9e70b926b53136f55b575d
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed May 24 07:15:35 2023 +0200

    [OVN] The L3 scheduler does not use all chassis by default

    Any OVN scheduler, inheriting from ``OVNGatewayScheduler``, calls
    ``_schedule_gateway`` to make the decision of in what chassis the
    router gateway should be located. Before this patch, if the list
    of candidates was empty, the scheduler used all available chassis
    as candidate list. This patch is removing this default behaviour.
    In a deployment, only those chassis marked explicitly with
    "ovn-cms-options=enable-chassis-as-gw" can be used as gateway
    chassis.

    After enabling this patch, any existing router gateway port will
    preserve the assigned chassis; any new router gateway will be
    scheduled only on the chassis configured as gateways.

    If a router gateway port cannot find any chassis to be scheduled,
    the "neutron-ovn-invalid-chassis" will be used instead and a
    warning message will be printed in the logs.

    Closes-Bug: #2019217
    Change-Id: If0f843463edfd7edc5c897cc098de31444f9d81b
    (cherry picked from commit 413044f253b6d434164e8a94dbeccec7b1b79ebe)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.2.0

This issue was fixed in the openstack/neutron 22.2.0 release.
