[L3-HA] "max_l3_agents_per_router" not honored when the redundancy is reduced

Bug #2006496 reported by Rodolfo Alonso
Affects   Status         Importance   Assigned to      Milestone
neutron   Fix Released   Medium       Rodolfo Alonso

Bug Description

Related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2166197

NOTE: Config option "router_distributed" must be False.

This issue occurs when a router is created with a given "max_l3_agents_per_router" value and the redundancy is later reduced by lowering that value.

For example, with "max_l3_agents_per_router=3", when we create an HA router, Neutron creates 3 instances of this router and the corresponding "routerl3agentbindings" rows. E.g.:
MariaDB [ovs_neutron]> select * from routerl3agentbindings;
+--------------------------------------+--------------------------------------+---------------+
| router_id                            | l3_agent_id                          | binding_index |
+--------------------------------------+--------------------------------------+---------------+
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 5c54529e-1f4e-4332-9674-96a1d15a16b2 |             1 |
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 8112e03e-9191-495d-aea2-d2d7cd621767 |             2 |
| f8d3fec5-4648-48e9-b546-94f2e135df77 | 850beebc-3144-4673-a4cc-142162dba436 |             3 |
+--------------------------------------+--------------------------------------+---------------+

Now we reduce the redundancy to "max_l3_agents_per_router=1". If we then remove agent assignments so that the remaining row has a "binding_index" other than 1, the next time the router is updated the L3 scheduler will create a new assignment with "binding_index=1". The scheduler is called whenever the router is updated (a subnet is added or removed, a FIP is assigned or removed, etc.). The method [1] determines the next binding index to create; it is used by both the DHCP scheduler and the L3 scheduler.

In the given example, if the rows with a lower "binding_index" are removed and only the row with index 3 remains, the vacant binding index method [1] will return 1:
  open_slots = sorted(list(all_indicies - set(binding_indices)))
  --> all_indicies = {1}
  --> binding_indices = {3} # for example
  --> open_slots = {1} # instead of an empty set(), as expected here.

[1] https://github.com/openstack/neutron/blob/7c3d6c414d3c0f085cae94b6f2186c4415a9298b/neutron/scheduler/base_scheduler.py#L102-L107
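
The difference between the set-based calculation above and a count-based one can be sketched in Python. This is only an illustration under stated assumptions, not the Neutron code: the function names, the plain-list input and the -1 sentinel for "no new binding needed" are hypothetical. The second function follows the approach described for the fix: compare the number of agents already bound with the number required, and only then return the lowest unused index.

```python
# Hypothetical sentinel: "do not schedule another agent" (an assumption of this sketch).
NO_VACANT_INDEX = -1


def vacant_index_naive(num_agents, binding_indices):
    """Simplified version of the set difference quoted in the report."""
    all_indices = set(range(1, num_agents + 1))
    open_slots = sorted(all_indices - set(binding_indices))
    # A leftover binding with a high index does not "fill" the lower slots.
    return open_slots[0] if open_slots else NO_VACANT_INDEX


def vacant_index_counting(num_agents, binding_indices):
    """Check how many agents are bound before looking for a free index."""
    used = set(binding_indices)
    if len(used) >= num_agents:
        # Enough agents are already bound, whatever their indexes are.
        return NO_VACANT_INDEX
    index = 1
    while index in used:
        index += 1
    return index


# max_l3_agents_per_router=1 and one remaining binding with index 3:
print(vacant_index_naive(1, [3]))      # 1: a second agent would be scheduled
print(vacant_index_counting(1, [3]))   # -1: redundancy of 1 is already met
print(vacant_index_counting(3, [1, 3]))  # 2: lowest unused index is returned
```

With the count-based check, a router that already has as many bindings as the configured maximum is left alone even if its remaining binding indexes are higher than that maximum.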

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Medium

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/873107

Changed in neutron:
status: New → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/873619

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/873622

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/873626

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/873627

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/873628

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/873629

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873107
Committed: https://opendev.org/openstack/neutron/commit/5250598c804a38c55ff78cfb457b73d1b3cd7e07
Submitter: "Zuul (22348)"
Branch: master

commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496
    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612

Changed in neutron:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873619
Committed: https://opendev.org/openstack/neutron/commit/0920f17f476ce8b398deea3e54e9f90b5251cfc9
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)

tags: added: in-stable-zed
tags: added: in-stable-yoga

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873622
Committed: https://opendev.org/openstack/neutron/commit/7dcf8be112ed205a6c694c1f3549e08b4234d82d
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 7dcf8be112ed205a6c694c1f3549e08b4234d82d
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py
        neutron/objects/l3agent.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)
    (cherry picked from commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9)

tags: added: in-stable-xena

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873626
Committed: https://opendev.org/openstack/neutron/commit/5bbe4390410af36859a0d1c871d20a5777f1a134
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 5bbe4390410af36859a0d1c871d20a5777f1a134
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py
        neutron/objects/l3agent.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)
    (cherry picked from commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9)
    (cherry picked from commit 7dcf8be112ed205a6c694c1f3549e08b4234d82d)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873627
Committed: https://opendev.org/openstack/neutron/commit/1cf679513471d411acbdafef1bcab1df6edd6108
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 1cf679513471d411acbdafef1bcab1df6edd6108
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py
        neutron/objects/l3agent.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)
    (cherry picked from commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9)
    (cherry picked from commit 7dcf8be112ed205a6c694c1f3549e08b4234d82d)

tags: added: in-stable-wallaby

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873628
Committed: https://opendev.org/openstack/neutron/commit/984503d4d083327ca5e525dfc420347128e508dd
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 984503d4d083327ca5e525dfc420347128e508dd
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py
        neutron/objects/l3agent.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)
    (cherry picked from commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9)
    (cherry picked from commit 7dcf8be112ed205a6c694c1f3549e08b4234d82d)

tags: added: in-stable-victoria

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.0.0.0rc1

This issue was fixed in the openstack/neutron 22.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873629
Committed: https://opendev.org/openstack/neutron/commit/ad597572289906afd5fc6e74038581944790245a
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit ad597572289906afd5fc6e74038581944790245a
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Feb 8 13:14:19 2023 +0100

    Improve scheduling L3/DHCP agents, missing lower binding indexes

    This patch is covering an edge case that could happen when the number
    of DHCP agents ("dhcp_agents_per_network") or L3 agents
    ("max_l3_agents_per_router") has been reduced and there are more agents
    assigned than the current number. If the user removes any agent
    assignation from a L3 router or a DHCP agent, it is possible to remove
    first the lower binding assigned registers.

    Now the method ``get_vacant_binding_index`` calculates the number of
    agents bound and the number required. If a new one is needed, the
    method returns first the lower binding indexes not used.

    Closes-Bug: #2006496

    Conflicts:
        neutron/common/_constants.py
        neutron/objects/l3agent.py

    Change-Id: I25145c088ffdca47acfcb7add02b1a4a615e4612
    (cherry picked from commit 5250598c804a38c55ff78cfb457b73d1b3cd7e07)
    (cherry picked from commit 0920f17f476ce8b398deea3e54e9f90b5251cfc9)
    (cherry picked from commit 7dcf8be112ed205a6c694c1f3549e08b4234d82d)

tags: added: in-stable-ussuri

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.6.0

This issue was fixed in the openstack/neutron 19.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.3.0

This issue was fixed in the openstack/neutron 20.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.1.0

This issue was fixed in the openstack/neutron 21.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ussuri-eol

This issue was fixed in the openstack/neutron ussuri-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron victoria-eom

This issue was fixed in the openstack/neutron victoria-eom release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom

This issue was fixed in the openstack/neutron wallaby-eom release.
