Automatic rescheduling of BGP speakers on DrAgents

Bug #1920065 reported by Renat Nurgaliyev
This bug affects 6 people
Affects: neutron
Status: Fix Released
Importance: Wishlist
Assigned to: Renat Nurgaliyev

Bug Description

When a dynamic routing agent (DrAgent) becomes unreachable, neutron takes these actions:

1. Removes all BGP speakers from the unreachable agents
2. Schedules all unassigned BGP speakers on available DrAgents
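
The two steps above can be sketched roughly as follows. This is only an illustrative model, not the neutron-dynamic-routing code: the `Agent` class, the `reschedule` function, and the round-robin placement are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for a DrAgent record; not a neutron class."""
    name: str
    alive: bool
    speakers: set = field(default_factory=set)


def reschedule(agents):
    """Apply the two steps to a list of agents.

    Returns the speakers that could not be placed on any agent.
    """
    # Step 1: remove all BGP speakers from unreachable agents.
    unassigned = []
    for agent in agents:
        if not agent.alive:
            unassigned.extend(sorted(agent.speakers))
            agent.speakers.clear()

    # Step 2: schedule unassigned speakers on available agents.
    # Round-robin placement is an assumption made for this sketch.
    available = [a for a in agents if a.alive]
    leftover = []
    for i, speaker in enumerate(unassigned):
        if available:
            available[i % len(available)].speakers.add(speaker)
        else:
            leftover.append(speaker)
    return leftover
```

In this model, when the only agent is down, step 1 still clears its speakers and step 2 has nowhere to put them, so they come back as leftovers.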

This behavior can be undesirable in the following cases:

1. Speakers are removed from a DrAgent even if no other agent is alive.
Sometimes I'd prefer them to stay configured exactly where they are and
come back once the DrAgent is online again, for example after the server
is restarted. Especially when there is only one active DrAgent, this can
leave speakers not configured on any DrAgent at all.

2. Sometimes it is desirable to let the operator control which components
run where. For example, not every node running a DrAgent can reach all
iBGP peers, and the network designer places route reflectors, DrAgents,
and BGP speakers in their appropriate places, keeping high availability
and other concerns in mind. In these setups, it can be better to let the
speaker fail along with the DrAgent that is down. Moving a speaker to
another DrAgent also changes the source IP address of the BGP session,
which is unpredictable and may require reconfiguration on the other side
of the BGP peering.

These situations can occur since the following change was introduced:
https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/478455

My proposal is to add a configuration flag to control this behavior:
https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/780675
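
To make the two cases and the proposed switch concrete, here is a minimal sketch. The function `handle_agent_down` and the `auto_reschedule` parameter are hypothetical names chosen for illustration. They are not actual neutron options, and picking the first alive agent is an assumption of the sketch.

```python
def handle_agent_down(bindings, alive_agents, auto_reschedule=True):
    """Return new speaker -> agent bindings after a liveness check.

    bindings: dict mapping speaker name to agent name.
    alive_agents: list of agent names that are currently reachable.
    auto_reschedule: hypothetical switch, not an actual neutron option.
    """
    if not auto_reschedule:
        # Static behaviour: leave every binding exactly where it is, so
        # the speaker comes back when its DrAgent comes back.
        return dict(bindings)

    result = {}
    for speaker, agent in bindings.items():
        if agent in alive_agents:
            # The agent is reachable, nothing changes.
            result[speaker] = agent
        elif alive_agents:
            # Case 2: the speaker moves, so the BGP session's source
            # IP address changes as well.
            result[speaker] = alive_agents[0]
        else:
            # Case 1: no alive agent left, the speaker is configured nowhere.
            result[speaker] = None
    return result
```

With `bindings = {"speaker-a": "dragent-1"}` and no alive agents, the default call maps `speaker-a` to `None`, while `auto_reschedule=False` keeps it bound to `dragent-1`.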

Revision history for this message
Brian Haley (brian-haley) wrote :

Assigned to Renat since the review is already started.

I pinged Slawek about needing RFE review of this as well. It seems simple enough to me.

Changed in neutron:
importance: Undecided → Wishlist
tags: added: l3-bgp
Changed in neutron:
assignee: nobody → Renat Nurgaliyev (rnurgaliyev)
status: New → Triaged
tags: added: rfe
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

Thanks for reporting this RFE. Let's discuss it in the next drivers meeting, which will be on Friday 26.03.2021: http://eavesdrop.openstack.org/#Neutron_drivers_Meeting

tags: added: rfe-triaged
removed: rfe
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

At the last drivers meeting we decided to approve this RFE. Thanks for the proposal.

tags: added: rfe-approved
removed: rfe-triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-dynamic-routing (master)
Changed in neutron:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron-dynamic-routing (master)

Reviewed: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/815265
Committed: https://opendev.org/openstack/neutron-dynamic-routing/commit/8a0ddf6051c81b982187bb062b194284398f5703
Submitter: "Zuul (22348)"
Branch: master

commit 8a0ddf6051c81b982187bb062b194284398f5703
Author: Dr. Jens Harbott <email address hidden>
Date: Mon Oct 25 11:50:24 2021 +0200

    Add a StaticScheduler without automatic scheduling

    The automatic scheduling that was introduced in [0] is having some
    issues. Add a StaticScheduler that can be used as an alternative for
    deployments that want explicit control over where their BGP speakers are
    getting scheduled.

    Add a job that runs with the new scheduler.

    [0] https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/478455

    Depends-On: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/815294
    Closes-Bug: 1920065
    Signed-off-by: Dr. Jens Harbott <email address hidden>
    Change-Id: Ib7fcd0c7371bc75089b10024ee1b6e75c98f0188

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-dynamic-routing (stable/xena)
Revision history for this message
Renat Nurgaliyev (rnurgaliyev) wrote :

I confirm that the fix proposed in https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/820099 fixes the initial issue.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron-dynamic-routing (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/820099
Committed: https://opendev.org/openstack/neutron-dynamic-routing/commit/3d5eddd70bccc6fad048afccb35099a1648f804c
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 3d5eddd70bccc6fad048afccb35099a1648f804c
Author: Dr. Jens Harbott <email address hidden>
Date: Mon Oct 25 11:50:24 2021 +0200

    Add a StaticScheduler without automatic scheduling

    The automatic scheduling that was introduced in [0] is having some
    issues. Add a StaticScheduler that can be used as an alternative for
    deployments that want explicit control over where their BGP speakers are
    getting scheduled.

    Add a job that runs with the new scheduler.

    [0] https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/478455

    Depends-On: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/815294
    Closes-Bug: 1920065
    Signed-off-by: Dr. Jens Harbott <email address hidden>
    Change-Id: Ib7fcd0c7371bc75089b10024ee1b6e75c98f0188
    (cherry picked from commit 8a0ddf6051c81b982187bb062b194284398f5703)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-dynamic-routing (stable/wallaby)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron-dynamic-routing (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/820680
Committed: https://opendev.org/openstack/neutron-dynamic-routing/commit/c5c86123754a44c20deda5e9c3de2ef4e864b7af
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit c5c86123754a44c20deda5e9c3de2ef4e864b7af
Author: Dr. Jens Harbott <email address hidden>
Date: Mon Oct 25 11:50:24 2021 +0200

    Add a StaticScheduler without automatic scheduling

    The automatic scheduling that was introduced in [0] is having some
    issues. Add a StaticScheduler that can be used as an alternative for
    deployments that want explicit control over where their BGP speakers are
    getting scheduled.

    Add a job that runs with the new scheduler.

    [0] https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/478455

    Depends-On: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/815294
    Closes-Bug: 1920065
    Signed-off-by: Dr. Jens Harbott <email address hidden>
    Change-Id: Ib7fcd0c7371bc75089b10024ee1b6e75c98f0188
    (cherry picked from commit 8a0ddf6051c81b982187bb062b194284398f5703)
    (cherry picked from commit 3d5eddd70bccc6fad048afccb35099a1648f804c)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron-dynamic-routing 18.1.0

This issue was fixed in the openstack/neutron-dynamic-routing 18.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron-dynamic-routing 19.1.0

This issue was fixed in the openstack/neutron-dynamic-routing 19.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron-dynamic-routing 20.0.0.0rc1

This issue was fixed in the openstack/neutron-dynamic-routing 20.0.0.0rc1 release candidate.
