Related dvr routers aren't created on compute nodes

Bug #1884527 reported by Slawek Kaplonski on 2020-06-22
Affects: neutron
Importance: Medium
Assigned to: Slawek Kaplonski

Bug Description

We observed in our downstream (d/s) CI that the scenario test test_connectivity_through_2_routers fails in a topology with 3 controllers and 2 compute nodes.
It fails because related routers weren't created properly on the compute nodes. Only the routers to which VMs on the host were connected were created.

After investigation I found that this regression was caused by patch https://github.com/openstack/neutron/commit/48ea7da6c52ee14f0e9cc244fbc834283a8e74a7. In some cases, when a "related router" is updated on the L3 agent, the agent calls get_routers() in https://github.com/openstack/neutron/blob/390c4ac55f3ea883882412afdc1b921c4c3614e1/neutron/agent/l3/agent.py#L702. On the server side, that method checks whether the requested router is scheduled to the L3 agent and also looks for routers related to the routers on the host. But it doesn't look for routers related to the requested one, as the requested one is already a "related" router.
So when the L3 agent is already processing a related router and asks the server for details of that router, it doesn't get them, because the router isn't scheduled to that compute node (it is only related to another DVR router scheduled to the host).
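
The failure mode described above can be illustrated with a small toy model. This is only a sketch of the behaviour as reported here, not Neutron code: the names HOSTED, RELATED, routers_for_agent and agent_action, and the router and host IDs, are all hypothetical.

```python
# Toy model of the reported behaviour. All names and data are
# hypothetical; this is not Neutron code.

# Routers that have VM ports on each host.
HOSTED = {"compute-1": {"router-a"}}
# Routers connected to each other (for example through a shared subnet).
RELATED = {"router-a": {"router-b"}, "router-b": {"router-a"}}


def routers_for_agent(host, requested_ids=None):
    """Return the router IDs the server would send to the agent on host."""
    hosted = HOSTED.get(host, set())
    if not requested_ids:
        # Full sync: hosted routers plus the routers related to them.
        result = set(hosted)
        for router_id in hosted:
            result |= RELATED.get(router_id, set())
        return result
    # Targeted request: only routers hosted on the host are returned,
    # so a router that is merely related to a hosted one is dropped.
    return set(requested_ids) & hosted


def agent_action(host, router_id):
    """Mimic the agent: keep the router only if the server returned it."""
    reply = routers_for_agent(host, [router_id])
    return "configure" if router_id in reply else "remove"


full_sync = routers_for_agent("compute-1")
action_a = agent_action("compute-1", "router-a")
action_b = agent_action("compute-1", "router-b")
```

In this model a full sync returns both routers, but a targeted request for router-b returns nothing, so the agent treats router-b as gone and removes it from the compute node.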

Fix proposed to branch: master
Review: https://review.opendev.org/737286

Changed in neutron:
status: Confirmed → In Progress

Reviewed: https://review.opendev.org/737286
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6
Submitter: Zuul
Branch: master

commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    In case when related dvr router is configured by L3 agent, it is first
    added to the tasks queue and then processed as any other router hosted
    on the L3 agent.
    But if L3 agent will ask neutron server about details of such router,
    it wasn't returned back as this router wasn't really scheduled to the
    compute node which was asking for it. It was "only" related to some
    other router scheduled to this compute node. Because of that router's
    info wasn't found in reply from the neutron-server and L3 agent was
    removing it from the compute node.

    Now _get_router_ids_for_agent method from the l3_dvrscheduler_db module
    will check router serviceable ports for each dvr router hosted on the
    compute node and will then find all routers related to it. Thanks to
    that it will return routers which are on the compute node only because
    of other related routers scheduled to this host and such router will not
    be deleted anymore.

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
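
The idea of the fix in the commit message above can be sketched in a simplified form. This is a toy model under stated assumptions, not the actual _get_router_ids_for_agent implementation: HOSTED, RELATED, routers_on_host and routers_for_agent_fixed, and the router and host IDs, are hypothetical names used only for illustration.

```python
# Toy model of the idea behind the fix. All names and data are
# hypothetical; this is not the real l3_dvrscheduler_db code.

# Routers that have serviceable (VM) ports on each host.
HOSTED = {"compute-1": {"router-a"}, "compute-2": {"router-c"}}
# Routers connected to each other (for example through a shared subnet).
RELATED = {"router-a": {"router-b"}, "router-b": {"router-a"}}


def routers_on_host(host):
    """Hosted routers plus every router related to a hosted router."""
    hosted = HOSTED.get(host, set())
    result = set(hosted)
    for router_id in hosted:
        result |= RELATED.get(router_id, set())
    return result


def routers_for_agent_fixed(host, requested_ids=None):
    """Return the router IDs the server would send to the agent on host."""
    on_host = routers_on_host(host)
    if not requested_ids:
        return on_host
    # A requested router is returned if it is hosted on the host or is
    # related to a router hosted there.
    return set(requested_ids) & on_host


related_reply = routers_for_agent_fixed("compute-1", ["router-b"])
unrelated_reply = routers_for_agent_fixed("compute-1", ["router-c"])
```

Here a request for router-b from compute-1 is answered because router-b is related to router-a, while router-c, which has no relation to any router on compute-1, is still not returned.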

Changed in neutron:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/740462
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=afb359f3719a99adff25b4c13e697fb548922456
Submitter: Zuul
Branch: stable/rocky

commit afb359f3719a99adff25b4c13e697fb548922456
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
    (cherry picked from commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6)

tags: added: in-stable-rocky

Reviewed: https://review.opendev.org/740463
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=800be7b518e6933b654f820381cd73f8daa8945e
Submitter: Zuul
Branch: stable/queens

commit 800be7b518e6933b654f820381cd73f8daa8945e
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
    (cherry picked from commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6)

tags: added: in-stable-queens

Reviewed: https://review.opendev.org/740461
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=01f4e09753b69688a953a13b56cbe15b1b2d16f9
Submitter: Zuul
Branch: stable/stein

commit 01f4e09753b69688a953a13b56cbe15b1b2d16f9
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
    (cherry picked from commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6)

tags: added: in-stable-stein

Reviewed: https://review.opendev.org/740459
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=282710016968b353d10b6a986f48d7f8a7d0404a
Submitter: Zuul
Branch: stable/ussuri

commit 282710016968b353d10b6a986f48d7f8a7d0404a
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
    (cherry picked from commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6)

tags: added: in-stable-ussuri

Reviewed: https://review.opendev.org/740460
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3ae7b0f189d58e45ba9b2ec72f148e230d632ab3
Submitter: Zuul
Branch: stable/train

commit 3ae7b0f189d58e45ba9b2ec72f148e230d632ab3
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jun 22 17:08:15 2020 +0200

    [DVR] Related routers should be included if are requested

    Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
    Closes-Bug: #1884527
    (cherry picked from commit 38286dbd2e35f1f94bf06b5f1cfa9227e5d402c6)

tags: added: in-stable-train