We shouldn't schedule routers on compute nodes for DVR environments

Bug #1743745 reported by Daniel Alvarez
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
networking-ovn | Fix Released | High | venkata anil |

Bug Description

When we deploy a DVR environment in which we want to allow FIPs on compute nodes, we need to configure bridge-mappings on those nodes so that they have external connectivity.

Having bridge-mappings configured on compute nodes makes them eligible for router scheduling, so a router may end up scheduled on a compute node. The Neutron reference implementation does SNAT only on controllers, so we should follow the same approach in networking-ovn and schedule routers only on controller nodes.

Right now we have the following problem:

R1 scheduled on compute nodes C1, C2, C3 (L3HA).
VMs on compute nodes C4, C5, ..., CN using R1 for SNAT.

If C1, C2 and C3 have to be rebooted during a maintenance operation, none of the VMs running on C4, ..., CN can get through SNAT (even though floating IPs and east-west traffic would still work).

As we don't yet have distributed SNAT, the quickest fix would be to work out a way to decide which nodes are eligible for scheduling a router, rather than relying on the bridge-mappings criterion alone.

venkata anil (anil-venkata) wrote :

Want to address this issue with https://review.openstack.org/#/c/486098/

Changed in networking-ovn:
assignee: nobody → venkata anil (anil-venkata)
Changed in networking-ovn:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-ovn (master)

Fix proposed to branch: master
Review: https://review.openstack.org/537844

OpenStack Infra (hudson-openstack) wrote : Change abandoned on networking-ovn (master)

Change abandoned by venkata anil (<email address hidden>) on branch: master
Review: https://review.openstack.org/486098
Reason: Abandoned in favour of https://review.openstack.org/#/c/537844/

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (master)

Reviewed: https://review.openstack.org/537844
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=d265e93698eee81c1cae0c357883a8c06562acab
Submitter: Zuul
Branch: master

commit d265e93698eee81c1cae0c357883a8c06562acab
Author: venkata anil <email address hidden>
Date: Thu Jan 25 10:55:51 2018 +0000

    Schedule gateway on chassis from ovn-cms-options

    Admin sets ovn-cms-options in external_ids as

    ovs-vsctl set open .
       external_ids:ovn-cms-options="enable-chassis-as-gw"

    to enable a chassis as a candidate for scheduling gateway router.
    Networking-ovn will parse ovn-cms-options and select this chassis
    if it has proper bridge mappings.

    This helps admin to exclude compute nodes to host gateway routers as
    they are more likely to be restarted for maintenance operations.

    We follow this order for selecting candidates
    1) candidates with ovn-cms-options and proper bridge mappings
    2) if no candidates from 1), then chassis with proper
       bridge mappings

    Closes-bug: #1743745
    Change-Id: I86fbe27f0b6a9317ad82c2bcf2a0446d118de35b
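
The selection order described in the commit message above can be sketched roughly as follows. This is an illustration only, not the networking-ovn scheduler code: the `Chassis` class and the helper names are hypothetical, and the `ovn-bridge-mappings` key with its `physnet:bridge` comma-separated format is an assumption about how the bridge mappings appear in `external_ids`.

```python
from dataclasses import dataclass, field

# Option string taken from the commit message.
GW_OPTION = "enable-chassis-as-gw"


@dataclass
class Chassis:
    """Hypothetical stand-in for a chassis record; not a networking-ovn type."""
    name: str
    external_ids: dict = field(default_factory=dict)


def bridge_mapping_physnets(chassis):
    # Assumed format: "physnet1:br-ex,physnet2:br-eth1".
    raw = chassis.external_ids.get("ovn-bridge-mappings", "")
    return {m.split(":", 1)[0].strip() for m in raw.split(",") if ":" in m}


def is_gateway_enabled(chassis):
    # ovn-cms-options is treated here as a comma-separated list of options.
    raw = chassis.external_ids.get("ovn-cms-options", "")
    return GW_OPTION in [opt.strip() for opt in raw.split(",")]


def gateway_candidates(chassis_list, physnet):
    # Chassis with a bridge mapping for the external network.
    mapped = [c for c in chassis_list if physnet in bridge_mapping_physnets(c)]
    # 1) Prefer chassis that also carry the enable-chassis-as-gw option.
    preferred = [c for c in mapped if is_gateway_enabled(c)]
    # 2) Fall back to every chassis with a suitable bridge mapping.
    return preferred or mapped


controller = Chassis("controller-1", {
    "ovn-bridge-mappings": "public:br-ex",
    "ovn-cms-options": "enable-chassis-as-gw",
})
compute = Chassis("compute-1", {"ovn-bridge-mappings": "public:br-ex"})

print([c.name for c in gateway_candidates([controller, compute], "public")])
print([c.name for c in gateway_candidates([compute], "public")])
```

With both chassis present, only the controller is a candidate. If no chassis sets the option, the compute node is still returned, which matches the fallback in step 2) and means the option has to be set on at least one suitable chassis to keep routers off compute nodes.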

Changed in networking-ovn:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-ovn (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/556377

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (stable/queens)

Reviewed: https://review.openstack.org/556377
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=e9b82bd594ee1fa41dcfc8b0ca673befd740e9f0
Submitter: Zuul
Branch: stable/queens

commit e9b82bd594ee1fa41dcfc8b0ca673befd740e9f0
Author: venkata anil <email address hidden>
Date: Thu Jan 25 10:55:51 2018 +0000

    Schedule gateway on chassis from ovn-cms-options

    Admin sets ovn-cms-options in external_ids as

    ovs-vsctl set open .
       external_ids:ovn-cms-options="enable-chassis-as-gw"

    to enable a chassis as a candidate for scheduling gateway router.
    Networking-ovn will parse ovn-cms-options and select this chassis
    if it has proper bridge mappings.

    This helps admin to exclude compute nodes to host gateway routers as
    they are more likely to be restarted for maintenance operations.

    We follow this order for selecting candidates
    1) candidates with ovn-cms-options and proper bridge mappings
    2) if no candidates from 1), then chassis with proper
       bridge mappings

    Conflicts:
     doc/source/admin/refarch/refarch.rst

    Closes-bug: #1743745
    Change-Id: I86fbe27f0b6a9317ad82c2bcf2a0446d118de35b

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 5.0.0.0b1

This issue was fixed in the openstack/networking-ovn 5.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 4.0.1

This issue was fixed in the openstack/networking-ovn 4.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 4.0.2

This issue was fixed in the openstack/networking-ovn 4.0.2 release.

Changed in networking-ovn:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 4.0.3

This issue was fixed in the openstack/networking-ovn 4.0.3 release.
