[RFE] Distributed processing of OVSDB events in networking-ovn

Bug #1823715 reported by Lucas Alvares Gomes
Affects: networking-ovn
Status: Fix Released
Importance: High
Assigned to: Lucas Alvares Gomes

Bug Description

Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1697281

In networking-ovn, the OVSDB Monitor is responsible for listening to OVSDB events and acting on them. We use it extensively for various tasks, including critical ones such as monitoring port binding events (in order to notify Neutron/Nova that a port has been bound to a certain chassis).

Currently, the OVSDB Monitor class uses a distributed OVSDB lock to make sure that only one instance handles those events at a time. The problem with this approach is that it creates a bottleneck: even though many Neutron workers may be running, only one of them handles those OVSDB events.

The bottleneck is even more pronounced when working with other technologies such as containers, which rely on creating batches of ports and waiting for them to be bound to a chassis in a performant fashion.

For this RFE we need to think about a mechanism that would allow multiple events from OVSDB to be handled in a distributed fashion to improve the overall performance of those event actions.
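
To make the idea concrete, here is a minimal sketch of one possible mechanism: a consistent hash ring that maps each event to exactly one worker, so every worker can receive all events but only the owner processes a given one. This is an illustration only, not the code that was merged; the `HashRing` class, `should_handle()` helper, the worker names and the `replicas` parameter are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping event keys to worker node IDs."""

    def __init__(self, nodes, replicas=32):
        # Each node is placed on the ring several times ("virtual nodes")
        # so that keys spread more evenly across the workers.
        self._ring = sorted(
            (self._hash(f"{node}-{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        """Return the node owning `key`: first ring point at or after its hash."""
        if not self._ring:
            raise LookupError("hash ring is empty")
        # Wrap around to the first point when the hash is past the last one.
        index = bisect.bisect_left(self._points, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


def should_handle(ring, my_node_uuid, event_key):
    """A worker processes an event only if the ring maps the event to it."""
    return ring.get_node(event_key) == my_node_uuid
```

With this approach no lock is needed: each worker calls something like `should_handle(ring, my_uuid, event_key)` when notified and silently skips events owned by another node.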

Changed in networking-ovn:
importance: Undecided → High
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to networking-ovn (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/652040

OpenStack Infra (hudson-openstack) wrote : Related fix merged to networking-ovn (master)

Reviewed: https://review.openstack.org/652040
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=da94db44d97fa68bc637032a9f7a2d52fa34df28
Submitter: Zuul
Branch: master

commit da94db44d97fa68bc637032a9f7a2d52fa34df28
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri Apr 12 11:48:30 2019 +0100

    Design Doc: Distributed OVSDB events

    This patch introduces a design document highlighting the problem and
    proposing a solution for the way OVSDB events are currently handled in
    networking-ovn.

    Change-Id: I81df364ca1db078bffc42f2728888de4e6167601
    Related-Bug: #1823715
    Signed-off-by: Lucas Alvares Gomes <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-ovn (master)

Fix proposed to branch: master
Review: https://review.opendev.org/655407

Changed in networking-ovn:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/655408

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (master)

Reviewed: https://review.opendev.org/655407
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=3af381003e8ae78f56cae81a58dbc641c23485ee
Submitter: Zuul
Branch: master

commit 3af381003e8ae78f56cae81a58dbc641c23485ee
Author: Lucas Alvares Gomes <email address hidden>
Date: Tue Apr 9 11:58:42 2019 +0100

    Distributed OVSDB lock: HashRing common methods and DB migration

    This patch is responsible for creating the "ovn_hash_ring" database
    table and the common methods/classes to access it.

    Partial-Bug: #1823715
    Change-Id: I052791cda6264baf4497e1be2bf7d3d53c49fa60
    Signed-off-by: Lucas Alvares Gomes <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to networking-ovn (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/662484

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (master)

Reviewed: https://review.opendev.org/655408
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=6ef5489cf70cbc140a8768727350a8556fc8e870
Submitter: Zuul
Branch: master

commit 6ef5489cf70cbc140a8768727350a8556fc8e870
Author: Lucas Alvares Gomes <email address hidden>
Date: Wed Apr 10 13:19:34 2019 +0100

    Distributed OVSDB lock: Make use of the HashRing

    Changes:

    * New OvnIdlDistributedLock class added. This is now the base class
      which the NB and SB OVSDB IDLs will inherit from. The old OvnIdl
      class was kept because services like the Metadata agent and Octavia
      driver use it, and for those services the OVSDB lock seems sufficient
      (but nothing prevents us from updating them too in the future).

    * A new pre_fork_initialize() hook was added to the mechanism driver.
      This hook runs before the process is forked (to create the workers)
      and it does two things:

      - Set a signal handler for SIGTERM. If a SIGTERM is sent, the
        service will handle it and clean up the Hash Ring before
        exiting.

      - Clean up the Hash Ring at start up. If there are leftover entries
        (after a SIGKILL, for example), the workers will be considered
        part of the Hash Ring until they time out. At service start we
        can get rid of those dead entries.

    * Remove OvnWorker: The OvnWorker was responsible for running the
      ovn_db_sync() code and handling the events from OVSDB. Now, this
      patch moved the ovn_db_sync() into the MaintenanceWorker and since
      OVSDB events are now handled by all other workers in parallel there's
      no more use for the OvnWorker and it has been removed from the code.

    * The unit tests test_notify_no_ovsdb_lock() and
      test_notify_ovsdb_lock_not_yet_contended() were merged into a single
      test called test_notify_different_target_node(). Unlike the OVSDB
      lock, which has "locked" and "lock contended" states, the Hash Ring
      only cares about whether the hash matches the node_uuid of the
      instance or not.

    Closes-Bug: #1823715
    Change-Id: I00b24cd1f8eaae2386d732af34365fa1f81e565a
    Signed-off-by: Lucas Alvares Gomes <email address hidden>
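
The Hash Ring lifecycle described in the commit message above (clean up at start, clean up on SIGTERM, dead entries expiring after a timeout) could look roughly like the sketch below. It is not the merged code: the in-memory `HashRingRegistry` stands in for the "ovn_hash_ring" database table, and its fields, the 60 second timeout and the signature of `pre_fork_initialize()` are assumptions made for illustration.

```python
import signal
import time


class HashRingRegistry:
    """In-memory stand-in (hypothetical) for a shared table of worker nodes."""

    def __init__(self, timeout=60.0):
        self.timeout = timeout
        self._nodes = {}  # node_uuid -> (hostname, last_heartbeat)

    def add_node(self, node_uuid, hostname, now=None):
        """Register a node, or refresh its heartbeat if already present."""
        self._nodes[node_uuid] = (hostname, time.time() if now is None else now)

    def remove_nodes(self, hostname):
        """Drop every entry registered by `hostname`."""
        self._nodes = {
            uuid: entry for uuid, entry in self._nodes.items()
            if entry[0] != hostname
        }

    def active_nodes(self, now=None):
        """Return nodes whose last heartbeat is within the timeout."""
        now = time.time() if now is None else now
        return sorted(
            uuid for uuid, (_, seen) in self._nodes.items()
            if now - seen <= self.timeout
        )


def pre_fork_initialize(registry, hostname):
    """Hypothetical pre-fork hook, run once before workers are forked."""
    # Start-up cleanup: remove entries left behind by a previous run on
    # this host (for example after a SIGKILL, when no handler could run).
    registry.remove_nodes(hostname)

    def _on_sigterm(signum, frame):
        # Graceful shutdown: leave the ring before exiting.
        registry.remove_nodes(hostname)
        raise SystemExit(0)

    signal.signal(signal.SIGTERM, _on_sigterm)
```

Entries from a host that was killed and never restarted are not removed by either path; they simply stop appearing in `active_nodes()` once their heartbeat is older than the timeout.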

Changed in networking-ovn:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix merged to networking-ovn (master)

Reviewed: https://review.opendev.org/662484
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=b9af1a1c4cbdf497dadaaa275405ad500df6c9cc
Submitter: Zuul
Branch: master

commit b9af1a1c4cbdf497dadaaa275405ad500df6c9cc
Author: Lucas Alvares Gomes <email address hidden>
Date: Fri May 31 15:05:03 2019 +0100

    Add release note for the Distributed OVSDB events work

    This patch is a follow-up adding a release note for the work adding a
    new mechanism to handle OVSDB events in a distributed way.

    The release note is important to make people aware of the change itself
    as well as the new dependency on the tooz library.

    Related-Bug: #1823715
    Change-Id: I79b2aa4c9f09cc1b0cbff8e0252d28c5f4ea2a2c
    Signed-off-by: Lucas Alvares Gomes <email address hidden>

tags: added: networking-ovn-proactive-backport-potential
tags: added: networking-ovn-easy-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 7.0.0.0b1

This issue was fixed in the openstack/networking-ovn 7.0.0.0b1 development milestone.

tags: removed: networking-ovn-easy-proactive-backport-potential networking-ovn-proactive-backport-potential