Improve Performance of n-odl DB operations

Bug #1791348 reported by Sai Sindhur Malleni
This bug affects 1 person
Affects: networking-odl
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

When running a performance and scale testing scenario that creates 100 networks, subnets, and one VM per subnet, we observe that get_pending_or_processing_ops is called 5636 times (total time 0.426 seconds, cumulative time 135 seconds) and get_oldest_pending_db_row_with_lock is called 8450 times (total time 1.747 seconds, cumulative time 685 seconds).
We can use baked queries (http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html) to reduce the time spent in these functions.
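
The call counts, total times, and cumulative times quoted above have the shape of Python profiler output. Assuming a cProfile-style profiler was used (the report does not name the tool), total time is the time spent in the function body itself, while cumulative time also includes everything it calls. The sketch below shows how such figures can be collected; get_pending_ops and build_query are hypothetical stand-ins, not networking-odl code.

```python
import cProfile
import pstats


def build_query():
    """Stand-in for expensive work done by a callee (e.g. query construction)."""
    return sum(i * i for i in range(2000))


def get_pending_ops():
    """Hypothetical stand-in for a journal lookup; most time is in the callee."""
    return build_query()


profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    get_pending_ops()
profiler.disable()

stats = pstats.Stats(profiler)
timings = {}
# stats.stats maps (file, line, function name) to
# (primitive calls, total calls, total time, cumulative time, callers).
for (filename, line, name), (cc, nc, tt, ct, callers) in stats.stats.items():
    timings[name] = (nc, tt, ct)

calls, total_time, cumulative_time = timings["get_pending_ops"]
print(f"get_pending_ops: calls={calls} "
      f"tottime={total_time:.4f}s cumtime={cumulative_time:.4f}s")
```

A large gap between cumulative and total time, as in the figures reported here, points at the callees rather than the function body as the place where time is going.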

description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-odl (master)

Reviewed: https://review.openstack.org/591363
Committed: https://git.openstack.org/cgit/openstack/networking-odl/commit/?id=7fdbd351066e7014052caa9cc3af7f74b7952990
Submitter: Zuul
Branch: master

commit 7fdbd351066e7014052caa9cc3af7f74b7952990
Author: Sai Sindhur Malleni <email address hidden>
Date: Mon Aug 13 16:36:55 2018 +0530

    Implement Baked Query

    When running a performance and scale testing scenario to create
    100 networks, subnets and one VM per subnet, we observe that the function
    get_pending_or_processing_ops is called 5636 times (total time of 0.426 seconds
    and cumulative time of 135 seconds) and get_oldest_pending_db_row_with_lock
    is called 8450 times (total time 1.747 seconds and cumulative time 685 seconds).
    This patch implements baked queries to reduce time spent in these functions.
    The "baked" query system in SQLAlchemy caches the string
    SQL statement for a particular Query-construction path. As
    some profiling in Neutron has observed 30-40% of time spent in
    constructing Query objects due to the many small "fetch-one-object"
    style queries, the "baked" system may be able to greatly reduce
    this in Neutron and its drivers. Networking-odl is traditionally
    known to slow things down due to journaling, and this patch aims
    to improve that situation. Baked Query Docs:
    http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html

    Profiling networking-odl, we see that with baked queries the
    time spent in get_pending_or_processing_ops was reduced by 40.7%
    and the time spent in get_oldest_pending_db_row_with_lock by 60.6%.

    Closes-Bug: 1791348
    Change-Id: Ieb5f14e100ea48facc1bfd2169e29b42617c6ac1
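
The caching idea described in the commit message (one compiled SQL string per query-construction path, reused on later calls) can be illustrated without SQLAlchemy. The sketch below is a toy, standard-library-only analogue and not the code from this patch or SQLAlchemy's implementation; QueryBuilder, Bakery, the journal table layout, and get_oldest_pending_row are all hypothetical names chosen for illustration. In SQLAlchemy itself the feature is provided by the sqlalchemy.ext.baked module linked above.

```python
import sqlite3


class QueryBuilder:
    """Tiny stand-in for an ORM query object: collects WHERE/ORDER clauses."""

    def __init__(self, table):
        self.table = table
        self.where = []
        self.order = None

    def filter(self, clause):
        self.where.append(clause)
        return self

    def order_by(self, column):
        self.order = column
        return self

    def compile(self):
        sql = f"SELECT seqnum, state FROM {self.table}"
        if self.where:
            sql += " WHERE " + " AND ".join(self.where)
        if self.order:
            sql += f" ORDER BY {self.order}"
        return sql


class Bakery:
    """Caches compiled SQL, keyed by the sequence of builder steps."""

    def __init__(self):
        self._cache = {}
        self.compilations = 0  # how many times SQL was actually built

    def sql_for(self, table, steps):
        # The code objects of the step functions identify the construction
        # path; they stay the same across calls even though the lambdas
        # themselves are recreated each time.
        key = (table,) + tuple(step.__code__ for step in steps)
        if key not in self._cache:
            query = QueryBuilder(table)
            for step in steps:
                query = step(query)
            self._cache[key] = query.compile()
            self.compilations += 1
        return self._cache[key]


bakery = Bakery()


def get_oldest_pending_row(conn):
    """Hypothetical analogue of a journal lookup; not the driver's code."""
    steps = (
        lambda q: q.filter("state = :state"),
        lambda q: q.order_by("seqnum"),
    )
    sql = bakery.sql_for("journal", steps)
    # Values are bound at execution time, so the cached SQL is reusable.
    return conn.execute(sql + " LIMIT 1", {"state": "pending"}).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (seqnum INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO journal (seqnum, state) VALUES (?, ?)",
    [(1, "completed"), (2, "pending"), (3, "pending")],
)

rows = [get_oldest_pending_row(conn) for _ in range(1000)]
print(rows[0], "compilations:", bakery.compilations)
```

The query is constructed and compiled once and then looked up for the remaining calls, which is the saving the patch is after for functions that are invoked thousands of times with the same query shape.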

Changed in networking-odl:
status: New → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-odl (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/604755

OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-odl (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/607149

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-odl (stable/rocky)

Reviewed: https://review.openstack.org/607149
Committed: https://git.openstack.org/cgit/openstack/networking-odl/commit/?id=ebf603fb438058d01b889b20780bfb6a309b1ec2
Submitter: Zuul
Branch: stable/rocky

commit ebf603fb438058d01b889b20780bfb6a309b1ec2
Author: Sai Sindhur Malleni <email address hidden>
Date: Mon Aug 13 16:36:55 2018 +0530

    Implement Baked Query

    When running a performance and scale testing scenario to create
    100 networks, subnets and one VM per subnet, we observe that the function
    get_pending_or_processing_ops is called 5636 times (total time of 0.426 seconds
    and cumulative time of 135 seconds) and get_oldest_pending_db_row_with_lock
    is called 8450 times (total time 1.747 seconds and cumulative time 685 seconds).
    This patch implements baked queries to reduce time spent in these functions.
    The "baked" query system in SQLAlchemy caches the string
    SQL statement for a particular Query-construction path. As
    some profiling in Neutron has observed 30-40% of time spent in
    constructing Query objects due to the many small "fetch-one-object"
    style queries, the "baked" system may be able to greatly reduce
    this in Neutron and its drivers. Networking-odl is traditionally
    known to slow things down due to journaling, and this patch aims
    to improve that situation. Baked Query Docs:
    http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html

    Profiling networking-odl, we see that with baked queries the
    time spent in get_pending_or_processing_ops was reduced by 40.7%
    and the time spent in get_oldest_pending_db_row_with_lock by 60.6%.

    Closes-Bug: 1791348
    Change-Id: Ieb5f14e100ea48facc1bfd2169e29b42617c6ac1
    (cherry picked from commit 7fdbd351066e7014052caa9cc3af7f74b7952990)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-odl (stable/queens)

Reviewed: https://review.openstack.org/604755
Committed: https://git.openstack.org/cgit/openstack/networking-odl/commit/?id=d0821278f610b72f1268a81d9d458bf0e97a1180
Submitter: Zuul
Branch: stable/queens

commit d0821278f610b72f1268a81d9d458bf0e97a1180
Author: Sai Sindhur Malleni <email address hidden>
Date: Mon Aug 13 16:36:55 2018 +0530

    Implement Baked Query

    When running a performance and scale testing scenario to create
    100 networks, subnets and one VM per subnet, we observe that the function
    get_pending_or_processing_ops is called 5636 times (total time of 0.426 seconds
    and cumulative time of 135 seconds) and get_oldest_pending_db_row_with_lock
    is called 8450 times (total time 1.747 seconds and cumulative time 685 seconds).
    This patch implements baked queries to reduce time spent in these functions.
    The "baked" query system in SQLAlchemy caches the string
    SQL statement for a particular Query-construction path. As
    some profiling in Neutron has observed 30-40% of time spent in
    constructing Query objects due to the many small "fetch-one-object"
    style queries, the "baked" system may be able to greatly reduce
    this in Neutron and its drivers. Networking-odl is traditionally
    known to slow things down due to journaling, and this patch aims
    to improve that situation. Baked Query Docs:
    http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html

    Profiling networking-odl, we see that with baked queries the
    time spent in get_pending_or_processing_ops was reduced by 40.7%
    and the time spent in get_oldest_pending_db_row_with_lock by 60.6%.

    NOTE: This is not a clean cherry-pick. A series of changes [1] in
    master and stable/rocky refactors how the database interactions
    are done and precedes this change in the git history. So far we have
    not decided whether to backport these changes to stable/queens,
    so this patch had to be adapted as if the series of changes were
    not present. In particular, the changes that affect this patch the most
    are [2] and [3]. The absence of the aforementioned patches does not
    affect the functionality of this adapted patch.

    [1] https://review.openstack.org/#/q/project:openstack/networking-odl+branch:master+topic:bp/enginefacade-switch
    [2] 179fa53a8e6f60609068df5c1f4f058181b2f928
    [3] 1b1962896878d710adb43b712c6db6af862d8352

    Closes-Bug: 1791348
    Change-Id: Ieb5f14e100ea48facc1bfd2169e29b42617c6ac1
    (cherry picked from commit 7fdbd351066e7014052caa9cc3af7f74b7952990)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-odl 14.0.0.0b1

This issue was fixed in the openstack/networking-odl 14.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-odl 12.0.1

This issue was fixed in the openstack/networking-odl 12.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-odl 13.0.1

This issue was fixed in the openstack/networking-odl 13.0.1 release.
