StarlingX

openstack-armada-app helm plugins need optimization

Bug #1886563 reported by Joseph Richard on 2020-07-06

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	StarlingX	Fix Released	Medium	Angie Wang

Bug Description

Brief Description
-----------------
The number of dbapi calls made by the openstack-armada-app helm plugins scales linearly with the number of worker nodes, which results in very poor performance on systems with a large number of nodes. This should be reduced to a constant number of dbapi calls through optimization and caching.

Severity
--------
Major: blocks using large system

Steps to Reproduce
------------------
Install OpenStack on a system with 18 worker nodes.
Attempt to lock a host.

Expected Behavior
------------------
evaluate_app_reapply finishes in less than 60 seconds

Actual Behavior
----------------
lock fails because evaluate_app_reapply takes longer than 60 seconds

Reproducibility
---------------
Reproducible

System Configuration
--------------------
large (18-worker) system

Last Pass
---------
unknown

Workaround
----------
Increase rpc_response_timeout

Tags:

Frank Miller (sensfan22) on 2020-07-07

Changed in starlingx:
assignee:	nobody → Suvro Ghosh (suvr0)

Ghada Khalil (gkhalil) wrote on 2020-07-15:

stx.5.0 / medium priority - scaling issue with a large system

description:	updated
description:	updated
tags:	added: stx.containers
tags:	added: stx.5.0
Changed in starlingx:
importance:	Undecided → Medium
status:	New → Triaged

Frank Miller (sensfan22) wrote on 2020-08-11:

Re-assigning to Angie to implement a solution.

Changed in starlingx:
assignee:	Suvro Ghosh (suvr0) → Angie Wang (angiewang)

OpenStack Infra (hudson-openstack) wrote on 2020-08-14: Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/746268

Changed in starlingx:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2020-08-14: Fix proposed to openstack-armada-app (master)

Fix proposed to branch: master
Review: https://review.opendev.org/746269

OpenStack Infra (hudson-openstack) wrote on 2020-08-14: Fix merged to openstack-armada-app (master)

Reviewed: https://review.opendev.org/746269
Committed: https://git.openstack.org/cgit/starlingx/openstack-armada-app/commit/?id=b21dfbda2285d8f7d70f7848892ee76beffcd5fc
Submitter: Zuul
Branch: master

commit b21dfbda2285d8f7d70f7848892ee76beffcd5fc
Author: Angie Wang <email address hidden>
Date: Thu Aug 13 16:32:28 2020 -0400

Optimize nova and neutron helm plugins

    The dbapi calls in the nova and neutron plugins scale linearly with
    the number of worker nodes, which results in poor performance on
    systems with a large number of nodes.

    Currently, the dbapi calls are invoked once for each worker node.
    This commit reduces them to a fixed number of calls.

    Tested stx-openstack upload and apply on a lab with 6 worker nodes.
    The time of override generation for nova was reduced from 11s to 0.6s
    and for neutron from 24s to 0.2s.

    Also tested on vbox and on a lab that has a "vf" type of interface.
    Verified that the content of the generated overrides is the same as
    before.

    Change-Id: I2d19d30b01e3348d6eb60b8e2681e3a30ef93ebc
    Partial-Bug: 1886563
    Signed-off-by: Angie Wang <email address hidden>
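
The change from per-node dbapi calls to a fixed number of calls, as described in the commit message above, can be illustrated with a minimal sketch. This is not the actual plugin code: FakeDbApi, its two methods, and both override functions are hypothetical names used only to show the call-count difference between querying per host and querying once and grouping in memory.

```python
from collections import defaultdict


class FakeDbApi:
    """Hypothetical stand-in for a database API; counts the calls made."""

    def __init__(self, interfaces):
        # Each interface is a dict with 'host_id' and 'name' keys.
        self._interfaces = interfaces
        self.calls = 0

    def interfaces_by_host(self, host_id):
        self.calls += 1
        return [i for i in self._interfaces if i["host_id"] == host_id]

    def all_interfaces(self):
        self.calls += 1
        return list(self._interfaces)


def overrides_per_host_queries(dbapi, host_ids):
    """One query per host: the call count grows with the number of hosts."""
    return {
        h: [i["name"] for i in dbapi.interfaces_by_host(h)]
        for h in host_ids
    }


def overrides_cached(dbapi, host_ids):
    """One bulk query grouped in memory: the call count stays constant."""
    cache = defaultdict(list)
    for iface in dbapi.all_interfaces():
        cache[iface["host_id"]].append(iface["name"])
    return {h: cache[h] for h in host_ids}
```

Both functions return the same mapping, but the first issues one call per host while the second issues a single call regardless of how many hosts are passed in.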

OpenStack Infra (hudson-openstack) wrote on 2020-08-14: Fix merged to config (master)

Reviewed: https://review.opendev.org/746268
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=039e033de76c1213dd324b7b72443be668670298
Submitter: Zuul
Branch: master

commit 039e033de76c1213dd324b7b72443be668670298
Author: Angie Wang <email address hidden>
Date: Fri Aug 14 00:20:32 2020 -0400

Updates according to the nova and neutron plugins optimization

    In the commit https://review.opendev.org/#/c/746269/, we keep some
    duplicate functions from sysinv/common/interface.py in the nova helm
    plugin, modified to use cached data instead of querying the DB, to
    improve performance. Add a note in sysinv/common/interface.py
    to indicate that any updates to the originals should be reflected in
    the duplicates in the nova helm plugin as well.

    We also optimize the neutron plugin to use the same Python object to
    generate the neutron helm overrides dictionary, avoiding unnecessary
    dbapi calls. However, when Python dictionary entries reference the
    same object, YAML recognizes this and uses anchors/aliases, placing
    &id001/*id001 in the yaml file.

    For example,
    host_overrides = self._get_per_host_overrides()

    overrides = {
        ...
        'overrides': {
            'neutron_ovs-agent': {
                'hosts': host_overrides
            },
            'neutron_dhcp-agent': {
                'hosts': host_overrides
            },
            ...
        }
        ...
    }

    overrides['overrides']['neutron_ovs-agent']['hosts'] and
    overrides['overrides']['neutron_dhcp-agent']['hosts'] are pointing to
    the same object "host_overrides".

    With yaml aliases enabled, the neutron override yaml file will have
    overrides:
      neutron_ovs-agent:
        hosts: &id001
          ...
          name: controller-0
      neutron_sriov-agent:
        hosts: *id001

    So disable YAML aliases to keep the generated yaml files readable.
    The aliases feature itself doesn't affect functionality.

    Change-Id: I07e13d1a260bb73f4a58e4fcef786cc246ce48e8
    Depends-On: https://review.opendev.org/#/c/746269/
    Closes-Bug: 1886563
    Signed-off-by: Angie Wang <email address hidden>
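
The alias behaviour described in the commit message above can be reproduced with a short sketch, assuming PyYAML is the YAML library in use. Overriding ignore_aliases in a Dumper subclass is one common way to suppress anchors and aliases; it is shown here as an illustration and is not necessarily how the sysinv change implements it. NoAliasDumper and the sample data are hypothetical.

```python
import yaml


class NoAliasDumper(yaml.SafeDumper):
    """Hypothetical dumper that never emits anchors or aliases."""

    def ignore_aliases(self, data):
        # Returning True for every node makes PyYAML write shared
        # objects out in full each time they are referenced.
        return True


# The same list object is referenced from two places, as in the
# neutron overrides example.
host_overrides = [{"name": "controller-0"}]
overrides = {
    "neutron_ovs-agent": {"hosts": host_overrides},
    "neutron_sriov-agent": {"hosts": host_overrides},
}

with_aliases = yaml.dump(
    overrides, Dumper=yaml.SafeDumper, default_flow_style=False)
without_aliases = yaml.dump(
    overrides, Dumper=NoAliasDumper, default_flow_style=False)
```

Both strings load back to the same data; only the textual form differs, which matches the statement that aliases do not affect functionality.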

Changed in starlingx:
status:	In Progress → Fix Released
