Resource Tracker performance with Ironic driver

Bug #1816086 reported by Belmiro Moreira
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Eric Fried
Rocky
Fix Committed
High
Eric Fried
Stein
Fix Committed
High
Eric Fried

Bug Description

The problem is in rocky.

The resource tracker builds the resource provider tree and it's updated 2 times in "_update_available_resource".
With "_init_compute_node" and in the "_update_available_resource" itself.

The problem is that the RP tree will contain all the ironic RP and all the tree is flushed to placement (2 times as described above) when the periodic task iterate per Ironic RP.

In our case with 1700 ironic nodes, the period task takes:
1700 x (2 x 7s) = ~6h

+++

mitigations:
- shard nova-compute. Have several nova-computes dedicated to ironic.
Most of the current deployments only use 1 nova-compute to avoid resources shuffle/recreation between nova-computes.
Several nova-computes will be need to accommodate the load.

- why do we need to do the full resource provider tree flush to placement and not only the RP that is being considered?
As a work around we are doing this now!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/637225

Changed in nova:
assignee: nobody → Eric Fried (efried)
status: New → In Progress
Revision history for this message
Eric Fried (efried) wrote :

Talked this over with Belmiro in IRC [1].

The bulk of the time is apparently *not* spent talking to placement, but rather locally in the last loop of update_from_provider_tree [2]. Upon inspection, this loop is O(N) just to *find* each provider in the ProviderTree, making the overall loop O(N^2). The proposed patch [3] makes finding a single provider O(1), reducing the overall to O(N).

Let's see if this helps.

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-02-15.log.html#t2019-02-15T15:58:13 (interleaved conversation; highlight 'belmoreira' to find the relevant bits)
[2] https://github.com/openstack/nova/blob/880327cc31fea7328d23355730d5458f3b74662b/nova/scheduler/client/report.py#L1439-L1446
[3] https://review.openstack.org/637225

Changed in nova:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/637225
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c797450cbff5194fb6791cd0a07fa060dc8af72
Submitter: Zuul
Branch: master

commit 8c797450cbff5194fb6791cd0a07fa060dc8af72
Author: Eric Fried <email address hidden>
Date: Fri Feb 15 10:54:36 2019 -0600

    Perf: Use dicts for ProviderTree roots

    ProviderTree used to keep track of root providers in a list. Since we
    don't yet have sharing providers, this would always be a list of one for
    non-ironic deployments, or N for ironic deployments of N nodes.

    To find a provider (by name or UUID), we would iterate over this list,
    an O(N) operation. For large ironic deployments, this added up fast -
    see the referenced bug.

    With this change, we store roots in two dicts: one keyed by UUID, one
    keyed by name. To find a provider, we first check these dicts. If the
    provider we're looking for is a root, this is now O(1). (If it's a
    child, it would still be O(N), because we iterate over all the roots
    looking for a descendant that matches. But ironic deployments don't have
    child providers (yet?) (right?) so that should be n/a. For non-ironic
    deployments it's unchanged: O(M) where M is the number of descendants,
    which should be very small for the time being.)

    Test note: Existing tests in nova.tests.unit.compute.test_provider_tree
    thoroughly cover all the affected code paths. There was one usage of
    ProviderTree.roots that was untested and broken (even before this
    change) which is now fixed.

    Change-Id: Ibf430a8bc2a2af9353b8cdf875f8506377a1c9c2
    Closes-Bug: #1816086

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
tags: added: ironic performance resource-tracker
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/670179

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/670182

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/670179
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=754d8eb76c6fcd539fc2c32f0e2f90f6503b9f0c
Submitter: Zuul
Branch: stable/stein

commit 754d8eb76c6fcd539fc2c32f0e2f90f6503b9f0c
Author: Eric Fried <email address hidden>
Date: Fri Feb 15 10:54:36 2019 -0600

    Perf: Use dicts for ProviderTree roots

    ProviderTree used to keep track of root providers in a list. Since we
    don't yet have sharing providers, this would always be a list of one for
    non-ironic deployments, or N for ironic deployments of N nodes.

    To find a provider (by name or UUID), we would iterate over this list,
    an O(N) operation. For large ironic deployments, this added up fast -
    see the referenced bug.

    With this change, we store roots in two dicts: one keyed by UUID, one
    keyed by name. To find a provider, we first check these dicts. If the
    provider we're looking for is a root, this is now O(1). (If it's a
    child, it would still be O(N), because we iterate over all the roots
    looking for a descendant that matches. But ironic deployments don't have
    child providers (yet?) (right?) so that should be n/a. For non-ironic
    deployments it's unchanged: O(M) where M is the number of descendants,
    which should be very small for the time being.)

    Test note: Existing tests in nova.tests.unit.compute.test_provider_tree
    thoroughly cover all the affected code paths. There was one usage of
    ProviderTree.roots that was untested and broken (even before this
    change) which is now fixed.

    Change-Id: Ibf430a8bc2a2af9353b8cdf875f8506377a1c9c2
    Closes-Bug: #1816086
    (cherry picked from commit 8c797450cbff5194fb6791cd0a07fa060dc8af72)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/670182
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=00e5e3a7443dd31720244497013b040729490dcd
Submitter: Zuul
Branch: stable/rocky

commit 00e5e3a7443dd31720244497013b040729490dcd
Author: Eric Fried <email address hidden>
Date: Fri Feb 15 10:54:36 2019 -0600

    Perf: Use dicts for ProviderTree roots

    ProviderTree used to keep track of root providers in a list. Since we
    don't yet have sharing providers, this would always be a list of one for
    non-ironic deployments, or N for ironic deployments of N nodes.

    To find a provider (by name or UUID), we would iterate over this list,
    an O(N) operation. For large ironic deployments, this added up fast -
    see the referenced bug.

    With this change, we store roots in two dicts: one keyed by UUID, one
    keyed by name. To find a provider, we first check these dicts. If the
    provider we're looking for is a root, this is now O(1). (If it's a
    child, it would still be O(N), because we iterate over all the roots
    looking for a descendant that matches. But ironic deployments don't have
    child providers (yet?) (right?) so that should be n/a. For non-ironic
    deployments it's unchanged: O(M) where M is the number of descendants,
    which should be very small for the time being.)

    Test note: Existing tests in nova.tests.unit.compute.test_provider_tree
    thoroughly cover all the affected code paths. There was one usage of
    ProviderTree.roots that was untested and broken (even before this
    change) which is now fixed.

    Conflicts (rocky backport):
      nova/compute/provider_tree.py
         The return_root kwarg to _find_with_lock was added in stein.
      nova/tests/unit/virt/libvirt/test_driver.py
         and
      nova/virt/libvirt/driver.py
         are n/a because the code iterating over the provider tree roots was
         added in stein (for vgpu handling)

    Change-Id: Ibf430a8bc2a2af9353b8cdf875f8506377a1c9c2
    Closes-Bug: #1816086
    (cherry picked from commit 8c797450cbff5194fb6791cd0a07fa060dc8af72)
    (cherry picked from commit 754d8eb76c6fcd539fc2c32f0e2f90f6503b9f0c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.2

This issue was fixed in the openstack/nova 19.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.2.2

This issue was fixed in the openstack/nova 18.2.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.