schedule_and_build_instances looks up host az for every instance even if using the same host

Bug #1785327 reported by Matt Riedemann
This bug affects 1 person
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Low         Matt Riedemann
Pike                      Confirmed      Low         Unassigned
Queens                    Confirmed      Low         Unassigned
Rocky                     Fix Committed  Low         Matt Riedemann

Bug Description

This is a simple performance optimization. When multiple servers are created in the same affinity group, they all land on the same host. Likewise, if the scheduler is configured to pack instances onto as few hosts as possible rather than spread them, it can return the same host several times in the list of selections for the instances being scheduled. We iterate over the instances and their selected hosts, and inside that loop we look up the AZ for each host, which is a query against the aggregates table in the API DB. If the same host appears more than once in the list, we can avoid the redundant queries by caching the host-to-AZ mapping.

https://github.com/openstack/nova/blob/4c37ff72e5446c835a48d569dd5a1416fcd36c71/nova/conductor/manager.py#L1263
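To illustrate the idea, here is a minimal sketch of a per-request host-to-AZ cache. It is not the nova code: `Selection`, `lookup_az`, `assign_azs`, and the host and AZ names are all hypothetical stand-ins, and `lookup_az` only simulates the aggregates query by recording each call.

```python
from collections import namedtuple

# Hypothetical stand-in for a scheduler result: one selected host per instance.
Selection = namedtuple("Selection", ["instance_uuid", "host"])


def lookup_az(host, query_log):
    """Simulate the expensive per-host AZ lookup (assumed to hit the API DB).

    Each call is appended to query_log so the number of lookups can be counted.
    """
    query_log.append(host)
    fake_aggregates = {"compute1": "az-east", "compute2": "az-west"}
    return fake_aggregates.get(host, "nova")


def assign_azs(selections, lookup):
    """Return {instance_uuid: az}, calling lookup once per unique host."""
    host_az = {}  # cache scoped to this one request: host -> AZ
    result = {}
    for sel in selections:
        if sel.host not in host_az:
            # First time this host is seen in the request: do the real lookup.
            host_az[sel.host] = lookup(sel.host)
        result[sel.instance_uuid] = host_az[sel.host]
    return result


queries = []
selections = [
    Selection("inst-a", "compute1"),
    Selection("inst-b", "compute1"),  # same host as inst-a, served from cache
    Selection("inst-c", "compute2"),
]
azs = assign_azs(selections, lambda host: lookup_az(host, queries))
print(azs)
print(len(queries), "lookups for", len(selections), "instances")
```

Keeping the cache local to the function means it lives only for one scheduling request, so a later change to a host's aggregate membership is picked up on the next request rather than being hidden by stale data.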

Matt Riedemann (mriedem)
Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/588665

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/588665
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=27857c337378472205c37db6ab79fe8404406129
Submitter: Zuul
Branch: master

commit 27857c337378472205c37db6ab79fe8404406129
Author: Matt Riedemann <email address hidden>
Date: Fri Aug 3 17:26:00 2018 -0400

    Optimize AZ lookup during schedule_and_build_instances

    If we're creating multiple servers, there is a chance
    the scheduler returned the same host for more than
    one of them, which means we could be redundantly
    looking up the AZ for the same host multiple times.
    This could happen when creating multiple servers in
    the same strict affinity group, or simply if the
    scheduler is configured with a pack-first strategy
    for filling up hosts. The get_host_availability_zone()
    method does not use its own internal cache, so this
    change adds a simple cache to the schedule_and_build_instances
    method itself so that we only lookup the AZ per unique
    host once.

    Note that build_instances suffers from the same issue
    but that is only called for scheduling with cells v1
    which is deprecated so we shouldn't need to care about
    optimizing that method.

    Change-Id: I2ae5ae7240e5183acca7492ddad017c0c878835b
    Closes-Bug: #1785327

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/604378

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/604378
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3480454a4ba9746e3066edef0945c02ec24af07e
Submitter: Zuul
Branch: stable/rocky

commit 3480454a4ba9746e3066edef0945c02ec24af07e
Author: Matt Riedemann <email address hidden>
Date: Fri Aug 3 17:26:00 2018 -0400

    Optimize AZ lookup during schedule_and_build_instances

    If we're creating multiple servers, there is a chance
    the scheduler returned the same host for more than
    one of them, which means we could be redundantly
    looking up the AZ for the same host multiple times.
    This could happen when creating multiple servers in
    the same strict affinity group, or simply if the
    scheduler is configured with a pack-first strategy
    for filling up hosts. The get_host_availability_zone()
    method does not use its own internal cache, so this
    change adds a simple cache to the schedule_and_build_instances
    method itself so that we only lookup the AZ per unique
    host once.

    Note that build_instances suffers from the same issue
    but that is only called for scheduling with cells v1
    which is deprecated so we shouldn't need to care about
    optimizing that method.

    Change-Id: I2ae5ae7240e5183acca7492ddad017c0c878835b
    Closes-Bug: #1785327
    (cherry picked from commit 27857c337378472205c37db6ab79fe8404406129)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.2

This issue was fixed in the openstack/nova 18.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.
