Make host_manager use scatter-gather and ignore down cells

Bug #1746561 reported by Surya Seetharaman
This bug affects 2 people
Affects                   Status         Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released   Medium      Dan Smith
Pike                      Confirmed      Medium      Unassigned
Queens                    Fix Committed  Medium      Elod Illes

Bug Description

Currently, the "_get_computes_for_cells" function in the scheduler's host_manager queries the cells sequentially, which hurts performance in large deployments (those running many cells):

https://github.com/openstack/nova/blob/stable/pike/nova/scheduler/host_manager.py#L601

It would be better to use the scatter_gather_all_cells function to perform this operation in parallel.

Apart from the performance benefit, scatter_gather_all_cells returns a sentinel in place of a result when the connection to a particular cell fails, which helps when a cell is down.
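
To illustrate the request, here is a minimal standalone sketch of the scatter-gather pattern with sentinels. It is not nova's implementation: it uses the standard library's thread pool where nova has its own helper, and every name in it (scatter_gather, get_computes_for_cells as written here, DID_NOT_RESPOND, RAISED_EXCEPTION, fake_list_computes, the cell names) is hypothetical and chosen only for this example.

```python
import concurrent.futures
import logging
import time

LOG = logging.getLogger(__name__)

# Hypothetical sentinels for this sketch: one marks a cell that did not
# answer before the timeout, the other a cell whose query raised.
DID_NOT_RESPOND = object()
RAISED_EXCEPTION = object()


def scatter_gather(cells, fn, timeout=60):
    """Call fn(cell) for every cell in parallel.

    Returns a dict mapping each cell to its result, or to a sentinel
    if the cell failed or did not answer within `timeout` seconds.
    """
    executor = concurrent.futures.ThreadPoolExecutor(
        max_workers=max(len(cells), 1))
    futures = {cell: executor.submit(fn, cell) for cell in cells}
    done, _not_done = concurrent.futures.wait(
        futures.values(), timeout=timeout)

    results = {}
    for cell, future in futures.items():
        if future not in done:
            results[cell] = DID_NOT_RESPOND
            continue
        try:
            results[cell] = future.result()
        except Exception:
            LOG.exception("Query failed for cell %s", cell)
            results[cell] = RAISED_EXCEPTION

    # Do not block on stragglers; their results are simply discarded.
    executor.shutdown(wait=False)
    return results


def get_computes_for_cells(cells, list_computes, timeout=60):
    """Collect compute hosts per cell, skipping cells that are down."""
    computes = {}
    for cell, result in scatter_gather(cells, list_computes, timeout).items():
        if result is DID_NOT_RESPOND or result is RAISED_EXCEPTION:
            LOG.warning("Skipping cell %s: no usable response", cell)
            continue
        computes[cell] = result
    return computes


def fake_list_computes(cell):
    """Stand-in for a per-cell database query (assumption for the demo)."""
    if cell == "cell-down":
        raise ConnectionError("database unreachable")
    if cell == "cell-slow":
        time.sleep(0.5)
    return [cell + "-host1", cell + "-host2"]
```

Calling get_computes_for_cells(["cell1", "cell-down", "cell-slow", "cell2"], fake_list_computes, timeout=0.1) returns entries only for the healthy cells, so a caller can carry on with the hosts it did get back instead of failing on the first unreachable cell.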

Tags: cells
Changed in nova:
status: New → In Progress
Changed in nova:
assignee: Dan Smith (danms) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Dan Smith (danms)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/539617
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fdea8b723ba5a25ea9dc0917401fbb1401e05ee3
Submitter: Zuul
Branch: master

commit fdea8b723ba5a25ea9dc0917401fbb1401e05ee3
Author: Dan Smith <email address hidden>
Date: Wed Jan 31 09:30:11 2018 -0800

    Make host_manager use scatter-gather and ignore down cells

    This makes the host_manager query for computes in parallel across all the
    cells. It also ignores cells that fail or time out so that scheduling can
    proceed.

    Closes-Bug: #1746561
    Change-Id: I48d8b763f475c010fa48ee1db232a6d3ae75f5e6

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b2

This issue was fixed in the openstack/nova 18.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/637599

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/637599
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6933c880b8cea760c1e930b320b3d44b1e7e1b53
Submitter: Zuul
Branch: stable/queens

commit 6933c880b8cea760c1e930b320b3d44b1e7e1b53
Author: Dan Smith <email address hidden>
Date: Wed Jan 31 09:30:11 2018 -0800

    Make host_manager use scatter-gather and ignore down cells

    This makes the host_manager query for computes in parallel across all the
    cells. It also ignores cells that fail or time out so that scheduling can
    proceed.

    Closes-Bug: #1746561
    Change-Id: I48d8b763f475c010fa48ee1db232a6d3ae75f5e6
    (cherry picked from commit fdea8b723ba5a25ea9dc0917401fbb1401e05ee3)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.10

This issue was fixed in the openstack/nova 17.0.10 release.
