[cellv2] performance issue when creating 500 instances concurrently

Bug #1737465 reported by xhzhf on 2017-12-11
This bug affects 3 people
Affects: OpenStack Compute (nova)
Importance: Medium
Assigned to: Unassigned

Bug Description

Description
===========
We tested the cells v2 solution by creating instances concurrently.
There is a performance issue caused by querying instance info in the scheduler process.

Steps to reproduce
==================
Use rally to create 500 instances concurrently.

Expected result
===============
all instances are created successfully

Actual result
=============
Many instances failed due to timeouts.

Environment
===========
1. Exact version of OpenStack you are running:
openstack-nova-scheduler-16.0.3-1.el7.noarch
openstack-nova-placement-api-16.0.3-1.el7.noarch

2. Which hypervisor did you use?
KVM. We have 2000 hosts

3. Which storage type did you use?
Our own storage device

4. Which networking type did you use?
Our own network device

Analysis
========
In the scheduler process, get_host_states_by_uuids calls _get_host_state,
which calls host_state.update and _get_instance_info.
So _get_instance_info is called for every request and for every host,
because cells v2 has the instance info cache disabled.
As a result, there is a performance issue even when we don't use an affinity filter policy.
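The call chain described above can be sketched as follows. This is a simplified illustration with stand-in names mirroring nova's HostManager, not the real nova code:

```python
# Simplified sketch of the scheduler call chain described above.
# Names mirror nova's HostManager, but this is an illustration, not nova code.

class HostState:
    def __init__(self, host):
        self.host = host
        self.instances = {}

    def update(self, instance_info):
        # With the instance info cache disabled, instance_info is always
        # fetched fresh, so this costs one DB query per host, per request.
        self.instances = instance_info


def _get_instance_info(db, host):
    # Stand-in for the per-host database query.
    return {uuid: inst for uuid, inst in db.items() if inst["host"] == host}


def get_host_states_by_uuids(db, hosts):
    states = []
    for host in hosts:
        state = HostState(host)
        # _get_host_state -> host_state.update -> _get_instance_info
        state.update(_get_instance_info(db, host))
        states.append(state)
    return states


db = {"uuid-1": {"host": "cn1"}, "uuid-2": {"host": "cn2"}}
states = get_host_states_by_uuids(db, ["cn1", "cn2"])
```

With 2000 hosts and 500 concurrent requests, that inner query runs on the order of a million times, which is the scaling problem the bug describes.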

Changed in nova:
assignee: nobody → Surya Seetharaman (tssurya)
melanie witt (melwitt) on 2018-02-14
tags: added: cells scheduler
melanie witt (melwitt) wrote :

Sorry for getting to this so long after it was opened. What do you mean by, "Because cellv2 have disable instance info cache."? I don't notice anything in the code that is deliberately disabling the instance info cache in the HostManager.

melanie witt (melwitt) wrote :

To be clear, if the cache was disabled, it was unintentional and it is a bug.

Jiang (jiangpf) wrote :

This is a performance issue. Compute instance info is only used by the affinity filters.
The anti-affinity filter decides whether an instance belongs to the current compute node; the current code does this by checking if the instance uuid is in the compute node's instance list.
Instead, we could look up the current instance directly: if it exists, get the host it belongs to, and then, in the anti-affinity filter, compare the compute node's host name with the instance's host name.
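The alternative check proposed above can be sketched like this. All names here are hypothetical illustrations, not nova code:

```python
# Hypothetical sketch of the proposed check: instead of loading every
# instance on a host and testing membership, look up the one instance
# by uuid and compare its host to the candidate host.

instances_by_uuid = {
    "uuid-1": {"host": "cn1"},
    "uuid-2": {"host": "cn2"},
}


def violates_anti_affinity(instance_uuid, candidate_host):
    """Return True if the instance already runs on the candidate host."""
    inst = instances_by_uuid.get(instance_uuid)
    if inst is None:
        return False  # instance doesn't exist yet, so no conflict
    return inst["host"] == candidate_host


conflict = violates_anti_affinity("uuid-1", "cn1")
```

This turns a per-host scan of all instances into a single per-instance lookup.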

Changed in nova:
assignee: Surya Seetharaman (tssurya) → Jiang (jiangpf)
status: New → Confirmed
Matt Riedemann (mriedem) wrote :

By "Because cellv2 have disable instance info cache." I assume you mean you have disabled the track_instance_changes configuration option, correct?

https://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.track_instance_changes
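For reference, the option lives in the [filter_scheduler] group of nova.conf; disabling it looks like this:

```ini
[filter_scheduler]
# When False, computes stop pushing instance change updates to the
# scheduler, and the HostManager has to query instance info itself.
track_instance_changes = False
```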

Changed in nova:
assignee: Jiang (jiangpf) → Matt Riedemann (mriedem)
importance: Undecided → Medium
Matt Riedemann (mriedem) wrote :

By the way, "track_instance_changes" doesn't really have anything to do with cells v2 unless you have a split MQ deployment where the scheduler and compute services (which are in the cells) are on different MQs - because then the computes can't RPC cast to the scheduler topic "at the top" of the deployment, as discussed here:

https://docs.openstack.org/nova/latest/user/cellsv2-layout.html

Fix proposed to branch: master
Review: https://review.openstack.org/569218

Changed in nova:
status: Confirmed → In Progress
Matt Riedemann (mriedem) wrote :

You said, "So there's a performance issue even we don't use affinity filter policy" - are you actually not using any of the ServerGroup(Anti)AffinityFilter or SameHost/DifferentHostFilter filters? If so, we could potentially short-circuit the HostManager._get_instance_info() method to not even pull instances from the database if an affinity filter is not enabled.

That would break any out of tree filters that rely on the HostState.instances information, however, so we'd have to consider that if it were a change we made.
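The short-circuit idea above could look roughly like this. This is a sketch with hypothetical helper names, not the actual nova implementation:

```python
# Sketch of the short-circuit idea: skip the per-host instance query
# entirely when no enabled filter needs HostState.instances.
# Helper names are hypothetical; the filter names are nova's in-tree ones.

AFFINITY_FILTERS = {
    "ServerGroupAffinityFilter",
    "ServerGroupAntiAffinityFilter",
    "SameHostFilter",
    "DifferentHostFilter",
}


def get_instance_info(enabled_filters, query_instances, host):
    if not AFFINITY_FILTERS & set(enabled_filters):
        # No enabled in-tree filter looks at HostState.instances,
        # so skip the DB hit and return an empty dict.
        return {}
    return query_instances(host)


calls = []

def fake_query(host):
    calls.append(host)
    return {"uuid-1": object()}


info = get_instance_info(["ComputeFilter"], fake_query, "cn1")
```

As the comment notes, out-of-tree filters relying on HostState.instances would see an empty dict, which is why this was only a potential change.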

Changed in nova:
assignee: Matt Riedemann (mriedem) → Eric Fried (efried)

Reviewed: https://review.openstack.org/569218
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a1a335f19a986a593487a802d7b9448f7efc5bab
Submitter: Zuul
Branch: master

commit a1a335f19a986a593487a802d7b9448f7efc5bab
Author: Matt Riedemann <email address hidden>
Date: Thu May 17 14:36:23 2018 -0400

    Avoid unnecessary joins in HostManager._get_instances_by_host

    While getting a HostState object for a given compute node during
    scheduling, if the HostState does not have its instance info
    set, either because it's out of date or because config option
    "track_instance_changes" is False, the HostManager still pulls
    the list of instances for that host from the database and stores
    it in HostState.instances.

    This is *only* used (in-tree) by the affinity filters and even
    then the only thing those filters use from HostState.instances
    is the set of keys from the dict, which is the list of instance
    UUIDs on a given host. The actual Instance objects aren't used
    at all. See blueprint put-host-manager-instance-info-on-a-diet
    for more details on that.

    The point of this change, is that when we go to pull the set
    of instances from the database for a given host, we don't need
    to join on the default columns (info_cache and security_groups)
    defined in the _instance_get_all_query() method in the DB API.

    This should be at least some minor optimization in scheduling
    for hosts that have several instances on them in a large cloud.
    As noted in the comment in the code, any out of tree filters
    that rely on using the info_cache or security_groups from the
    instance are now going to be hit with a lazy-load penalty
    per instance, but we have no contract on out of tree filters
    so if this happens, the people maintaining said filters can
    (1) live with it (2) fork the HostManager code or (3) upstream
    their filter so it's in-tree.

    A more impactful change would be to refactor
    HostManager._get_host_states to bulk query the instances on
    the given set of compute nodes in a single query per cell. But
    that is left for a later change.

    Change-Id: Iccefbfdfa578515a004ef6ac718bac1a49d5c5fd
    Partial-Bug: #1737465
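The join-avoidance described in the commit can be illustrated as follows. The function here is a stand-in sketch modeled on nova's DB API (which accepts a columns_to_join parameter), not the real implementation:

```python
# Illustrative sketch of the change above: ask the DB layer for instances
# without joining the default info_cache and security_groups columns,
# since the affinity filters only need the instance uuids.
# instance_get_all_by_host here is a stand-in, not nova's real DB API.

def instance_get_all_by_host(db, host, columns_to_join=None):
    # The real query joins info_cache and security_groups by default
    # unless an explicit join list is passed.
    joined = columns_to_join if columns_to_join is not None else [
        "info_cache", "security_groups"]
    return [dict(inst, joined=list(joined))
            for inst in db if inst["host"] == host]


db = [{"uuid": "uuid-1", "host": "cn1"}]

# Before: default joins are loaded for every instance on the host.
before = instance_get_all_by_host(db, "cn1")

# After: pass an empty join list so only the bare instance rows come back.
after = instance_get_all_by_host(db, "cn1", columns_to_join=[])
```

The trade-off, as the commit notes, is that any out-of-tree filter touching info_cache or security_groups now pays a lazy-load round trip per instance.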

Eric Fried (efried) on 2018-05-19
Changed in nova:
assignee: Eric Fried (efried) → Matt Riedemann (mriedem)

Reviewed: https://review.openstack.org/570083
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bfb9d3f179607c25f91d77585f97df6d0704c253
Submitter: Zuul
Branch: stable/queens

commit bfb9d3f179607c25f91d77585f97df6d0704c253
Author: Matt Riedemann <email address hidden>
Date: Thu May 17 14:36:23 2018 -0400

    Avoid unnecessary joins in HostManager._get_instances_by_host

    While getting a HostState object for a given compute node during
    scheduling, if the HostState does not have its instance info
    set, either because it's out of date or because config option
    "track_instance_changes" is False, the HostManager still pulls
    the list of instances for that host from the database and stores
    it in HostState.instances.

    This is *only* used (in-tree) by the affinity filters and even
    then the only thing those filters use from HostState.instances
    is the set of keys from the dict, which is the list of instance
    UUIDs on a given host. The actual Instance objects aren't used
    at all. See blueprint put-host-manager-instance-info-on-a-diet
    for more details on that.

    The point of this change, is that when we go to pull the set
    of instances from the database for a given host, we don't need
    to join on the default columns (info_cache and security_groups)
    defined in the _instance_get_all_query() method in the DB API.

    This should be at least some minor optimization in scheduling
    for hosts that have several instances on them in a large cloud.
    As noted in the comment in the code, any out of tree filters
    that rely on using the info_cache or security_groups from the
    instance are now going to be hit with a lazy-load penalty
    per instance, but we have no contract on out of tree filters
    so if this happens, the people maintaining said filters can
    (1) live with it (2) fork the HostManager code or (3) upstream
    their filter so it's in-tree.

    A more impactful change would be to refactor
    HostManager._get_host_states to bulk query the instances on
    the given set of compute nodes in a single query per cell. But
    that is left for a later change.

    Change-Id: Iccefbfdfa578515a004ef6ac718bac1a49d5c5fd
    Partial-Bug: #1737465
    (cherry picked from commit a1a335f19a986a593487a802d7b9448f7efc5bab)

tags: added: in-stable-queens

Reviewed: https://review.openstack.org/569247
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=91f5af7ee7f7140eafb7237875f6cd6ea1abcd38
Submitter: Zuul
Branch: master

commit 91f5af7ee7f7140eafb7237875f6cd6ea1abcd38
Author: Matt Riedemann <email address hidden>
Date: Thu May 17 16:06:39 2018 -0400

    Trim the fat on HostState.instances

    The only in-tree filters that rely on HostState.instances
    are the affinity filters (and one of the weighers). And they
    don't even look at the values in the HostState.instances dict,
    just the keys (which are the instance uuids for the instances
    on the host).

    So rather than pull the full instance objects, we can just get
    the list of instance uuids off the host and fake out the object.

    Custom filters/weighers will still be able to lazy-load fields
    on the Instance objects in HostState.instances if needed, but
    it will mean a performance penalty due to the round trip to the
    database per instance, per host. Out of tree filters/weighers
    are encouraged to be contributed upstream.

    Related to blueprint put-host-manager-instance-info-on-a-diet

    Related-Bug: #1737465

    Change-Id: I766bb5645e3b598468d092fb9e4f18e720617c52
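The "trim the fat" approach from this commit can be sketched like so: rather than pulling full instance objects, build a dict of lightweight placeholders keyed by uuid. Names here are illustrative only:

```python
# Sketch of "trim the fat": the affinity filters only look at the keys of
# HostState.instances, so placeholder values are enough. FakeInstance and
# get_instances_for_filters are illustrative names, not nova code.

class FakeInstance:
    """Placeholder carrying only the uuid; in nova, other fields on such
    an object would lazy-load from the database on access."""

    def __init__(self, uuid):
        self.uuid = uuid


def get_instances_for_filters(instance_uuids):
    # Only the uuids are queried from the cell DB; no full objects.
    return {u: FakeInstance(u) for u in instance_uuids}


instances = get_instances_for_filters(["uuid-1", "uuid-2"])

# The filters' membership test still works unchanged:
on_host = "uuid-1" in instances
```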

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/571928

Matt Riedemann (mriedem) wrote :

I've pushed some related changes for this bug and am removing myself as assignee, since I think we still have some known performance issues during scheduling, specifically around this method that gets all host states from all cells and, for each compute (host state), gets the aggregates and instances from each cell DB per host:

https://github.com/openstack/nova/blob/b93b40c6c01a1161f40592d29353c3461669de19/nova/scheduler/host_manager.py#L704

Changed in nova:
assignee: Matt Riedemann (mriedem) → nobody
status: In Progress → Confirmed
Matt Riedemann (mriedem) wrote :

Some related changes:

1. Jay Pipes' patch https://review.opendev.org/#/c/623558/ to redo the host mapping / instances query to go from O(N) to O(1) which CERN is using in production.

2. Surya's patch https://review.opendev.org/#/c/635532/ to pre-filter cells based on the allocation candidates (hosts) returned from placement.

Related fix proposed to branch: master
Review: https://review.opendev.org/663388

Reviewed: https://review.opendev.org/663387
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=426a303ee4cb848e75b9ed2acc498479c651c539
Submitter: Zuul
Branch: master

commit 426a303ee4cb848e75b9ed2acc498479c651c539
Author: Matt Riedemann <email address hidden>
Date: Wed Jun 5 12:54:45 2019 -0400

    Convert HostMapping.cells to a dict

    This changes the HostMapping.cells list to a dict,
    keyed by cell uuid, to each CellMapping object.
    This is going to be used in a later change [1] where
    we will create a cached mapping of host to cell uuid
    so we can go quickly from compute node host value to
    get the CellMapping without having to lookup the
    HostMapping again.

    [1] Ic6b1edfad2e384eb32c6942edc522ee301123cbc

    Change-Id: I75e5bf2c45688f68eeca27404641f4c438b023c6
    Related-Bug: #1737465
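The data-structure change this commit describes, a list of cells replaced by a dict keyed by cell uuid, can be sketched in miniature (illustrative classes, not the nova objects):

```python
# Sketch of the HostMapping.cells change: a dict keyed by cell uuid gives
# O(1) lookup instead of a linear scan over a list. Illustrative only.

class CellMapping:
    def __init__(self, uuid, name):
        self.uuid = uuid
        self.name = name


cell_list = [
    CellMapping("cell1-uuid", "cell1"),
    CellMapping("cell2-uuid", "cell2"),
]

# Before: finding a cell by uuid means scanning the list.
found = next(c for c in cell_list if c.uuid == "cell2-uuid")

# After: key the cells by uuid once, then look them up directly.
cells = {c.uuid: c for c in cell_list}
same = cells["cell2-uuid"]
```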

Reviewed: https://review.opendev.org/663388
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3a69fdb333e80f5ad2e025a9823b11483fad55c2
Submitter: Zuul
Branch: master

commit 3a69fdb333e80f5ad2e025a9823b11483fad55c2
Author: Matt Riedemann <email address hidden>
Date: Wed Jun 5 13:30:25 2019 -0400

    Cache host to cell mapping in HostManager

    If the instances per host are not cached in the HostManager
    we lookup the HostMapping per candidate compute node during
    each scheduling request to get the CellMapping so we can target
    that cell database to pull the instance uuids on the given host.

    For example, if placement returned 20 compute node allocation
    candidates and we don't have the instances cached for any of those,
    we'll do 20 queries to the API DB to get host mappings.

    We can improve this by caching the host to cell uuid after the first
    lookup for a given host and then after that, get the CellMapping
    from the cells cache (which is a dict, keyed by cell uuid, to the
    CellMapping for that cell).

    Change-Id: Ic6b1edfad2e384eb32c6942edc522ee301123cbc
    Related-Bug: #1737465
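The caching scheme the commit describes can be sketched as follows, with hypothetical helper names standing in for the HostManager internals:

```python
# Sketch of the host-to-cell cache: after the first HostMapping lookup for
# a host, remember its cell uuid and resolve later requests from the cells
# dict without touching the API DB. Names here are illustrative.

api_db_lookups = []


def lookup_host_mapping(host):
    # Stand-in for the per-host API DB HostMapping query.
    api_db_lookups.append(host)
    return {"cn1": "cell1-uuid", "cn2": "cell1-uuid"}[host]


cells = {"cell1-uuid": {"name": "cell1"}}  # the cells cache, keyed by uuid
host_to_cell_uuid = {}  # the new host -> cell uuid cache


def get_cell_for_host(host):
    if host not in host_to_cell_uuid:
        host_to_cell_uuid[host] = lookup_host_mapping(host)
    return cells[host_to_cell_uuid[host]]


get_cell_for_host("cn1")
get_cell_for_host("cn1")  # served from the cache, no second DB lookup
```

In the commit's example of 20 allocation candidates, this turns 20 API DB queries per scheduling request into at most one per host over the cache's lifetime.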

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/676257

Reviewed: https://review.opendev.org/676256
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=30552298410605ef5710fd564f5e46e5b0df0597
Submitter: Zuul
Branch: stable/stein

commit 30552298410605ef5710fd564f5e46e5b0df0597
Author: Matt Riedemann <email address hidden>
Date: Wed Jun 5 12:54:45 2019 -0400

    Convert HostMapping.cells to a dict

    This changes the HostMapping.cells list to a dict,
    keyed by cell uuid, to each CellMapping object.
    This is going to be used in a later change [1] where
    we will create a cached mapping of host to cell uuid
    so we can go quickly from compute node host value to
    get the CellMapping without having to lookup the
    HostMapping again.

    [1] Ic6b1edfad2e384eb32c6942edc522ee301123cbc

    Change-Id: I75e5bf2c45688f68eeca27404641f4c438b023c6
    Related-Bug: #1737465
    (cherry picked from commit 426a303ee4cb848e75b9ed2acc498479c651c539)

tags: added: in-stable-stein

Reviewed: https://review.opendev.org/676257
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=677c95280b21eb6628a7758f173cb98c4e3572c2
Submitter: Zuul
Branch: stable/stein

commit 677c95280b21eb6628a7758f173cb98c4e3572c2
Author: Matt Riedemann <email address hidden>
Date: Wed Jun 5 13:30:25 2019 -0400

    Cache host to cell mapping in HostManager

    If the instances per host are not cached in the HostManager
    we lookup the HostMapping per candidate compute node during
    each scheduling request to get the CellMapping so we can target
    that cell database to pull the instance uuids on the given host.

    For example, if placement returned 20 compute node allocation
    candidates and we don't have the instances cached for any of those,
    we'll do 20 queries to the API DB to get host mappings.

    We can improve this by caching the host to cell uuid after the first
    lookup for a given host and then after that, get the CellMapping
    from the cells cache (which is a dict, keyed by cell uuid, to the
    CellMapping for that cell).

    Change-Id: Ic6b1edfad2e384eb32c6942edc522ee301123cbc
    Related-Bug: #1737465
    (cherry picked from commit 3a69fdb333e80f5ad2e025a9823b11483fad55c2)
