‘limit’ in allocation_candidates where sometimes make force_hosts invalid

Bug #1777591 reported by xulei on 2018-06-19
This bug affects 1 person
Affects: OpenStack Compute (nova) | Importance: High | Assigned to: xulei
Affects: Queens | Importance: High | Assigned to: xulei

Bug Description

   The 'limit' parameter of the allocation_candidates API currently defaults to 1000, which improves performance in large-scale environments. However, when a VM or bare-metal instance is created with force_hosts, the 'limit' parameter can cut the requested node out of the allocation candidates, and the scheduler then sometimes returns 'No hosts matched due to not matching...'
Example:
   test environment with 10 compute nodes, set max_placement_results = 3
   nova boot test --image 9c09cb52-03b9-4631-898d-d443d0dbbf9e --flavor c1 --nic none --availability-zone nova:devstack
   No hosts matched due to not matching 'force_hosts' value of 'devstack'

   Debug:
   return provider_summaries:
   {u'268a3d69-6cf1-418a-aaa8-f2127f4f4468':...,u'a2c3e9e7-53a6-4e15-b150-39bb4135c6a9':...u'0aa80b5e-a0fa-47a6-a4b5-51b21b721ce9':...}
   The node devstack (69d2fe55-e391-4d99-a1fe-8b0b5aad60e7) is not in provider_summaries.
   I think that in a large-scale environment (more than 2000 compute nodes), the default max_placement_results=1000 will make force_hosts unusable.
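
The failure mode described above can be reproduced in a toy model. This is only a sketch: the two functions are hypothetical stand-ins, not nova or placement code, and it assumes for illustration that the limit simply keeps the first N providers (the real selection order is not stated in this report).

```python
def get_allocation_candidates(providers, limit=None):
    """Hypothetical stand-in for placement: return at most `limit` providers."""
    return providers if limit is None else providers[:limit]


def filter_force_hosts(candidates, forced_host):
    """Hypothetical stand-in for the scheduler's force_hosts check."""
    return [p for p in candidates if p["host"] == forced_host]


# Ten made-up compute nodes; the forced host happens to come last.
providers = [{"host": f"node{i}"} for i in range(9)]
providers.append({"host": "devstack"})

# With a limit of 3 (like max_placement_results = 3) the forced host is cut.
limited = filter_force_hosts(get_allocation_candidates(providers, limit=3), "devstack")
# Without a limit the forced host survives.
unlimited = filter_force_hosts(get_allocation_candidates(providers), "devstack")

print(limited)    # no match, which corresponds to the "No hosts matched" error
print(unlimited)
```

The forced host only disappears when it falls outside the truncated result, which is consistent with the failure being intermittent.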

xulei (605423512-j) wrote :

The 'No hosts matched due to not matching ...' message in the logs confused me and should be corrected.

Changed in nova:
assignee: nobody → xulei (605423512-j)
tags: added: placement scheduler
Changed in nova:
status: New → In Progress

I was going to suggest that one way to address this might be to use a nova-scheduler request filter to narrow the query space that is being looked at when querying placement. However, we don't currently support an `rp_uuid=in:{uuid},{uuid},{uuid}` parameter on GET /allocation_candidates.

As I recall 'force_hosts' has plenty of other problems too?

Matt Riedemann (mriedem) wrote :

We would have this same problem with rebuild if we need to go through the scheduler again (in case the image changes during rebuild). So yeah we likely need a pre-filter on this, and it was discussed in this spec:

https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html

See the "Dealing with rebuild" sections.

Matt Riedemann (mriedem) on 2018-06-25
Changed in nova:
importance: Undecided → High
Eric Fried (efried) wrote :

Agreed in the scheduler meeting [1]:

In Stein, add a placement microversion to allow specifying provider UUID(s) to GET /allocation_candidates.

In Rocky, backportable, disable limit if force_host is set.

[1] http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-06-25-14.02.log.html#l-28
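
The Rocky approach agreed above (drop the limit when a host is forced) could look roughly like the following. This is a minimal sketch under assumptions: `build_allocation_candidates_params` is a hypothetical helper, not the function changed in nova, and the parameter layout is simplified.

```python
def build_allocation_candidates_params(resources, max_placement_results,
                                       force_hosts=None, force_nodes=None):
    """Build query parameters for GET /allocation_candidates (hypothetical helper).

    The 'limit' key is omitted when force_hosts or force_nodes is set, so the
    forced host cannot be truncated out of the result.
    """
    params = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        )
    }
    if not (force_hosts or force_nodes):
        params["limit"] = max_placement_results
    return params


normal = build_allocation_candidates_params({"VCPU": 1}, 1000)
forced = build_allocation_candidates_params({"VCPU": 1}, 1000,
                                            force_hosts=["devstack"])
print(normal)
print(forced)
```

The trade-off is that a forced request loses the performance benefit of the limit, which is why a placement-side filter was planned for a later release.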

Jay Pipes (jaypipes) on 2018-06-26
summary: - ‘limit’ in allocation_candidates where sometimes make fore_hosts invalid
+ ‘limit’ in allocation_candidates where sometimes make force_hosts
+ invalid
Chris Dent (cdent) wrote :

To summarize some of the conversation related to the planned fix for stein, where something like `uuid=in:{uuid},{uuid}` would be added:

The challenge with this is determining which resource providers those uuids would match. In a nested or sharing scenario, we presumably don't want to require (since we may not even know them) that all the resource provider uuids in the collection of allocations for this one vm be represented in the `uuid` parameter.

We could make it so that the uuids are only "anchor" providers, which means in the "force_hosts" case the uuids of the resource providers representing the hosts being targeted would be listed in the parameter.

This makes some logical sense, but may be a nova-ism.

It's also not clear if the idea of an "anchor" or "target" is sufficiently clear in the database query code to be able to do the equivalent of "where anchor.uuid in(uuids_from_param)".
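
To illustrate the "anchor" idea in memory rather than in SQL, here is a small sketch. The candidate layout (`anchor_uuid`, `provider_uuids`) and the short provider names are invented for illustration and do not reflect placement's actual data model.

```python
def filter_by_anchor(candidates, anchor_uuids):
    """Keep candidates whose anchor provider is one of the requested uuids.

    Nested or sharing providers inside a candidate are not checked, so the
    caller only needs to know the uuid of the host provider being targeted.
    """
    wanted = set(anchor_uuids)
    return [c for c in candidates if c["anchor_uuid"] in wanted]


# Made-up candidates: one with a nested provider, one using a sharing provider.
candidates = [
    {"anchor_uuid": "host-a", "provider_uuids": ["host-a", "host-a-numa0"]},
    {"anchor_uuid": "host-b", "provider_uuids": ["host-b", "shared-storage"]},
]

matched = filter_by_anchor(candidates, ["host-b"])
print(matched)
```

Here the request names only `host-b`, yet the matching candidate still includes the sharing provider it draws resources from.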

Changed in nova:
assignee: xulei (605423512-j) → Eric Fried (efried)
Changed in nova:
assignee: Eric Fried (efried) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem) on 2018-07-19
Changed in nova:
assignee: Matt Riedemann (mriedem) → xulei (xulei)
assignee: xulei (xulei) → xulei (605423512-j)

Reviewed: https://review.openstack.org/576693
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1d91811ad499af1d291f5c819ced5b1fdf3520c7
Submitter: Zuul
Branch: master

commit 1d91811ad499af1d291f5c819ced5b1fdf3520c7
Author: xulei <email address hidden>
Date: Wed Jun 20 13:15:46 2018 +0800

    Disable limits if force_hosts or force_nodes is set

    Setting max_placement_results will make force_host invaild sometimes,
    especially in large-scale enviroment.
    Disable limit param in GET /allocation_candidates if force_hosts
    or force_nodes is set.

    Change-Id: Iff1b49fe7e6347e3c2bb5992494b2450809719a2
    Closes-Bug: #1777591

Changed in nova:
status: In Progress → Fix Released

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

Reviewed: https://review.openstack.org/584616
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ab1fd87ed9563ce8cf865ff539916e5d804853dc
Submitter: Zuul
Branch: stable/queens

commit ab1fd87ed9563ce8cf865ff539916e5d804853dc
Author: xulei <email address hidden>
Date: Wed Jun 20 13:15:46 2018 +0800

    Disable limits if force_hosts or force_nodes is set

    Setting max_placement_results will make force_host invaild sometimes,
    especially in large-scale enviroment.
    Disable limit param in GET /allocation_candidates if force_hosts
    or force_nodes is set.

    NOTE(xulei): There are differences from the original change because
    I496e8d64907fdcb0e2da255725aed1fc529725f2 was not in stable/queens,
    so we transplant code to get_allocation_candidates in this backport.

    Change-Id: Iff1b49fe7e6347e3c2bb5992494b2450809719a2
    Closes-Bug: #1777591
    (cherry picked from commit 1d91811ad499af1d291f5c819ced5b1fdf3520c7)

This issue was fixed in the openstack/nova 17.0.6 release.

Reviewed: https://review.opendev.org/649535
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=575fd08e63119900969d1f6784034772c7ab450b
Submitter: Zuul
Branch: master

commit 575fd08e63119900969d1f6784034772c7ab450b
Author: Tetsuro Nakamura <email address hidden>
Date: Tue Apr 2 08:27:49 2019 +0000

    Query `in_tree` to placement

    This patch adds the translation of `RequestGroup.in_tree` to the
    actual placement query and bumps microversion to enable it.

    The release note for this change is added.

    Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
    Blueprint: use-placement-in-tree
    Related-Bug: #1777591
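
    As a rough illustration of what such a query might look like, the sketch below builds a request path with the Python standard library. It assumes the query parameter is named `in_tree`, as the commit title suggests; `allocation_candidates_url` is a hypothetical helper, not nova code, and the required microversion header is left out.

```python
from urllib.parse import parse_qs, urlencode


def allocation_candidates_url(resources, in_tree=None):
    """Build a GET /allocation_candidates path (hypothetical helper).

    When `in_tree` is given, the query asks only for candidates within the
    provider tree that contains that resource provider uuid.
    """
    params = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        )
    }
    if in_tree is not None:
        params["in_tree"] = in_tree
    return "/allocation_candidates?" + urlencode(params)


url = allocation_candidates_url(
    {"VCPU": 1, "MEMORY_MB": 512},
    in_tree="69d2fe55-e391-4d99-a1fe-8b0b5aad60e7",
)
print(url)
print(parse_qs(url.split("?", 1)[1]))
```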
