‘limit’ in allocation_candidates where sometimes make force_hosts invalid

Bug #1777591 reported by xulei
This bug affects 1 person
Affects                    Status          Importance   Assigned to   Milestone
OpenStack Compute (nova)   Fix Released    High         xulei
Queens                     Fix Committed   High         xulei

Bug Description

   The 'limit' parameter of the allocation_candidates API now defaults to 1000, which improves performance in large-scale environments. However, when creating a VM/BM with force_hosts, the 'limit' parameter can cut some nodes out of the allocation candidates, so scheduling sometimes fails with 'No hosts matched due to not matching...'
Example:
   test environment with 10 compute nodes, set max_placement_results = 3
   nova boot test --image 9c09cb52-03b9-4631-898d-d443d0dbbf9e --flavor c1 --nic none --availability-zone nova:devstack
   No hosts matched due to not matching 'force_hosts' value of 'devstack'

   Debug:
   returned provider_summaries:
   {u'268a3d69-6cf1-418a-aaa8-f2127f4f4468':...,u'a2c3e9e7-53a6-4e15-b150-39bb4135c6a9':...u'0aa80b5e-a0fa-47a6-a4b5-51b21b721ce9':...}
   The node devstack:69d2fe55-e391-4d99-a1fe-8b0b5aad60e7 is not in provider_summaries.
   I think that in a large-scale environment (more than 2000 compute nodes), the default max_placement_results=1000 will make force_hosts unusable.
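
The failure mode described above can be reproduced with a toy model. This is a minimal sketch, not nova or placement code: the function names, the node names, and the assumption that the forced host happens to fall outside the first `limit` results are all hypothetical.

```python
def allocation_candidates(providers, limit=None):
    """Return provider names, truncated to `limit` as a capped query would be."""
    return list(providers) if limit is None else list(providers[:limit])


def filter_forced_host(candidates, force_host):
    """Keep only the forced host; an empty result corresponds to 'No hosts matched'."""
    return [name for name in candidates if name == force_host]


# Hypothetical environment: ten compute nodes, with the forced one listed last.
providers = ["node%d" % i for i in range(9)] + ["devstack"]

# With a cap of 3 (like max_placement_results = 3) the forced host is cut out.
capped = filter_forced_host(allocation_candidates(providers, limit=3), "devstack")
# Without a cap the forced host is still among the candidates.
uncapped = filter_forced_host(allocation_candidates(providers), "devstack")

print(capped)    # []
print(uncapped)  # ['devstack']
```

The host has enough resources in both cases; it is only dropped because the candidate list was truncated before the force_hosts check ran.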

xulei (605423512-j) wrote :

The 'No hosts matched due to not matching ...' message in the logs confused me and should be corrected.

Changed in nova:
assignee: nobody → xulei (605423512-j)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/576693

tags: added: placement scheduler
Changed in nova:
status: New → In Progress
Chris Dent (cdent) wrote : Re: ‘limit’ in allocation_candidates where sometimes make fore_hosts invalid

I was going to suggest that one way to address this might be to use a nova-scheduler request filter to narrow the query space that is being looked at when querying placement. However, we don't currently support an `rp_uuid=in:{uuid},{uuid},{uuid}` parameter on GET /allocation_candidates.

As I recall 'force_hosts' has plenty of other problems too?

Matt Riedemann (mriedem) wrote :

We would have this same problem with rebuild if we need to go through the scheduler again (in case the image changes during rebuild). So yeah we likely need a pre-filter on this, and it was discussed in this spec:

https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html

See the "Dealing with rebuild" sections.

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → High
Eric Fried (efried) wrote :

Agreed in the scheduler meeting [1]:

In Stein, add a placement microversion to allow specifying provider UUID(s) to GET /allocation_candidates.

In Rocky, backportable, disable limit if force_host is set.

[1] http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-06-25-14.02.log.html#l-28
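
The Rocky approach agreed above (drop the limit when a host is forced) can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: `build_candidates_query` and its arguments are hypothetical names, and only the `resources` and `limit` query parameters of GET /allocation_candidates are modelled.

```python
from urllib.parse import urlencode


def build_candidates_query(resources, max_placement_results,
                           force_hosts=(), force_nodes=()):
    """Build a GET /allocation_candidates URL path (hypothetical helper)."""
    params = {
        "resources": ",".join(
            "%s:%d" % (rc, amount) for rc, amount in sorted(resources.items())
        )
    }
    # Only cap the result set when no host or node is being forced, so the
    # forced target cannot be truncated out of the candidates.
    if not (force_hosts or force_nodes):
        params["limit"] = max_placement_results
    return "/allocation_candidates?" + urlencode(params, safe=":,")


print(build_candidates_query({"VCPU": 1, "MEMORY_MB": 512}, 1000))
# /allocation_candidates?resources=MEMORY_MB:512,VCPU:1&limit=1000
print(build_candidates_query({"VCPU": 1}, 1000, force_hosts=["devstack"]))
# /allocation_candidates?resources=VCPU:1
```

The trade-off is that a forced request may return a large, uncapped result set, which is why the thread treats this as the backportable stopgap and a provider filter on the placement side as the longer-term fix.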

Jay Pipes (jaypipes)
summary: - ‘limit’ in allocation_candidates where sometimes make fore_hosts invalid
+ ‘limit’ in allocation_candidates where sometimes make force_hosts
+ invalid
Chris Dent (cdent) wrote :

To summarize some of the conversation related to the planned fix for stein, where something like `uuid=in:{uuid},{uuid}` would be added:

The challenge with this is determining which resource providers those uuids would match. In a nested or sharing scenario, we presumably don't want to require (since we may not even know them) that all the resource provider uuids in the collection of allocations for this one vm be represented in the `uuid` parameter.

We could make it so that the uuids are only "anchor" providers, which means in the "force_hosts" case the uuids of the resource providers representing the hosts being targeted would be listed in the parameter.

This makes some logical sense, but may be a nova-ism.

It's also not clear if the idea of an "anchor" or "target" is sufficiently clear in the database query code to be able to do the equivalent of "where anchor.uuid in(uuids_from_param)".
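
To make the "anchor" idea concrete, here is a toy model of the filter being discussed. It is only a sketch: the `Candidate` class, its fields, and the provider names are hypothetical, and it says nothing about whether the real database query code can express the same thing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """One allocation candidate in a toy model (not a placement object)."""
    anchor_uuid: str                 # provider standing in for the host
    provider_uuids: Tuple[str, ...]  # every provider contributing resources


def filter_by_anchor(candidates: List[Candidate],
                     uuids_from_param: List[str]) -> List[Candidate]:
    """Rough equivalent of "where anchor.uuid in (uuids_from_param)"."""
    wanted = set(uuids_from_param)
    return [c for c in candidates if c.anchor_uuid in wanted]


candidates = [
    Candidate("host-a", ("host-a", "host-a-numa0")),
    Candidate("host-b", ("host-b", "shared-storage")),
]

# Only the host's own UUID is supplied; nested and sharing providers are not.
matched = filter_by_anchor(candidates, ["host-b"])
print([c.anchor_uuid for c in matched])  # ['host-b']
```

The caller never has to name "shared-storage" or a nested child provider, which is the property the comment above is after.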

Changed in nova:
assignee: xulei (605423512-j) → Eric Fried (efried)
Changed in nova:
assignee: Eric Fried (efried) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → xulei (xulei)
assignee: xulei (xulei) → xulei (605423512-j)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/576693
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1d91811ad499af1d291f5c819ced5b1fdf3520c7
Submitter: Zuul
Branch: master

commit 1d91811ad499af1d291f5c819ced5b1fdf3520c7
Author: xulei <email address hidden>
Date: Wed Jun 20 13:15:46 2018 +0800

    Disable limits if force_hosts or force_nodes is set

    Setting max_placement_results will make force_host invalid sometimes,
    especially in large-scale environment.
    Disable limit param in GET /allocation_candidates if force_hosts
    or force_nodes is set.

    Change-Id: Iff1b49fe7e6347e3c2bb5992494b2450809719a2
    Closes-Bug: #1777591

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/584616

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b3

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/584616
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ab1fd87ed9563ce8cf865ff539916e5d804853dc
Submitter: Zuul
Branch: stable/queens

commit ab1fd87ed9563ce8cf865ff539916e5d804853dc
Author: xulei <email address hidden>
Date: Wed Jun 20 13:15:46 2018 +0800

    Disable limits if force_hosts or force_nodes is set

    Setting max_placement_results will make force_host invalid sometimes,
    especially in large-scale environment.
    Disable limit param in GET /allocation_candidates if force_hosts
    or force_nodes is set.

    NOTE(xulei): There are differences from the original change because
    I496e8d64907fdcb0e2da255725aed1fc529725f2 was not in stable/queens,
    so we transplant code to get_allocation_candidates in this backport.

    Change-Id: Iff1b49fe7e6347e3c2bb5992494b2450809719a2
    Closes-Bug: #1777591
    (cherry picked from commit 1d91811ad499af1d291f5c819ced5b1fdf3520c7)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.6

This issue was fixed in the openstack/nova 17.0.6 release.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/649535
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=575fd08e63119900969d1f6784034772c7ab450b
Submitter: Zuul
Branch: master

commit 575fd08e63119900969d1f6784034772c7ab450b
Author: Tetsuro Nakamura <email address hidden>
Date: Tue Apr 2 08:27:49 2019 +0000

    Query `in_tree` to placement

    This patch adds the translation of `RequestGroup.in_tree` to the
    actual placement query and bumps microversion to enable it.

    The release note for this change is added.

    Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
    Blueprint: use-placement-in-tree
    Related-Bug: #1777591
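
For readers unfamiliar with the related change, the sketch below shows roughly what adding an `in_tree` filter to the placement query might look like. It is an assumption-laden illustration, not the code from the commit: `candidates_query` is a hypothetical helper, and only the `resources`, `in_tree`, and `limit` query parameters are modelled. The UUID is the devstack node from the bug description.

```python
from urllib.parse import urlencode


def candidates_query(resources, in_tree=None, limit=None):
    """Build a GET /allocation_candidates URL path (hypothetical helper)."""
    params = [(
        "resources",
        ",".join("%s:%d" % (rc, amt) for rc, amt in sorted(resources.items())),
    )]
    if in_tree is not None:
        # Restrict candidates to the provider tree containing this provider.
        params.append(("in_tree", in_tree))
    if limit is not None:
        params.append(("limit", limit))
    return "/allocation_candidates?" + urlencode(params, safe=":,")


forced = "69d2fe55-e391-4d99-a1fe-8b0b5aad60e7"
print(candidates_query({"VCPU": 1}, in_tree=forced, limit=3))
# /allocation_candidates?resources=VCPU:1&in_tree=69d2fe55-e391-4d99-a1fe-8b0b5aad60e7&limit=3
```

With the filter applied on the placement side, the limit only truncates candidates within the requested tree, so it no longer needs to be disabled to keep a forced host in the results.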
