2020-04-28 12:40:22
Sylvain Bauza
description
If a flavor asks for resources that are provided by nested Resource Provider inventories (e.g. VGPU) and the user wants multi-create (i.e. say --max 2), then the scheduler can return a NoValidHost exception even if each nested Resource Provider could support at least one instance, if the total wanted capacity cannot be served by a single nested RP.
For example, if two child RPs each have an inventory of 4 VGPU:
- you can ask for a flavor with 2 VGPU with --max 2
- but you can't ask for a flavor with 4 VGPU and --max 2
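The capacity math above can be sketched with a toy model (hypothetical names, not nova or placement code): two child RPs each expose 4 VGPU, and an instance's claim succeeds only if the chosen child RP still has enough room.

```python
# Toy model of nested-RP capacity (illustrative only, not nova code).
# Two child resource providers, each with an inventory of 4 VGPU.
inventory = {"child_rp_1": 4, "child_rp_2": 4}

def claim(rp, vgpus):
    """Claim vgpus from one child RP; fail if it lacks capacity."""
    if inventory[rp] < vgpus:
        return False
    inventory[rp] -= vgpus
    return True

# Flavor with 2 VGPU, --max 2: both instances fit on one child RP.
assert claim("child_rp_1", 2) and claim("child_rp_1", 2)

# Flavor with 4 VGPU, --max 2: a second claim on the same child fails...
inventory = {"child_rp_1": 4, "child_rp_2": 4}
assert claim("child_rp_1", 4)
assert not claim("child_rp_1", 4)  # same child RP is already exhausted
# ...but spreading the two instances across both children would succeed.
assert claim("child_rp_2", 4)
```

So each child RP can host one 4-VGPU instance, and only a scheduler that spreads the instances across the children can satisfy --max 2.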
======
Original report:
When booting more than one instance with an accelerator, with the accelerators on a single compute node, there are two problems:
One problem is that we always take the first item (alloc_reqs[0]) in alloc_reqs, so when we iterate to the second instance, putting its allocations throws a conflict exception.
The other is that we always take the first item in alloc_reqs_by_rp_uuid.get(selected_host.uuid), so selected_alloc_req never varies, which makes the values in selections_to_return identical. That is not correct for subsequent operations.
For more details, see: https://etherpad.opendev.org/p/filter_scheduler_issue_with_accelerators
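The second problem can be illustrated with a simplified sketch (hypothetical data shapes, not the actual filter scheduler code): when the scheduler always takes element 0 of a host's candidate list, every instance in a multi-create request receives the very same allocation request, while choosing among the candidates lets instances land on different child RPs.

```python
import random

# Simplified sketch of the reported pattern (not actual nova code).
# Each host UUID maps to the list of allocation candidates that
# placement returned for that host.
alloc_reqs_by_rp_uuid = {
    "host_uuid": [
        {"allocations": {"child_rp_1": {"VGPU": 4}}},
        {"allocations": {"child_rp_2": {"VGPU": 4}}},
    ],
}

# Buggy pattern: indexing [0] hands every instance the identical
# allocation request, so the second claim conflicts on child_rp_1.
buggy = [alloc_reqs_by_rp_uuid["host_uuid"][0] for _ in range(2)]
assert buggy[0] is buggy[1]

# One possible remedy (illustrative): pick among the host's candidates
# instead of always the first, so instances can spread across child RPs.
random.seed(0)  # seeded only to keep this sketch deterministic
varied = [random.choice(alloc_reqs_by_rp_uuid["host_uuid"])
          for _ in range(2)]
```

The point is only that a fixed alloc_reqs[0] choice cannot exploit the second child RP's capacity, whatever selection strategy an actual fix uses.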