OpenStack Compute (nova)


Series queens

Comment 11 for bug 1719933


Chris Dent (cdent) wrote on 2018-07-26:


It turns out that allocation write contention on the resource provider generation is a relatively significant issue when launching many VMs to one or a very small number of compute nodes, as might happen in a clustered environment like the vmwareapi virt driver.

The retry handling at https://github.com/openstack/nova/blob/d687e7d29b37b3cdc9e1bc429dec3a01be298f80/nova/scheduler/client/report.py#L103-L123 is insufficient when something like 1200 VMs are being created, because there is always another VM being created concurrently for the same compute node.

Server-side retries will help because there will be less latency, but they won't fully fix it, as the architecture expects and assumes there will be at least a bit of horizontal diversity in resource providers.
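
To make the contention concrete, here is a rough toy model, not nova or placement code. It assumes a worst case in which every pending writer for a provider reads the generation before any of them writes, so only one write per round can succeed. The names Provider, put_allocation and simulate are hypothetical, and max_attempts is an illustrative parameter, not a statement about what the linked retry code does.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Hypothetical stand-in for a resource provider with a generation."""
    generation: int = 0
    allocations: int = 0

    def put_allocation(self, seen_generation: int) -> bool:
        """Accept the write only if the caller saw the current generation."""
        if seen_generation != self.generation:
            return False  # comparable to a generation conflict response
        self.generation += 1
        self.allocations += 1
        return True


def simulate(num_writers: int, num_providers: int, max_attempts: int) -> int:
    """Return how many writers give up after max_attempts tries.

    Assumed worst case: in each round, all pending writers for a provider
    read the same generation, so exactly one of them wins the write.
    """
    failed = 0
    for idx in range(num_providers):
        provider = Provider()
        # Writers are spread round-robin: writer w targets w % num_providers.
        writers = [w for w in range(num_writers) if w % num_providers == idx]
        attempts = {w: 0 for w in writers}
        while writers:
            seen = provider.generation  # everyone reads before anyone writes
            still_pending = []
            for w in writers:
                attempts[w] += 1
                if provider.put_allocation(seen):
                    continue  # this writer's allocation was accepted
                if attempts[w] >= max_attempts:
                    failed += 1  # retries exhausted
                else:
                    still_pending.append(w)
            writers = still_pending
    return failed


if __name__ == "__main__":
    for providers in (1, 300, 1200):
        print(providers, "providers:", simulate(1200, providers, 4), "failed")
```

Under these assumptions, a single provider admits only as many writers as there are attempts, while spreading the same 1200 writers over 300 or more providers lets every write land. Real traffic is less synchronized than this, so actual failure counts would differ; the sketch only shows why a small fixed retry budget cannot absorb heavy contention on one provider generation.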