Scheduler AZ filter should use cached values of the host-to-AZ mapping
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Fix Released | Undecided | Phil Day | 2013.2 |
Bug Description
If the scheduler is creating multiple instances in the same request, it will re-run each filter for each host for each instance.
In the case of the AZ filter, this currently includes a DB lookup to get the AZ value from the aggregate, although this value isn't going to change on successive runs.
Where there are a lot of hosts, each run of the filter can take several seconds. Repeating this, say, 100 times (to create 100 instances), the total time taken by the scheduler can exceed the service timeout interval, and so hosts start getting dropped by the compute filter because it looks as if the last service update (cached by the scheduler at the start of the run) has now expired.
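As a toy illustration (not nova code, and the 10 ms query time is an assumption), the problem is that a DB round trip inside the filter body multiplies: N instances across M hosts means N * M queries, even though a host's AZ cannot change within one request.

```python
import time

def lookup_az_from_db(host):
    time.sleep(0.01)          # stand-in for a ~10 ms aggregate-metadata query
    return "az1"

def schedule(num_instances, hosts, requested_az):
    for _ in range(num_instances):          # filters re-run per instance
        survivors = [h for h in hosts
                     if lookup_az_from_db(h) == requested_az]
    # 100 instances x 100 hosts -> 100 * 100 * 10 ms = 100 s of DB time
    return survivors
```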
There seem to be two possible solutions here:
i) Change the AZ filter to use a cached value (simple, since the availability zone class already has caching).
ii) Change the filter mechanism so that filters can be declared to run only once per scheduler request.
The second approach seems more general.
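Here is a hedged sketch of approach (ii), with names invented for illustration (nova's real mechanism may differ): filters opt in via a class-level flag, and the handler skips them after the first instance. Skipping is safe because the host list carried into later iterations already passed the filter on the first pass.

```python
class BaseHostFilter(object):
    run_once_per_request = False  # hypothetical opt-in flag

    def host_passes(self, host_state, filter_properties):
        raise NotImplementedError()


def schedule_request(filters, hosts, filter_properties, num_instances):
    chosen = []
    for index in range(num_instances):
        for f in filters:
            if f.run_once_per_request and index > 0:
                continue  # result cannot change within this request
            hosts = [h for h in hosts
                     if f.host_passes(h, filter_properties)]
        if hosts:
            chosen.append(hosts[0])  # simplified host selection
    return chosen
```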
Changed in nova:
assignee: nobody → Phil Day (philip-day)
tags: added: scheduler

Changed in nova:
status: New → In Progress

Changed in nova:
milestone: none → havana-2
status: Fix Committed → Fix Released

Changed in nova:
milestone: havana-2 → 2013.2
Phil,
You can implement a caching mechanism in the filters by using https://review.openstack.org/#/c/29343/12/nova/scheduler/filters/volume_affinity_filter.py as a guide. Filters get initialized for each RPC request coming into the scheduler, so that patch uses __init__ to help with caching. Furthermore, this is a common problem in many filters, so it may be worth abstracting out the caching code so other filters can re-use it.
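A minimal sketch of that pattern, with class and helper names of my own invention (the DB helper is a hypothetical stub): because the filter object is constructed once per scheduler RPC request, a dict initialized in __init__ memoizes the lookup across the many host_passes() calls made during that request.

```python
class CachedAvailabilityZoneFilter(object):
    def __init__(self):
        # host name -> availability zone; valid for this request only
        self._az_cache = {}

    def _lookup_az_from_db(self, host):
        # Hypothetical stand-in for the aggregate-metadata DB query.
        return "nova"

    def _host_az(self, host):
        if host not in self._az_cache:
            self._az_cache[host] = self._lookup_az_from_db(host)
        return self._az_cache[host]

    def host_passes(self, host_state, filter_properties):
        # Simplified: nova digs the requested AZ out of the request spec.
        requested = filter_properties.get('availability_zone')
        return requested is None or self._host_az(host_state.host) == requested
```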