Nodepool favouring precise nodes over f20
Bug #1308407 reported by Derek Higgins
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Nodepool | Fix Committed | Undecided | Unassigned |
tripleo | Fix Released | Critical | Derek Higgins |
Bug Description
During low-demand times of the day, nodepool creates nodes in the ratio that was configured. For over half of the day, however, while demand on CI is high, it seems to favour precise instances over f20 instances. This causes problems because jobs remain in the zuul queue until all of the precise jobs have finished; only then are f20 instances created, which delays results (I've observed 10-hour delays in getting an f20 node).
This also causes jobs to be reported as "NOT_REGISTERED" when no f20 nodes are ACTIVE.
I've reproduced this locally; it only happens when we are bumping up against max-servers (or presumably the quota).
There is code in nodepool to allocate nodes based on the ratio they were configured for, but this doesn't happen correctly when there is demand for more than one server type and only one allocation is available.
In this scenario, as nodes become available (one at a time) and there is demand for more than one type, each new allocation is given to the first node type in the list (which in our case is the precise nodes). The only way we get a new f20 node created is if more than one allocation is freed at the same time; only then does nodepool move to the second node type in the list.
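The behaviour described above can be illustrated with a toy model. This is only a sketch of the symptom, not nodepool's actual allocation code: `grant_round`, `remaining` and `order` are hypothetical names, and the model assumes each pass simply walks a fixed list of node types from the front.

```python
def grant_round(remaining, order, slots):
    """Grant `slots` nodes in one pass, walking `order` from the front."""
    granted = []
    while slots > 0:
        progressed = False
        for label in order:
            if slots == 0:
                break
            if remaining[label] > 0:
                remaining[label] -= 1
                granted.append(label)
                slots -= 1
                progressed = True
        if not progressed:
            break  # no demand left for any type
    return granted


# Hypothetical demand: ten outstanding requests for each node type.
remaining = {"precise": 10, "f20": 10}
order = ["precise", "f20"]

# Five slots freed one at a time: every pass restarts at the front of the list.
one_at_a_time = [grant_round(remaining, order, 1)[0] for _ in range(5)]

# Two slots freed together: the pass reaches the second type.
pair = grant_round(remaining, order, 2)

print(one_at_a_time)
print(pair)
```

In this model f20 is only served when a single pass has more than one slot to hand out, which matches what I see when the provider is pinned at max-servers.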
I'm thinking that a weighted randomiser, which shuffles the list but favours requests in proportion to their demand, would be a better algorithm.
This would replace the sort that is currently in the code:
class AllocationProvider(object):
    def makeGrants(self):
        reqs.sort(lambda a, b: cmp(a.getPriority(), b.getPriority()))
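A rough sketch of what such a weighted randomiser could look like is below. It is not a patch against nodepool: `weighted_shuffle` and `weight_of` are hypothetical names, the demand figures are made up, and it is written for Python 3 (`random.choices`), whereas the quoted sort uses the Python 2 `cmp` style. It assumes every weight is greater than zero.

```python
import random


def weighted_shuffle(items, weight_of, rng=random):
    """Return items in random order; items with more weight tend to come first.

    Assumes weight_of(item) > 0 for every item.
    """
    pool = list(items)
    ordered = []
    while pool:
        weights = [weight_of(item) for item in pool]
        # Pick the index of the next item with probability proportional to weight.
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        ordered.append(pool.pop(pick))
    return ordered


# Hypothetical demand per node type, used directly as the weight.
demand = {"precise": 30, "f20": 10}
rng = random.Random(0)  # seeded only so the sketch is repeatable

# Simulate 1000 passes where a single slot is free and count who gets it.
first = [weighted_shuffle(demand, demand.get, rng)[0] for _ in range(1000)]
f20_first = first.count("f20")
print(f20_first)
```

With these weights f20 should end up at the front of the list in roughly a quarter of the passes, so a single freed allocation would no longer always go to precise.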