Scheduler, performance impact when dealing with aggregates

Bug #1300775 reported by Sahid Orentino
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

During scheduling, if we use a filter that needs data from aggregates, as CoreFilterAggregate and RamFilterAggregate do, the filter retrieves metadata from the database for every host. This can create a performance impact when we have many hosts.
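For illustration only, a minimal sketch of the pattern described above (class and method names are simplified stand-ins, not the actual Nova code): an aggregate-based filter that looks up metadata per host issues one database round trip per candidate host, so a single scheduling request over N hosts means N queries.

    # Hedged sketch of the reported pattern (names simplified; not the real Nova code).

    class FakeDB(object):
        """Stand-in for the aggregate metadata table."""
        def __init__(self, metadata_by_host):
            self._metadata_by_host = metadata_by_host
            self.query_count = 0

        def aggregate_metadata_get_by_host(self, host, key):
            self.query_count += 1  # one round trip per host in the naive filter
            return self._metadata_by_host.get(host, {}).get(key)


    class NaiveAggregateCoreFilter(object):
        """Per-host DB lookup, as described in the bug report."""
        def __init__(self, db):
            self.db = db

        def host_passes(self, host, requested_vcpus, free_vcpus):
            ratio = self.db.aggregate_metadata_get_by_host(host, 'cpu_allocation_ratio')
            ratio = float(ratio) if ratio is not None else 16.0
            return requested_vcpus <= free_vcpus * ratio


    if __name__ == '__main__':
        db = FakeDB({'host-%d' % i: {'cpu_allocation_ratio': '2.0'} for i in range(1000)})
        f = NaiveAggregateCoreFilter(db)
        passing = [h for h in db._metadata_by_host if f.host_passes(h, 4, 8)]
        print('%d hosts pass, %d DB queries for one request' % (len(passing), db.query_count))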

Tags: scheduler
Changed in nova:
assignee: nobody → sahid (sahid-ferdjaoui)
description: updated
tags: added: scheduler
Changed in nova:
importance: Undecided → Medium
Revision history for this message
John Garbutt (johngarbutt) wrote :

Ideally, let's fix this in a way that means the caching scheduler would cache this DB query.

So that probably means putting it inside the host_manager, but see how that looks; it might be the wrong place / too deep.
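One way to read this suggestion, as a sketch only (the class and method names below are hypothetical, not Nova's): move the aggregate metadata lookup into the host-manager layer and memoize it, so N hosts cost one bulk query instead of N per-host queries, and a caching scheduler can keep the result across requests.

    # Sketch of per-request caching in a host-manager-like helper (hypothetical names).

    class CachingHostManager(object):
        def __init__(self, db):
            self.db = db
            self._aggregate_cache = None  # filled lazily, at most once per refresh

        def _load_aggregate_metadata(self):
            # Imagined single bulk query returning {host: {key: value}} for all
            # hosts, instead of one query per host.
            return self.db.aggregate_metadata_get_all()

        def aggregate_metadata_for_host(self, host, key):
            if self._aggregate_cache is None:
                self._aggregate_cache = self._load_aggregate_metadata()
            return self._aggregate_cache.get(host, {}).get(key)

        def refresh(self):
            # A caching scheduler could keep the cache across requests and only
            # drop it here when aggregates actually change.
            self._aggregate_cache = None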

Changed in nova:
status: New → Triaged
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

IMHO, we should first make use of the conductor for placing the call to the DB.

In the long term, that's something that could be cached thanks to the no-db scheduler (even if it's not HostState).
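A minimal sketch of that indirection, with hypothetical interface names (not the real conductor API): the filter talks to a conductor-style client rather than opening a DB connection itself, so the scheduler process needs no direct DB access.

    # Sketch only; names are illustrative, the real call would cross the message bus.

    class ConductorAPIStub(object):
        """Stand-in for an RPC client to a conductor-like service."""
        def __init__(self, db):
            self._db = db

        def aggregate_metadata_get_by_host(self, context, host, key=None):
            # The conductor service performs the query on the scheduler's behalf.
            return self._db.aggregate_metadata_get_by_host(host, key)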

Revision history for this message
John Garbutt (johngarbutt) wrote :

hmm, after the scheduler split, this should probably all come from NodeStats, add some location information in there... oh dear, I think we need the more complex fix here.

Revision history for this message
John Garbutt (johngarbutt) wrote :

So, the bigger fix: write aggregate info as a stat into the ComputeNode table, then use that to do the filtering.

This removes the DB query in the filter, and also ensures it works after the scheduler is removed from Nova.
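Roughly what that could look like on the filter side, as a sketch with made-up field names (not the final design): the compute node's stats already travel to the scheduler with the host state, so the filter reads aggregate-derived values from there and never touches the DB.

    # Sketch: filtering on aggregate data carried in the ComputeNode stats
    # (field names are illustrative, not the actual schema).

    class StatsAggregateCoreFilter(object):
        """Reads the allocation ratio from host_state stats instead of the DB."""

        def host_passes(self, host_state, requested_vcpus):
            ratio = float(host_state['stats'].get('cpu_allocation_ratio', 16.0))
            free = host_state['vcpus_total'] * ratio - host_state['vcpus_used']
            return requested_vcpus <= free


    if __name__ == '__main__':
        host_state = {'stats': {'cpu_allocation_ratio': '2.0'},
                      'vcpus_total': 8, 'vcpus_used': 10}
        # 8 * 2.0 - 10 = 6 free vCPUs, so a 4-vCPU request passes with no DB query.
        print(StatsAggregateCoreFilter().host_passes(host_state, 4))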

Revision history for this message
John Garbutt (johngarbutt) wrote :

Hmm, maybe this should go through the blueprint process, to review the direction... Not 100% sure.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

Well, as said on IRC, I agree that the ResourceTracker should record in ComputeNode which aggregates the host is in.
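A minimal sketch of that idea, under the assumption that the stats column accepts JSON text (all names below are illustrative; the real ResourceTracker update path is more involved): the tracker folds aggregate membership and metadata into the stats it already reports for the compute node.

    # Sketch: a resource-tracker-style helper that records aggregate membership
    # in the compute node stats (names are illustrative only).
    import json


    def build_compute_node_stats(host, aggregates):
        """aggregates: list of dicts like {'name': ..., 'hosts': [...], 'metadata': {...}}"""
        stats = {'aggregates': []}
        for agg in aggregates:
            if host in agg['hosts']:
                stats['aggregates'].append(agg['name'])
                # Flatten aggregate metadata into per-host stats the filters can read.
                for key, value in agg.get('metadata', {}).items():
                    stats[key] = value
        # Assumption: the stats field is stored as JSON-encoded text.
        return json.dumps(stats)


    if __name__ == '__main__':
        aggs = [{'name': 'ssd', 'hosts': ['node1'],
                 'metadata': {'cpu_allocation_ratio': '2.0'}}]
        print(build_compute_node_stats('node1', aggs))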

Revision history for this message
jiang, yunhong (yunhong-jiang) wrote :

I also prefer to keep this in the compute node, but I don't understand why it must be kept in the resource tracker after the split. I think even after the split, the scheduler may still need to support host aggregates, right?

Is anyone working on a BP for it?

Changed in nova:
assignee: sahid (sahid-ferdjaoui) → nobody
Revision history for this message
wingwj (wingwj) wrote :

I saw the discussion from the Summit here: https://etherpad.openstack.org/p/juno-nova-no-db-scheduler.

So if we add the compute cache in the scheduler, is this issue solved?

Changed in nova:
assignee: nobody → Sylvain Bauza (sylvain-bauza)
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

As there will be work during the Kilo release on how the scheduler gets updates from aggregates, I'm holding this bug until that work is merged.

Revision history for this message
Davanum Srinivas (DIMS) (dims-v) wrote :

Sylvain, if you are picking this up, please re-assign it to yourself. Thanks.

Changed in nova:
assignee: Sylvain Bauza (sylvain-bauza) → nobody
Sean Dague (sdague)
Changed in nova:
status: Triaged → Confirmed
Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

We can probably close this one:

  https://review.openstack.org/#/c/159904/

Changed in nova:
status: Confirmed → Fix Committed
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

I would rather mark it as Invalid now that the design has changed.

I'd really like to keep Fix Committed reserved for matching Gerrit changes.

Changed in nova:
status: Fix Committed → Invalid
Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Invalid is not the right status IMHO. It would probably make more sense to affiliate it with the related blueprint, since this bug was the start of thinking about the problem; then Won't Fix or Fix Released would seem better, I mean from a history point of view. But anyway...
