trusted_filter cache does not work

Bug #1223450 reported by Bob Ball
This bug affects 2 people
Affects                   Status        Importance  Assigned to    Milestone
OpenStack Compute (nova)  Fix Released  Medium      Hans Lindgren

Bug Description

The cache in trusted_filter does not work because a new instance of TrustedFilter is made each time, which creates its own cache.

This means that each time a new instance is scheduled, the TrustedFilter will query the attestation service for all hosts - which is a very slow operation.
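
The effect described above can be reproduced with a minimal sketch. This is not the nova code: `FakeAttestationService`, `SketchFilter` and `schedule` are hypothetical names used only for illustration, and the assumption is simply that one filter object is built per scheduled instance and that the cache lives on that object.

```python
class FakeAttestationService:
    """Hypothetical stand-in for a slow attestation service; counts queries."""

    def __init__(self):
        self.queries = 0

    def trust_levels(self, hosts):
        # One (slow) round trip returns the trust level of every host.
        self.queries += 1
        return {host: "trusted" for host in hosts}


class SketchFilter:
    """Hypothetical filter that keeps its cache on the instance."""

    def __init__(self, service):
        self.service = service
        self.cache = {}  # starts empty for every new filter object

    def host_passes(self, host, all_hosts):
        if host not in self.cache:
            self.cache.update(self.service.trust_levels(all_hosts))
        return self.cache[host] == "trusted"


def schedule(filter_factory, hosts, num_instances):
    """Build a new filter for each scheduled instance (assumed behaviour)."""
    for _ in range(num_instances):
        flt = filter_factory()
        [h for h in hosts if flt.host_passes(h, hosts)]


service = FakeAttestationService()
hosts = ["host1", "host2"]
schedule(lambda: SketchFilter(service), hosts, num_instances=10)
print(service.queries)  # 10: one full query per scheduled instance
```

Within a single scheduling pass the cache does help (the second host is served from it), but nothing survives to the next pass, so the service is queried once per scheduled instance.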

Tags: scheduler
tags: added: scheduler
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Guangya Liu (Jay Lau) (jay-lau-513) wrote :
wanghong (w-wanghong)
Changed in nova:
assignee: nobody → wanghong (w-wanghong)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/64498

Changed in nova:
status: Confirmed → In Progress
wanghong (w-wanghong)
Changed in nova:
assignee: wanghong (w-wanghong) → nobody
status: In Progress → Invalid
Bob Ball (bob-ball) wrote :

Why has this been changed to Invalid?

I believe this is still valid unless the logic in how the scheduler creates filters has changed?

Bob Ball (bob-ball) wrote :

Sorry - I've now seen the comments on https://review.openstack.org/#/c/64498/ - however, I think the point stands; the cache does not work currently.

Bob Ball (bob-ball) wrote :

Set back to confirmed pending the output of the comments on the review; I just don't want to forget this bug at the moment and I think there is more discussion to be had.

Changed in nova:
status: Invalid → Confirmed
Dave McCowan (dave-mccowan) wrote :

I believe the comments indicate that the cache is working as designed. The cache saves the trust state of all of the hosts for the duration of scheduling a single VM. However, as Bob reports, the attestation server can take a long time to respond. I see delays around 10 seconds and I only have 2 hosts in my configuration. If I start 10 VMs at the same time, this adds at least 100 seconds to scheduling time. It would be a great performance boost if all VMs starting around the same time could share a cache.

jiang, yunhong (yunhong-jiang) wrote :

Dave, thanks for your input and your data.

Some thoughts:

a) I'm surprised to see 10 seconds for only 2 hosts. Which attestation server are you using?
b) I think with the local cache, it will not take 100 seconds to start VMs at the same time if they are started in a batch.
c) I agree that a global cache should help; I will comment on the patch.

Thanks
--jyh

Bob Ball (bob-ball) wrote :

I suggest we keep the discussions on here rather than the patch since the patch is abandoned and therefore email notifications etc. seem to be disabled.

a) There was a bug with the TrustedFilter implementation (I believe now fixed) that would add both the compute and domU on XenServer hosts to the list of hosts to ask Mt Wilson about; the response from Mt Wilson took a long time although I haven't been playing with the environment recently enough to comment more on the length of time.

b) Since each scheduled VM start has its own TrustedFilter, the cache is completely useless for multi-VM starts. The filter works on class names until it gets through to the 'get_filtered_objects' method, which is called once for each VM to be scheduled since we have filter_properties there; as such, each class is instantiated for each VM schedule. We _always_ get a new cache because the cache is on ComputeAttestation, which is initialised when TrustedFilter is initialised, and there are no global singletons.

Surely since the cache has a timeout it can very safely be global since we will never trust the cache contents once the timeout has been reached.
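
A minimal sketch of that idea, assuming a module-level dictionary and a fixed timeout. `CACHE_TIMEOUT`, `_shared_cache`, `get_trust` and the `fetch` callback are hypothetical names, not part of nova or of the attestation service API, and thread safety is deliberately left out.

```python
import time

CACHE_TIMEOUT = 60.0  # seconds; hypothetical value
_shared_cache = {}    # host -> (trust_level, fetched_at), shared by all callers


def get_trust(host, fetch, now=time.monotonic):
    """Return the cached trust level, refetching once the entry has expired."""
    entry = _shared_cache.get(host)
    if entry is not None and now() - entry[1] < CACHE_TIMEOUT:
        return entry[0]
    level = fetch(host)  # the slow call to the attestation service
    _shared_cache[host] = (level, now())
    return level


# Usage with a fake clock and a fake fetch function.
calls = []


def fake_fetch(host):
    calls.append(host)
    return "trusted"


clock = [0.0]
get_trust("host1", fake_fetch, now=lambda: clock[0])  # fetched
get_trust("host1", fake_fetch, now=lambda: clock[0])  # served from the cache
clock[0] = 61.0
get_trust("host1", fake_fetch, now=lambda: clock[0])  # expired, fetched again
print(len(calls))  # 2
```

Because an entry is never used past its timeout, sharing it between scheduling requests does not widen the window in which stale trust data could be used.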

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/132229

Changed in nova:
assignee: nobody → Hans Lindgren (hanlind)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/132229
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c126d36640e0398e76ba01783b7f21f01f53a5f5
Submitter: Jenkins
Branch: master

commit c126d36640e0398e76ba01783b7f21f01f53a5f5
Author: Hans Lindgren <email address hidden>
Date: Thu Oct 23 19:42:35 2014 +0200

    Make scheduler filters/weighers only load once

    Right now, filters/weighers are instantiated on every invocation of the
    scheduler. This is both time consuming and unnecessary. In cases where
    a filter/weigher tries to be smart and store/cache something in between
    invocations this actually prohibits that.

    This change make base filter/weigher functions take objects instead of
    classes and then let schedulers create objects only once and then reuse
    them.

    This fixes a known bug in trusted_filter that tries to cache things.

    Related to blueprint scheduler-optimization

    Change-Id: I3174ab7968b51c43c0711033bac5d4bc30938b95
    Closes-Bug: #1223450
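
The change from classes to objects described in the commit message can be illustrated roughly as follows. This is a sketch only and not the merged code: `CountingFilter`, `filter_with_classes` and `filter_with_objects` are hypothetical names chosen for the example.

```python
class CountingFilter:
    """Hypothetical filter that counts how often it is instantiated."""

    instances_created = 0

    def __init__(self):
        CountingFilter.instances_created += 1
        self.cache = {}  # lives only as long as this object does

    def host_passes(self, host):
        # Record the host in the per-object cache; always passes in this sketch.
        self.cache[host] = True
        return True


def filter_with_classes(filter_classes, hosts):
    """Old shape: a fresh filter object is created on every invocation."""
    for cls in filter_classes:
        flt = cls()
        hosts = [h for h in hosts if flt.host_passes(h)]
    return hosts


def filter_with_objects(filters, hosts):
    """New shape: callers pass filter objects that were created earlier."""
    for flt in filters:
        hosts = [h for h in hosts if flt.host_passes(h)]
    return hosts


hosts = ["host1", "host2"]

for _ in range(3):
    filter_with_classes([CountingFilter], hosts)
created_per_call = CountingFilter.instances_created

CountingFilter.instances_created = 0
loaded_filters = [CountingFilter()]  # created once and then reused
for _ in range(3):
    filter_with_objects(loaded_filters, hosts)
created_once = CountingFilter.instances_created

print(created_per_call, created_once)  # 3 1
```

With the object-based shape, anything a filter stores on itself between invocations is still there on the next one.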

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-1 → 2015.1.0