scheduler only sees compute resources freed on periodic updates

Bug #1194900 reported by Peter Feiner
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Peter Feiner

Bug Description

When individual instances are updated (e.g., during spawn and terminate), ResourceTracker (in nova.compute.resource_tracker) calls compute_node_update with values=self.compute_node. Since self.compute_node is an instance of ComputeNode that was retrieved from the database, it has updated_at set. Since updated_at is in values, sqlalchemy doesn't automatically change the record's updated_at column (see nova.openstack.common.db.sqlalchemy.models.TimestampMixin). Moreover, since updated_at is set to the last value's updated_at, updated_at effectively doesn't change until values without updated_at are sent, which only happens during the periodic task that calls ResourceTracker.update_available_resources.
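
The mechanism can be illustrated with a minimal sketch in plain Python. This is not Nova or SQLAlchemy code: FakeRow and its update method are hypothetical stand-ins, and the only behavior modeled is the assumption, taken from the description above, that the timestamp default applies only when the caller does not supply updated_at.

```python
import datetime


class FakeRow:
    """Hypothetical stand-in for a row with an on-update timestamp default."""

    def __init__(self, **columns):
        self.columns = dict(columns)

    def update(self, values, now):
        # The default fires only when the caller did not pass updated_at.
        if "updated_at" not in values:
            self.columns["updated_at"] = now
        self.columns.update(values)


t0 = datetime.datetime(2013, 6, 19, 21, 0, 0)
t1 = t0 + datetime.timedelta(seconds=5)
t2 = t0 + datetime.timedelta(seconds=60)

row = FakeRow(vcpus_used=0, updated_at=t0)

# Per-instance update: the whole record is sent back, including the
# updated_at value that was read from the database earlier.
values = dict(row.columns, vcpus_used=1)
row.update(values, now=t1)
stale = row.columns["updated_at"]  # still t0

# Periodic update: values carry no updated_at, so the default applies.
row.update({"vcpus_used": 1}, now=t2)
fresh = row.columns["updated_at"]  # now t2
```

In this sketch the first update changes vcpus_used but leaves the timestamp at t0, which is the stale state the description refers to.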

Nova-scheduler relies on ComputeNode.updated_at to keep its model of available resources up-to-date. In particular, nova-scheduler doesn't play a role in instance termination, so it doesn't account for freed resources until ComputeNode.updated_at changes. Thus, between nova-compute's periodic calls to ResourceTracker.update_available_resources, nova-scheduler's model of available resources monotonically decreases. If, for example, a node has resources for 10 instances, and you manage to boot 10, terminate 10, then attempt to boot another before the end of the period, nova-scheduler won't schedule the new instance on the vacant node.
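
The boot 10 / terminate 10 scenario can be walked through with a simplified model. HostModel, its methods, and the integer timestamps are hypothetical; the sketch assumes the scheduler discards any database record whose updated_at is not newer than its own last change, which is a simplification of the behavior described above rather than the actual scheduler code.

```python
class HostModel:
    """Hypothetical scheduler-side view of one compute node."""

    def __init__(self, free_slots, updated_at):
        self.free_slots = free_slots
        self.updated_at = updated_at

    def consume(self, now):
        # The scheduler accounts for an instance it just placed.
        self.free_slots -= 1
        self.updated_at = now

    def refresh(self, record):
        # Assumption: records that are not newer than the model are ignored.
        if record["updated_at"] <= self.updated_at:
            return False
        self.free_slots = record["free_slots"]
        self.updated_at = record["updated_at"]
        return True


record = {"free_slots": 10, "updated_at": 0}  # database row for the node
host = HostModel(free_slots=10, updated_at=0)

# Boot 10 instances; the row's updated_at stays at 0 because of the bug.
for tick in range(1, 11):
    host.consume(now=tick)
    record["free_slots"] -= 1

# Terminate all 10; resources are freed in the row, timestamp unchanged.
record["free_slots"] = 10

accepted_before = host.refresh(record)
slots_before = host.free_slots

# The periodic task finally bumps updated_at.
record["updated_at"] = 60
accepted_after = host.refresh(record)
slots_after = host.free_slots
```

Until the periodic update arrives, the model keeps reporting zero free slots even though the node is vacant.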

Note that f398b9e195cda582bad57396b097dec274384c07 fixed a separate issue (bug #1153778) related to ComputeNode.updated_at being stale.

Peter Feiner (pete5)
description: updated
Changed in nova:
assignee: nobody → Peter Feiner (pete5)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/33853
Committed: http://github.com/openstack/nova/commit/0ed62fb7affbda4a701c2175e95aa6f92038604c
Submitter: Jenkins
Branch: master

commit 0ed62fb7affbda4a701c2175e95aa6f92038604c
Author: Peter Feiner <email address hidden>
Date: Wed Jun 19 21:14:43 2013 +0000

    db.compute_node_update: ignore values['update_at']

    When individual instances are updated (e.g., during spawn and
    terminate), ResourceTracker (in nova.compute.resource_tracker) calls
    compute_node_update with values=self.compute_node. Since
    self.compute_node is an instance of ComputeNode that was retrieved
    from the database, it has updated_at set. Since updated_at is in
    values, sqlalchemy doesn't automatically change the record's
    updated_at column (see
    nova.openstack.common.db.sqlalchemy.models.TimestampMixin). Moreover,
    since updated_at is set to the last value's updated_at, updated_at
    effectively doesn't change until values without updated_at are sent,
    which only happens during the periodic task that calls
    ResourceTracker.update_available_resources.

    Nova-scheduler relies on ComputeNode.updated_at to keep its model of
    available resources up-to-date. In particular, nova-scheduler doesn't
    play a role in instance termination, so it doesn't account for freed
    resources until ComputeNode.updated_at changes. Thus, between
    nova-compute's periodic calls to
    ResourceTracker.update_available_resources, nova-scheduler's model of
    available resources monotonically decreases. If, for example, a node
    has resources for 10 instances, and you manage to boot 10, terminate
    10, then attempt to boot another before the end of the period,
    nova-scheduler won't schedule the new instance on the vacant node.

    Fixes bug #1194900.

    Note that f398b9e195cda582bad57396b097dec274384c07 fixed a separate
    issue (bug #1153778) related to ComputeNode.update_at being stale.

    Change-Id: Ifd1e56edfd811241816970715071876857de80d3
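
The idea in the commit title, ignoring a caller-supplied updated_at, might look roughly like the following sketch. It is not the merged patch: compute_node_update_sketch, the dict-based row, and the integer timestamp are all hypothetical, and the real function works on a database session.

```python
def compute_node_update_sketch(row, values, now):
    """Apply values to row, discarding any updated_at the caller passed."""
    # Drop the stale timestamp without mutating the caller's dict.
    changes = {k: v for k, v in values.items() if k != "updated_at"}
    row.update(changes)
    # Always record a fresh timestamp so readers see the change.
    row["updated_at"] = now
    return row


row = {"vcpus_used": 10, "updated_at": 0}
values = dict(row, vcpus_used=0)  # full record, stale updated_at included
compute_node_update_sketch(row, values, now=42)
```

With the stale value filtered out, every update moves the timestamp forward, so a reader keyed on updated_at sees freed resources without waiting for the periodic task.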

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-2 → 2013.2
Changed in nova:
importance: Undecided → High
tags: added: grizzly-backport-potential
Alan Pevec (apevec)
tags: removed: grizzly-backport-potential