schedulers should set instance to error state

Bug #886289 reported by Chris Behrens on 2011-11-04
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Kevin L. Mitchell
Milestone: 2012.1

Bug Description

A recent change made compute/api create the instance DB entry directly when zone routing is off, instead of waiting for the scheduler to do it. So now, if the scheduler raises NoValidHost or errors out in some other way, it needs to set the vm_state on the instance to ERROR. Otherwise the instance is left in BUILD state forever.

Something like this is needed. (Note: this is untested, and this example is for distributed_scheduler. There are probably other places where this needs to happen: other schedulers, etc.)

--- a/nova/scheduler/
+++ b/nova/scheduler/
@@ -36,6 +36,7 @@ from nova import log as logging
 from nova import rpc

 from nova.compute import api as compute_api
+from nova.compute import vm_states
 from nova.scheduler import api
 from nova.scheduler import driver
 from nova.scheduler import filters
@@ -99,6 +100,8 @@ class DistributedScheduler(driver.Scheduler):
                                         *args, **kwargs)

         if not weighted_hosts:
+            # A non-empty 'id' means compute/api already created the instance
+            if request_spec.get('id'):
+                db.instance_update(context, request_spec['id'], {'vm_state': vm_states.ERROR})
             raise driver.NoValidHost(_('No hosts were available'))
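
For anyone wanting to try the idea outside of nova, here is a minimal standalone sketch of the same pattern. It is not nova code: FakeDB, the schedule() function, and the 'error'/'building' state strings are hypothetical stand-ins for nova's db API, the scheduler method, and vm_states, and are assumptions made only for illustration.

```python
# Standalone sketch: mark the instance as ERROR before raising NoValidHost.
# All names here are hypothetical stand-ins, not nova's real modules.

VM_STATE_BUILDING = 'building'  # placeholder for the initial build state
VM_STATE_ERROR = 'error'        # placeholder for vm_states.ERROR


class NoValidHost(Exception):
    """Raised when no host can take the instance."""


class FakeDB(object):
    """In-memory stand-in for the instance table."""

    def __init__(self):
        self.instances = {}

    def instance_update(self, context, instance_id, values):
        # Merge the new column values into the stored instance record.
        self.instances[instance_id].update(values)


def schedule(db, context, request_spec, weighted_hosts):
    """Return the first weighted host, or flag the instance and raise."""
    if not weighted_hosts:
        # A non-empty 'id' means the instance row already exists.
        instance_id = request_spec.get('id')
        if instance_id:
            db.instance_update(context, instance_id,
                               {'vm_state': VM_STATE_ERROR})
        raise NoValidHost('No hosts were available')
    return weighted_hosts[0]
```

With an empty host list, the instance record ends up in the error state and the exception still propagates to the caller, so nothing stays stuck in the build state.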

Chris Behrens (cbehrens) on 2011-11-04
summary: - distributed_scheduler should set instance to error state
+ schedulers should set instance to error state
description: updated
Thierry Carrez (ttx) on 2011-11-08
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Changed in nova:
assignee: nobody → Kevin L. Mitchell (klmitch)

Submitter: Jenkins
Branch: master

 status fixcommitted

commit 21e08712d9ac5577c27e7ea4c9271372bc0bd3ed
Author: Kevin L. Mitchell <email address hidden>
Date: Mon Nov 21 14:39:22 2011 -0600

    Put instances in ERROR state when scheduler fails.

    When the scheduler's selected driver method raises an exception, such
    as NoValidHost, any affected instance must be placed into the ERROR
    state. This is done by catching exceptions raised in _schedule() and,
    if 'instance_id' is present in kwargs, moving the identified instance
    to the ERROR state. This fixes bug 886289.

    Change-Id: I5c73549e073493701b86658569823b9bc161291d
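
The approach described in the commit message (catch whatever the driver method raises, and move the instance to ERROR if 'instance_id' is in kwargs) can be sketched roughly as below. This is only an illustration of the pattern, not the committed code: call_driver_method(), the instance_update callback, and the 'error' state string are hypothetical names assumed for the example.

```python
# Sketch of the wrapper pattern from the commit message. Names are
# hypothetical; the real change lives in nova's scheduler code.

VM_STATE_ERROR = 'error'  # placeholder for vm_states.ERROR


class NoValidHostError(Exception):
    """Stand-in for the scheduler's NoValidHost exception."""


def call_driver_method(driver_method, instance_update, context,
                       *args, **kwargs):
    """Run driver_method; on any failure, flag the instance and re-raise."""
    try:
        return driver_method(context, *args, **kwargs)
    except Exception:
        # Only requests that carry an 'instance_id' have an instance to flag.
        instance_id = kwargs.get('instance_id')
        if instance_id is not None:
            instance_update(context, instance_id,
                            {'vm_state': VM_STATE_ERROR})
        # Re-raise so the caller still sees the original failure.
        raise
```

Handling this in one wrapper rather than in each driver covers every scheduler at once, which matches the concern in the bug description about other schedulers needing the same fix.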

Changed in nova:
status: Confirmed → Fix Committed
Thierry Carrez (ttx) on 2011-12-14
Changed in nova:
milestone: none → essex-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2012-04-05
Changed in nova:
milestone: essex-2 → 2012.1