some exceptions raised in terminate_instance() wedge the instance in the 'deleting' state

Bug #1212420 reported by Māris Fogels
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Māris Fogels

Bug Description

Some exceptions raised by _delete_instance() in nova-compute's manager cause terminate_instance() to leave the instance's task state set to 'deleting'. While that task entry is set, _sync_power_states() skips the instance, so nova-compute no longer notices its power state transitions and the virtual machine stays wedged in the 'deleting' state.

This bug affects the nova-compute manager module: https://github.com/openstack/nova/blob/master/nova/compute/manager.py

Bug 1177584 may be an example of this in action:

 * terminate_instance() calls the baremetal driver to destroy an instance
 * a slow system call in the baremetal driver.destroy() operation raises an InstancePowerOffFailure exception
 * the exception bubbles back up to terminate_instance(), which ignores it
 * the instance is left in power_state.RUNNING, vm_state.ACTIVE, task_state.DELETING
 * _sync_power_states() ignores further power state changes because of the active task

This could happen with any virtual machine driver that raises an exception during a call to its driver.destroy() method.
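
The sequence above can be modelled with a minimal sketch. This is not nova's code: `Instance`, `FailingDriver`, `sync_power_state()`, and the function bodies are hypothetical stand-ins written for illustration. Only the names terminate_instance(), driver.destroy(), and InstancePowerOffFailure come from this report.

```python
class InstancePowerOffFailure(Exception):
    """Stand-in for the nova exception of the same name."""


class Instance:
    """Hypothetical, reduced instance record."""

    def __init__(self):
        self.power_state = "RUNNING"
        self.vm_state = "ACTIVE"
        self.task_state = None


class FailingDriver:
    """Hypothetical driver whose destroy() always fails."""

    def destroy(self, instance):
        raise InstancePowerOffFailure("power off timed out")


def terminate_instance(driver, instance):
    instance.task_state = "DELETING"
    try:
        driver.destroy(instance)
    except InstancePowerOffFailure:
        # Modelled behaviour from the report: the exception is ignored
        # and task_state is left as it was.
        return
    instance.vm_state = "DELETED"
    instance.task_state = None


def sync_power_state(instance, observed_power_state):
    # Modelled behaviour: a pending task blocks the power state sync.
    if instance.task_state is not None:
        return False
    instance.power_state = observed_power_state
    return True


instance = Instance()
terminate_instance(FailingDriver(), instance)
synced = sync_power_state(instance, "SHUTDOWN")
print(instance.power_state, instance.vm_state, instance.task_state, synced)
```

In this model the instance ends up with the same combination of states listed above, and the later sync call is skipped because the task is still set.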

It's worth noting that keeping the task as-is after exceptions in terminate_instance() was done on purpose to resolve bug 1046236 (see the terminate_instance() source for details).

Reproducing this bug is difficult: terminate_instance() handles some exceptions and not others, and the same applies to the try/except blocks in _delete_instance() and _shutdown_instance(). However, it is possible to reproduce this bug in a unit test that mocks out _shutdown_instance() and raises an InstancePowerOffFailure. A poisoned driver that injects an InstancePowerOffFailure would also work.
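
A rough sketch of the mocking approach, using only the standard library. `FakeComputeManager` is a hypothetical, heavily reduced stand-in for the real compute manager, and the dict-based instance is likewise an assumption. A real test would patch nova's own manager class and use nova's test fixtures.

```python
import unittest
from unittest import mock


class InstancePowerOffFailure(Exception):
    """Stand-in for the nova exception of the same name."""


class FakeComputeManager:
    """Hypothetical, reduced model of the compute manager."""

    def _shutdown_instance(self, instance):
        instance["power_state"] = "SHUTDOWN"

    def terminate_instance(self, instance):
        instance["task_state"] = "deleting"
        try:
            self._shutdown_instance(instance)
        except InstancePowerOffFailure:
            return  # modelled bug: error swallowed, task_state kept
        instance["task_state"] = None


class TerminateInstanceWedgeTest(unittest.TestCase):
    def test_power_off_failure_leaves_task_deleting(self):
        manager = FakeComputeManager()
        instance = {"power_state": "RUNNING", "task_state": None}
        # Replace _shutdown_instance() with a mock that raises the failure.
        with mock.patch.object(
            manager,
            "_shutdown_instance",
            side_effect=InstancePowerOffFailure("injected"),
        ):
            manager.terminate_instance(instance)
        self.assertEqual("deleting", instance["task_state"])
        self.assertEqual("RUNNING", instance["power_state"])


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TerminateInstanceWedgeTest)
    return unittest.TextTestRunner().run(suite)
```

Calling run_tests() executes the single test. It passes here only because the fake manager reproduces the wedge, so against a fixed manager the task-state assertion would need to be inverted.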

Reported against nova master: commit 8fb450fb3aa033d42c5dddb907392efd70f54a6b

Māris Fogels (mars)
description: updated
description: updated
Matt Riedemann (mriedem)
tags: added: compute
aeva black (tenbrae)
tags: added: baremetal
Māris Fogels (mars)
Changed in nova:
assignee: nobody → Māris Fogels (mars)
Māris Fogels (mars)
Changed in nova:
status: New → In Progress
Māris Fogels (mars) wrote :

A fix for this was reviewed in https://review.openstack.org/43528 and landed in https://github.com/openstack/nova/commit/1e8de59d250eb8374f977e8008386abe9e7ea3db.

I'm marking this 'Fix Committed' for now. Waiting for feedback on this fix from bug 1177584.

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → 2013.2