Failure to power off a VM during delete leads to it going back to Active(None)

Bug #1254122 reported by Phil Day
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Phil Day

Bug Description

If libvirt fails to power off a VM during the shutdown called from within do_terminate_instance(), it raises exception.InstancePowerOffFailure. However, do_terminate_instance() doesn't catch this, so two (IMO bad) things happen:

i) The instance doesn’t go to an Error (Deleting) state
ii) @reverts_task_state sets task_state to None – putting the instance into Active(None)

This makes the user think the system has just ignored their request, so they repeat the delete, and repeat the delete ...
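
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not nova code: the Instance class, the simplified reverts_task_state decorator, and the failing terminate_instance body are stand-ins written for this example, assuming only what the description states (the decorator clears task_state on an uncaught exception).

```python
import functools


class InstancePowerOffFailure(Exception):
    """Stand-in for nova's exception.InstancePowerOffFailure."""


class Instance:
    """Hypothetical minimal instance record with the two state fields."""

    def __init__(self):
        self.vm_state = "active"
        self.task_state = "deleting"


def reverts_task_state(func):
    """Simplified stand-in: on any exception, clear task_state and re-raise."""

    @functools.wraps(func)
    def wrapper(instance, *args, **kwargs):
        try:
            return func(instance, *args, **kwargs)
        except Exception:
            instance.task_state = None
            raise

    return wrapper


@reverts_task_state
def terminate_instance(instance):
    # Assumed failure: the power-off step raises and nothing catches it here.
    raise InstancePowerOffFailure("power off failed")


def show(instance):
    """Render the state in the vm_state(task_state) form used in this report."""
    return f"{instance.vm_state}({instance.task_state})"


instance = Instance()
try:
    terminate_instance(instance)
except InstancePowerOffFailure:
    pass

print(show(instance))  # prints "active(None)"
```

Because nothing moves vm_state to an error value before the decorator clears task_state, the instance looks exactly as it did before the delete was requested.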

Proposed changes:
- Catch exception.InstancePowerOffFailure so that the instance goes to Error(Deleting)
- Remove the reverts_task_state decorator from terminate_instance. Delete is a non-reversible operation for the user, and many systems stop billing for instances once the user has indicated that they want to delete the instance (from that point on it is the system's problem to complete the delete as soon as possible). No failure during delete should set the instance back to Active(None).

Aditi Raveesh (aditirav)
Changed in nova:
assignee: nobody → Aditi Raveesh (aditirav)
Phil Day (philip-day) wrote :

Sorry Aditi - I already have a fix proposed for this: https://review.openstack.org/#/c/58829/

somehow it didn't get linked back to the bug

Changed in nova:
assignee: Aditi Raveesh (aditirav) → Phil Day (philip-day)
Aditi Raveesh (aditirav)
Changed in nova:
status: New → Confirmed
Phil Day (philip-day)
Changed in nova:
status: Confirmed → In Progress
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/58829
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=63381a15ac3c2e36f5521f8a77108664f89bfab5
Submitter: Jenkins
Branch: master

commit 63381a15ac3c2e36f5521f8a77108664f89bfab5
Author: Phil Day <email address hidden>
Date: Wed Nov 27 17:31:23 2013 +0000

    Failure during termination should always leave state as Error(Deleting)

    Delete is a non-reversible operation for the user, and once the
    user has indicated that they want to delete the instance from that
    point on is the system's problem to complete the delete as soon as
    possible.

    If anything fails during the delete that the system cannot recover
    from then the instance should be left in an Error(Deleting) state.
    Anything else, in particular reverting to an Active(None) state, makes
    it look like the system has ignored the request.

    Currently InstanceTerminationFailure is explicitly caught, but the
    exception does not propagate so the instance_fault wrapper does not
    get a chance to log the failure. Also terminate_instance is wrapped
    by reverts_task_state which resets the state to Active(None)

    This change removes the revert_task_state wrapper and catches all
    exceptions so that unhandled exceptions always leave the instance
    in Error(Deleting)

    Change-Id: I5fb1bbd56035792f566a6e076edfe7a69df006ef
    Closes-Bug: 1254122
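
The approach in the commit message can be sketched roughly as follows. This is an illustration only, not the merged nova change: Instance, wrap_instance_fault, shutdown_instance, and terminate_instance_fixed are hypothetical names, and the fault wrapper here simply records the exception in a list. The sketch assumes the three points the message makes: no task-state revert, every exception leaves the instance in Error(Deleting), and the exception still propagates so a fault wrapper can log it.

```python
import functools


class Instance:
    """Hypothetical minimal instance record with the two state fields."""

    def __init__(self):
        self.vm_state = "active"
        self.task_state = "deleting"


# Hypothetical fault log; stands in for whatever the fault wrapper records.
faults = []


def wrap_instance_fault(func):
    """Simplified stand-in: record the failure, then let it propagate."""

    @functools.wraps(func)
    def wrapper(instance):
        try:
            return func(instance)
        except Exception as exc:
            faults.append(repr(exc))
            raise

    return wrapper


def shutdown_instance(instance):
    # Assumed failure during delete, e.g. the hypervisor cannot power off.
    raise RuntimeError("power off failed")


@wrap_instance_fault
def terminate_instance_fixed(instance):
    try:
        shutdown_instance(instance)
    except Exception:
        # Leave task_state as "deleting" and mark the instance as errored,
        # then re-raise so the outer wrapper sees the failure.
        instance.vm_state = "error"
        raise
    instance.vm_state = "deleted"
    instance.task_state = None


instance = Instance()
try:
    terminate_instance_fixed(instance)
except RuntimeError:
    pass

print(f"{instance.vm_state}({instance.task_state})")  # prints "error(deleting)"
```

Re-raising after setting the error state is the design point: the state change and the fault record both happen, and the user sees an instance that is visibly stuck in deletion rather than one that appears untouched.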

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-3
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-3 → 2014.1