OpenStack Compute (nova)

Cannot delete an instance that failed a previous delete

Bug #1329559 reported by Joe Gordon on 2014-06-13

This bug affects 12 people

Affects	Status	Importance	Assigned to	Milestone
OpenStack Compute (nova)	Fix Released	Critical	Unassigned	OpenStack Compute (nova) 2014.2 "juno"
Icehouse	Fix Released	Undecided	Unassigned	OpenStack Compute (nova) 2014.1.2

Bug Description

Currently, if an instance fails to delete, we do not revert its state
as we do in most places; instead we set it to error,deleting. This was
done intentionally in https://review.openstack.org/#/c/58829/ . We also
intentionally ignore duplicate requests to delete an instance if it's
already being deleted (https://review.openstack.org/#/c/55444/). The
combination of these two things means that if an instance fails to
delete for some reason, a tenant is unable to delete that instance.

It turns out this is really bad because instances in deleting state
count against quota, so the tenant slowly loses usable quota.

To fix this, allow duplicate delete calls to go through if the instance
is in error state.
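
To make the interaction concrete, here is a minimal sketch of the duplicate-delete check and the change proposed above. It is not nova's actual code: the `Instance` class, the `should_ignore_delete` function, and the `allow_retry_on_error` flag are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

ERROR = "error"
ACTIVE = "active"
DELETING = "deleting"


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record, not nova's model."""
    vm_state: str = ACTIVE
    task_state: Optional[str] = None


def should_ignore_delete(instance: Instance, allow_retry_on_error: bool) -> bool:
    """Return True if a delete request would be dropped as a duplicate."""
    if instance.task_state != DELETING:
        return False  # no delete in progress, so the request goes through
    if allow_retry_on_error and instance.vm_state == ERROR:
        return False  # proposed change: a failed delete may be retried
    return True  # a delete is already in progress, so ignore the duplicate


# An instance whose delete failed and was left in error,deleting.
stuck = Instance(vm_state=ERROR, task_state=DELETING)
ignored_before = should_ignore_delete(stuck, allow_retry_on_error=False)
ignored_after = should_ignore_delete(stuck, allow_retry_on_error=True)
```

With the flag off, the stuck instance's delete request is dropped every time, which is the behavior reported here. With it on, the request is allowed through.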



Joe Gordon (jogo) wrote on 2014-06-13:

Marking as critical because this is hurting openstack-infra in a big way; they have many instances stuck in this state.

Changed in nova:
status:	New → Confirmed
importance:	Undecided → Critical


Joe Gordon (jogo) wrote on 2014-06-13:

Patch to fix it: https://review.openstack.org/#/c/99796/


Matt Riedemann (mriedem) wrote on 2014-06-16:

This was already reported in bug 1299139 with a lot more details, but that's fine since this has the patch.

tags:	added: api compute
tags:	added: icehouse-backport-potential


OpenStack Infra (hudson-openstack) wrote on 2014-06-17: Fix merged to nova (master)

Reviewed: https://review.openstack.org/99796
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f33a25a3c40722644c774395b38fd7a7ed0246e1
Submitter: Jenkins
Branch: master

commit f33a25a3c40722644c774395b38fd7a7ed0246e1
Author: Joe Gordon <email address hidden>
Date: Thu Jun 12 16:27:07 2014 -0700

Failure during termination should always leave state as error()

    Currently we have a situation where if an instance fails to delete,
    instead of having its state reverted, like we do in most places we set
    it to error,deleting. This was intentionally done in
    I5fb1bbd56035792f566a6e076edfe7a69df006ef. We also intentionally ignore
    duplicate requests to delete an instance if its already being deleted
    (I2f97f93bd714e0ea3b6d4fa3ac457ab43eed00e1). The combination of these two
    things means that if an instance fails to delete for some reason a
    tenant is unable to delete that instance.

    It turns out this is really bad because instances in deleting state
    count against quota, so the tenant slowly looses usable quota.

    To fix this, upon a failed termination set the vm_state to error and
    revert the task_state. This is a partial revert of
    I55742203bdd071c7df90902868e46c2020f799bd.

    Change-Id: I55742203bdd071c7df90902868e46c2020f799bd
    Closes-Bug: #1329559
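
The behavior described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from the merged patch: `Instance`, `terminate`, and the two `destroy` callables are hypothetical, and "reverting" the task_state is assumed here to mean resetting it to `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record, not nova's model."""
    vm_state: str = "active"
    task_state: Optional[str] = None


def terminate(instance: Instance, destroy: Callable[[Instance], None]) -> bool:
    """Try to delete an instance; on failure leave it in a retryable state."""
    instance.task_state = "deleting"
    try:
        destroy(instance)
    except Exception:
        # Failed termination: mark the instance as errored and revert the
        # task_state (assumed to mean None) so a later delete is not
        # treated as a duplicate of one still in progress.
        instance.vm_state = "error"
        instance.task_state = None
        return False
    instance.vm_state = "deleted"
    instance.task_state = None
    return True


def failing_destroy(instance: Instance) -> None:
    raise RuntimeError("simulated failure while destroying the instance")


def working_destroy(instance: Instance) -> None:
    pass  # pretend the backend removed the instance


inst = Instance()
first = terminate(inst, failing_destroy)
state_after_failure = (inst.vm_state, inst.task_state)
second = terminate(inst, working_destroy)
```

Because the failed attempt no longer leaves task_state set to deleting, the second delete request can proceed, and the instance stops counting against quota once it is gone.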

Changed in nova:
status:	Confirmed → Fix Committed


OpenStack Infra (hudson-openstack) wrote on 2014-06-17: Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/100469


OpenStack Infra (hudson-openstack) wrote on 2014-06-24: Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/100469
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2f40191d0efa783ca879338578097548ee8c84c0
Submitter: Jenkins
Branch: stable/icehouse

commit 2f40191d0efa783ca879338578097548ee8c84c0
Author: Joe Gordon <email address hidden>
Date: Thu Jun 12 16:27:07 2014 -0700

Failure during termination should always leave state as error()

    Currently we have a situation where if an instance fails to delete,
    instead of having its state reverted, like we do in most places we set
    it to error,deleting. This was intentionally done in
    I5fb1bbd56035792f566a6e076edfe7a69df006ef. We also intentionally ignore
    duplicate requests to delete an instance if its already being deleted
    (I2f97f93bd714e0ea3b6d4fa3ac457ab43eed00e1). The combination of these two
    things means that if an instance fails to delete for some reason a
    tenant is unable to delete that instance.

    It turns out this is really bad because instances in deleting state
    count against quota, so the tenant slowly looses usable quota.

    To fix this, upon a failed termination set the vm_state to error and
    revert the task_state. This is a partial revert of
    I55742203bdd071c7df90902868e46c2020f799bd.

    Change-Id: I55742203bdd071c7df90902868e46c2020f799bd
    Closes-Bug: #1329559
    (cherry picked from commit f33a25a3c40722644c774395b38fd7a7ed0246e1)

tags:	added: in-stable-icehouse

Russell Bryant (russellb) on 2014-07-23

Changed in nova:
milestone:	none → juno-2
status:	Fix Committed → Fix Released

Chuck Short (zulcss) on 2014-08-07

tags:	removed: icehouse-backport-potential

Thierry Carrez (ttx) on 2014-10-16

Changed in nova:
milestone:	juno-2 → 2014.2
