It is impossible to delete an instance that has failed due to neutron/nova notification problems
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Critical | Matt Riedemann |
Icehouse | Fix Released | Critical | Ihar Hrachyshka |
Juno | Fix Released | Critical | Matt Riedemann |
Bug Description
If you attempt to boot a nova instance without Neutron properly configured for neutron/nova notifications, the instance will eventually fail to spawn:
[-] [instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
[instance: 1541a197-
If you try to delete this instance, the delete operation will fail. In the logs, you see:
AUDIT nova.compute.
WARNING nova.virt.
INFO nova.virt.
INFO nova.compute.
At this point, `nova list` will show:
| 1541a197-
And it appears to be impossible to delete this instance. Running `nova reset-state <instance>` has no effect (with or without `--active`), nor does correctly configuring neutron.
The only way to get rid of this instance appears to be directly editing the database.
Changed in nova: | |
importance: | Undecided → Medium |
status: | New → Triaged |
milestone: | none → kilo-3 |
Changed in nova: | |
importance: | Medium → High |
status: | Triaged → In Progress |
importance: | High → Critical |
tags: | removed: juno-backport-potential |
tags: | removed: icehouse-backport-potential |
Changed in nova: | |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | kilo-3 → 2015.1.0 |
This was actually 'fixed' with bug 1308342 in that you can do a force-delete call on the instance and that will clean it up. However, the fix for bug 1308342 introduced a regression in the cells RPC API, so we have bug 1430822 to fix that.
Once we revert https://review.openstack.org/#/c/121800/, we'll make the fix for this in the compute API: allow force-delete of an instance stuck in the 'deleting' task_state, which resolves this bug.
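To make the intended behaviour concrete, here is a toy model of the check, not nova's actual compute API code. The `Instance` class, the `delete` function and its `force` flag are all hypothetical names invented for illustration. The only assumption carried over from this bug is that a plain delete does nothing useful once task_state is already 'deleting', while a force-delete should be allowed through.

```python
DELETING = "deleting"


class Instance:
    """Hypothetical stand-in for an instance record (not nova's object model)."""

    def __init__(self, uuid, vm_state="error", task_state=None):
        self.uuid = uuid
        self.vm_state = vm_state
        self.task_state = task_state


def delete(instance, force=False):
    """Return True if the delete was carried out, False if it was skipped."""
    if instance.task_state == DELETING and not force:
        # A delete is already recorded as in progress, so a plain delete
        # is ignored. This is the stuck state described in this bug.
        return False
    instance.task_state = DELETING
    # Real code would tear down network, volumes and the guest here.
    instance.vm_state = "deleted"
    instance.task_state = None
    return True


stuck = Instance("hypothetical-uuid", task_state=DELETING)
print(delete(stuck))              # plain delete is skipped
print(delete(stuck, force=True))  # force-delete cleans the instance up
```

In this sketch the force path simply bypasses the task_state check; where exactly the real check belongs is left to the actual patch.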
Note that we should also have a @revert_task_state decorator in whatever compute manager call failed on spawn so we don't get stuck in this state to begin with.
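A rough sketch of what such a decorator could look like is below. It is illustrative only: the decorator name is taken from the comment above, while the dict-based instance, the `spawn` function and its `network_ready` parameter are hypothetical and do not reflect the compute manager's real signatures.

```python
import functools


def revert_task_state(func):
    """Clear the instance's task_state if the wrapped call raises."""

    @functools.wraps(func)
    def wrapper(instance, *args, **kwargs):
        try:
            return func(instance, *args, **kwargs)
        except Exception:
            # Reset task_state so later API calls (such as delete) are not
            # blocked by a stale in-progress marker, then re-raise.
            instance["task_state"] = None
            raise

    return wrapper


@revert_task_state
def spawn(instance, network_ready):
    """Hypothetical spawn step that fails when networking never comes up."""
    instance["task_state"] = "spawning"
    if not network_ready:
        raise RuntimeError("timed out waiting for network notification")
    instance["task_state"] = None
    instance["vm_state"] = "active"


instance = {"vm_state": "building", "task_state": None}
try:
    spawn(instance, network_ready=False)
except RuntimeError:
    pass
print(instance["task_state"])  # task_state was reverted by the decorator
```

The decorator re-raises the original exception, so the failure is still reported; it only guarantees the instance is not left with a dangling task_state.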