nova-cell: cannot delete VM once VM deletion fails in nova-compute

Bug #1378683 reported by Rajesh Tailor
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

The VM cannot be deleted once its deletion has failed in nova-compute.

Steps to reproduce:

1. Create a VM.
2. Wait until the VM becomes available.
3. Stop the nova-cell child process.
4. Delete the VM:
   nova delete <vm_id or vm_name>
5. Stop the neutron service.
6. Start the nova-cell child process.
7. Start the neutron service.
8. Delete the VM again.

The VM is not deleted and still shows up in the nova list output; a python-novaclient sketch of these steps follows the listing below.

$ nova list
+--------------------------------------+------+--------+------------+-------------+------------------+
| ID                                   | Name | Status | Task State | Power State | Networks         |
+--------------------------------------+------+--------+------------+-------------+------------------+
| 9d7c9fb2-010f-4de6-975a-1a2de825155b | vm09 | ERROR  | -          | Running     | private=10.0.0.2 |
+--------------------------------------+------+--------+------------+-------------+------------------+
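
For anyone driving the reproduction programmatically, here is a rough sketch using python-novaclient. The credentials, auth URL, and server name "vm09" are placeholders, the Client() arguments vary by novaclient version, and stopping/starting the nova-cell child process and neutron (steps 3-7) still has to be done on the host as described above.

# Hedged sketch, not part of the original report: issue the delete and check
# the server state with python-novaclient. Credentials/auth_url are placeholders.
from novaclient import client

nova = client.Client('2', 'admin', 'secret', 'admin',
                     'http://controller:5000/v2.0')

server = nova.servers.find(name='vm09')
server.delete()                      # steps 4 and 8: request deletion

server = nova.servers.get(server.id)
print(server.status)                 # with the bug: ERROR, and the VM is never removed
print(getattr(server, 'OS-EXT-STS:task_state', None))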

The following log message is logged in the n-cell-child screen:

2014-10-07 04:36:57.159 INFO nova.compute.api [req-11c20157-23ac-4892-9fdf-3e60201a9bb4 admin admin]
[instance: 77aabf6c-7b33-4c49-8061-eb9805214085] Instance is already in deleting state, ignoring this request

Note: VM never gets deleted.
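
The log line above is the key symptom: once the instance's task_state is already 'deleting', the API acknowledges the new delete request but does not resend it, so a delete whose message was lost while the cell child process was down is never retried. A self-contained illustration of that behaviour (plain Python, not nova source):

# Illustration only (not nova code) of the "ignore duplicate delete" behaviour
# behind the log message above.
DELETING = 'deleting'

class FakeInstance(object):
    task_state = None

def delete(instance, send_to_compute):
    if instance.task_state == DELETING:
        print('Instance is already in deleting state, ignoring this request')
        return
    instance.task_state = DELETING
    send_to_compute(instance)        # this message is lost if the cell is down

inst = FakeInstance()
delete(inst, send_to_compute=lambda i: None)   # first delete: cell down, message lost
delete(inst, send_to_compute=lambda i: None)   # second delete: ignored, VM stays forever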

Tags: cells
Changed in nova:
assignee: nobody → Rajesh Tailor (rajesh-tailor)
Revision history for this message
Christopher Yeoh (cyeoh-0) wrote :

Won't this get addressed by your existing patch for force delete? https://review.openstack.org/#/c/121800/
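
For context, the force delete being discussed is exposed as "nova force-delete <vm_id or vm_name>" on the CLI; with python-novaclient it can be issued as sketched below, reusing the illustrative client object from the reproduction sketch above.

# Hedged example: force-delete the stuck server via python-novaclient.
# 'nova' is the client object from the earlier sketch; the id is the one from nova list.
server = nova.servers.get('9d7c9fb2-010f-4de6-975a-1a2de825155b')
nova.servers.force_delete(server)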

Changed in nova:
status: New → Incomplete
Revision history for this message
Rajesh Tailor (rajesh-tailor) wrote :

Hi Christopher,

I tested the issue as per your comment, with the following patch applied:
https://review.openstack.org/#/c/121800/

I got the following error from the nova child cell service (n-cell-child) when I tried to delete the instance using the force-delete API.

2014-10-09 23:16:13.803 ERROR nova.cells.messaging [req-ca9d0984-612c-4ff2-856c-71bfaba70266 admin admin] Error processing message locally: Object '<Instance at 0x7fb3f8b7ced0>' is already attached to session '79' (this is '83')
2014-10-09 23:16:13.803 TRACE nova.cells.messaging Traceback (most recent call last):
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/cells/messaging.py", line 199, in _process_locally
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     resp_value = self.msg_runner._process_message_locally(self)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/cells/messaging.py", line 1293, in _process_message_locally
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     return fn(message, **message.method_kwargs)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/cells/messaging.py", line 698, in run_compute_api_method
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     return fn(message.ctxt, *args, **method_info['method_kwargs'])
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 221, in wrapped
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     return func(self, context, target, *args, **kwargs)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 211, in inner
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     return function(self, context, instance, *args, **kwargs)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 192, in inner
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     return f(self, context, instance, *args, **kw)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 1834, in force_delete
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     self._delete_instance(context, instance, delete_types.FORCE_DELETE)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 1787, in _delete_instance
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     task_state=task_states.DELETING)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 1618, in _delete
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     quotas.rollback()
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     six.reraise(self.type_, self.value, self.tb)
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/compute/api.py", line 1544, in _delete
2014-10-09 23:16:13.803 TRACE nova.cells.messaging     instance.save()
2014-10-09 23:16:13.803 TRACE nova.cells.messaging   File "/opt/stack/nova/nova/db/sqlalchemy/models.py", line 62, in s...
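
The (truncated) traceback above ends in a SQLAlchemy InvalidRequestError: the Instance object is already owned by one DB session when instance.save() tries to attach it to another. A minimal, standalone SQLAlchemy sketch of that error class, outside nova, with purely illustrative model and session names:

# Standalone sketch (not nova code) of SQLAlchemy's "already attached to
# session" error: adding an object owned by one Session to a second Session
# raises InvalidRequestError with the message seen in the trace above.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Instance(Base):
    __tablename__ = 'instances'
    id = Column(Integer, primary_key=True)
    task_state = Column(String(32))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session_a = Session()
inst = Instance(task_state='deleting')
session_a.add(inst)                  # inst now belongs to session_a

session_b = Session()
try:
    session_b.add(inst)              # second session tries to take ownership
except InvalidRequestError as exc:
    print(exc)                       # Object '<Instance ...>' is already attached to session ...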


Joe Gordon (jogo)
tags: added: cells
Changed in nova:
status: Incomplete → In Progress
Changed in nova:
importance: Undecided → Low
melanie witt (melwitt)
tags: removed: ntt
Revision history for this message
Kris Lindgren (klindgren) wrote :

We are running into this as well. In addition, we are hitting another case where a VM gets stuck in the deleting state: the VM fails to delete if a call to neutron fails (such as a call to remove a port or to disassociate a floating IP). A workaround for us is to restart nova-compute on the host that has the VM stuck in the deleting state. When nova-compute comes back online it logs: "Service started deleting the instance during the previous run, but did not finish. Restarting the deletion now."

It seems like an additional change would be to retry the delete on every periodic task interval instead of ignoring it.
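
That suggestion amounts to a periodic sweep on the compute host; a rough, generic sketch of the idea (all names are hypothetical, and this deliberately skips nova's actual periodic task plumbing):

# Illustrative sketch of the retry idea: periodically re-issue the delete for
# instances stuck with task_state 'deleting' instead of ignoring them.
# list_local_instances / delete_instance are hypothetical callables.
import time

DELETING = 'deleting'

def retry_stuck_deletes(list_local_instances, delete_instance, interval=60):
    while True:
        for instance in list_local_instances():
            if instance.task_state == DELETING:
                try:
                    delete_instance(instance)
                except Exception:
                    pass             # leave it for the next periodic run
        time.sleep(interval)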

Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote :

Cleanup
=======

There have been no open reviews for this bug report for more than 2 weeks.
To signal that to other contributors who might provide patches for
this bug, I am switching the status from "In Progress" to "Confirmed" and
removing the assignee.
Feel free to add yourself as assignee and to push a review for it.

Changed in nova:
status: In Progress → Confirmed
assignee: Rajesh Tailor (rajesh-tailor) → nobody
Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (LIBERTY, MITAKA, OCATA, NEWTON).
  Valid example: CONFIRMED FOR: LIBERTY

Changed in nova:
importance: Low → Undecided
status: Confirmed → Expired