nova force-delete does not delete "BUILD" state instances

Bug #1351001 reported by Stephen Gordon
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Christopher Yeoh
Milestone: 2014.2

Bug Description

Description of problem:

Using nova force-delete $instance-id fails when an instance is in status "BUILD" with OS-EXT-STS:task_state "deleting". However, "nova delete" does eventually seem to work after several attempts.
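For reference, the same failure can be reproduced against the API with python-novaclient. The sketch below is illustrative only; the endpoint and credentials are placeholders, not values from this report:

    from novaclient import client
    from novaclient import exceptions

    # Placeholder Havana-era credentials and Keystone v2.0 endpoint.
    nova = client.Client('2', 'admin', 'password', 'admin',
                         'http://keystone.example.com:5000/v2.0')

    server = nova.servers.get('3a83b712-4667-44c1-a83d-ada164ff78d1')
    print(server.status)                                    # BUILD
    print(getattr(server, 'OS-EXT-STS:task_state', None))   # deleting

    try:
        nova.servers.force_delete(server)
    except exceptions.Conflict as exc:
        # The same HTTP 409 shown under "Actual results" below.
        print('force-delete refused: %s' % exc)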

Version-Release number of selected component (if applicable):

2013.2 (Havana)

How reproducible:

Steps to Reproduce:
1. find a seemingly hung instance (status "BUILD", task_state "deleting")
2. run nova force-delete against it
3. observe the HTTP 409 error below

Actual results:

[root@host02 ~(keystone_admin)]$ nova force-delete 3a83b712-4667-44c1-a83d-ada164ff78d1
ERROR: Cannot 'forceDelete' while instance is in vm_state building (HTTP 409) (Request-ID: req-22737c83-32f4-4c6d-ae9c-09a542556907)

Expected results:

The instance should be deleted.

Additional info:

Here are some logs obtained while reproducing this behavior; this is on RHOS 4 / RHEL 6.5:

--snip--

[root@host02 ~(keystone_admin)]$ nova force-delete 3a83b712-4667-44c1-a83d-ada164ff78d1
ERROR: Cannot 'forceDelete' while instance is in vm_state building (HTTP 409) (Request-ID: req-22737c83-32f4-4c6d-ae9c-09a542556907)
[root@host02 ~(keystone_admin)]$ nova list --all-tenants | grep 3a83b712-4667-44c1-a83d-ada164ff78d1
| 3a83b712-4667-44c1-a83d-ada164ff78d1 | bcrochet-foreman | BUILD | deleting | NOSTATE | default=192.168.87.7; foreman_int=192.168.200.6; foreman_ext=192.168.201.6

[root@host02 ~(keystone_admin)]$ nova show 3a83b712-4667-44c1-a83d-ada164ff78d1
+--------------------------------------+----------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------------------------+
| status | BUILD |
| updated | 2014-04-16T20:56:44Z |
| OS-EXT-STS:task_state | deleting |
| OS-EXT-SRV-ATTR:host | host08.oslab.priv |
| foreman_ext network | 192.168.201.6 |
| key_name | foreman-ci |
| image | rhel-guest-image-6-6.5-20140116.1-1 (253354e7-8d65-4d95-b134-6b423d125579) |
| hostId | 4b98ba395063916c15f5b96a791683fa5d116109987c6a6b0b8de2f1 |
| OS-EXT-STS:vm_state | building |
| OS-EXT-SRV-ATTR:instance_name | instance-000099e9 |
| foreman_int network | 192.168.200.6 |
| OS-SRV-USG:launched_at | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | host08.oslab.priv |
| flavor | m1.large (4) |
| id | 3a83b712-4667-44c1-a83d-ada164ff78d1 |
| security_groups | [{u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}] |
| OS-SRV-USG:terminated_at | None |
| user_id | 13090770bacc46ccb8fb7f5e13e5de98 |
| name | bcrochet-foreman |
| created | 2014-04-16T20:27:51Z |
| tenant_id | f8e6ba11caa94ea98d24ec819eb746fd |
| OS-DCF:diskConfig | MANUAL |
| metadata | {} |
| os-extended-volumes:volumes_attached | [] |
| accessIPv4 | |
| accessIPv6 | |
| progress | 0 |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-AZ:availability_zone | nova |
| default network | 192.168.87.7 |
| config_drive | |
+--------------------------------------+----------------------------------------------------------------------------+

[root@host08 ~]# virsh list --all | grep instance-000099e9
[root@host08 ~]#

[root@host02 ~(keystone_admin)]$ nova delete 3a83b712-4667-44c1-a83d-ada164ff78d1
[root@host02 ~(keystone_admin)]$ nova list --all-tenants | grep 3a83b712-4667-44c1-a83d-ada164ff78d1
| 3a83b712-4667-44c1-a83d-ada164ff78d1 | bcrochet-foreman | BUILD | deleting | NOSTATE | default=192.168.87.7; foreman_int=192.168.200.6; foreman_ext=192.168.201.6

** insert several more "nova delete" calls **

( back on compute host08 after the delete API call.. )

2014-04-22 12:46:21.253 8427 INFO nova.virt.libvirt.driver [-] [instance: e9f49326-7703-4035-b3fe-57b4e11bb07b] Instance destroyed successfully.
2014-04-22 12:46:21.668 8427 INFO nova.compute.manager [-] [instance: e9f49326-7703-4035-b3fe-57b4e11bb07b] During sync_power_state the instance has a pending task. Skip.
2014-04-22 12:46:21.701 8427 INFO nova.virt.libvirt.driver [req-dddc2bcf-539a-48d9-a870-c5cfcccb6916 54b9f83693c84bf2b72286e9609eee36 210a99a1e68f43218f4cab705c908d45] [instance: e9f49326-7703-4035-b3fe-57b4e11bb07b] Deleting instance files /var/lib/nova/instances/e9f49326-7703-4035-b3fe-57b4e11bb07b
2014-04-22 12:46:21.702 8427 INFO nova.virt.libvirt.driver [req-dddc2bcf-539a-48d9-a870-c5cfcccb6916 54b9f83693c84bf2b72286e9609eee36 210a99a1e68f43218f4cab705c908d45] [instance: e9f49326-7703-4035-b3fe-57b4e11bb07b] Deletion of /var/lib/nova/instances/e9f49326-7703-4035-b3fe-57b4e11bb07b complete

-- snip --

Also of note, this particular instance did not show up in virsh, but I could see it in the qemu-kvm process table:

( tracking instance down to host08 )
[root@host02 ~(keystone_admin)]$ cat /var/log/nova/scheduler.log | grep -i 3a83b712-4667-44c1-a83d-ada164ff78d1
2014-04-16 20:27:52.243 28987 INFO nova.scheduler.filter_scheduler [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] Attempting to build 1 instance(s) uuids: [u'3a83b712-4667-44c1-a83d-ada164ff78d1']
2014-04-16 20:27:52.462 28987 INFO nova.scheduler.filter_scheduler [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] Choosing host WeighedHost [host: host08.oslab.priv, weight: 15249.0] for instance 3a83b712-4667-44c1-a83d-ada164ff78d1

( see successful build on host08 nova compute)

[root@host08 ~]# cat /var/log/nova/compute.log | grep 3a83b712-4667-44c1-a83d-ada164ff78d1
2014-04-16 20:27:52.708 8427 AUDIT nova.compute.manager [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Starting instance...
2014-04-16 20:27:53.836 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Attempting claim: memory 8192 MB, disk 80 GB, VCPUs 4
2014-04-16 20:27:53.837 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Total Memory: 64401 MB, used: 49152.00 MB
2014-04-16 20:27:53.837 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Memory limit: 96601.50 MB, free: 47449.50 MB
2014-04-16 20:27:53.838 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Total Disk: 842 GB, used: 471.00 GB
2014-04-16 20:27:53.838 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Disk limit not specified, defaulting to unlimited
2014-04-16 20:27:53.839 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Total CPU: 16 VCPUs, used: 25.00 VCPUs
2014-04-16 20:27:53.840 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] CPU limit not specified, defaulting to unlimited
2014-04-16 20:27:53.840 8427 AUDIT nova.compute.claims [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Claim successful
2014-04-16 20:27:54.812 8427 INFO nova.virt.libvirt.driver [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Creating image
2014-04-16 20:27:58.683 8427 INFO nova.virt.libvirt.driver [req-5e3af99e-658e-45cc-86e0-e64f462ff5ab 13090770bacc46ccb8fb7f5e13e5de98 f8e6ba11caa94ea98d24ec819eb746fd] [instance: 3a83b712-4667-44c1-a83d-ada164ff78d1] Injecting key into image 253354e7-8d65-4d95-b134-6b423d125579

( shows up in process table )

[root@host08 ~(keystone_admin)]$ ps -ef | grep instance | grep 3a83b712-4667-44c1-a83d
nova 22510 8427 0 Apr16 ? 00:04:30 /usr/libexec/qemu-kvm -global virtio-blk-pci.scsi=off -nodefconfig -nodefaults -nographic -machine accel=kvm:tcg -cpu host,+kvmclock -m 500 -no-reboot -kernel /var/tmp/.guestfs-162/kernel.8427 -initrd /var/tmp/.guestfs-162/initrd.8427 -device virtio-scsi-pci,id=scsi -drive file=/var/lib/nova/instances/3a83b712-4667-44c1-a83d-ada164ff78d1/disk,cache=none,format=qcow2,id=hd0,if=none -device scsi-hd,drive=hd0 -drive file=/var/tmp/.guestfs-162/root.8427,snapshot=on,id=appliance,if=none,cache=unsafe -device scsi-hd,drive=appliance -device virtio-serial -serial stdio -device sga -chardev socket,path=/tmp/libguestfss0eEyk/guestfsd.sock,id=channel0 -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 -append panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 TERM=linux

( but virsh does not know about it )

[root@host08 ~]# virsh list --all | grep instance-000099e9
[root@host08 ~]#
[root@host08 ~]# for x in `virsh list --all | grep -v Name | awk '{print $2}'`; do virsh dumpxml $x | grep 3a83b712-4667-44c1-a83d-ada164ff78d1; done
[root@host08 ~]#

Revision history for this message
Christopher Yeoh (cyeoh-0) wrote :

So perhaps this is a documentation/nomenclature issue:

nova delete is for deleting a server
nova force-delete is for deleting a server which has previously been deleted but not yet reclaimed
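A python-novaclient sketch of that distinction, assuming deferred delete is enabled (reclaim_instance_interval > 0 in nova.conf); the endpoint, credentials and server name are placeholders:

    from novaclient import client

    # Placeholder credentials and Keystone v2.0 endpoint.
    nova = client.Client('2', 'admin', 'password', 'admin',
                         'http://keystone.example.com:5000/v2.0')
    server = nova.servers.find(name='some-server')

    # "nova delete": with reclaim_instance_interval set, the server is only
    # soft deleted and sits in SOFT_DELETED until the periodic reclaim runs.
    nova.servers.delete(server)

    # While soft deleted it can still be brought back:
    #     nova.servers.restore(server)
    # or reclaimed immediately, which is the case force-delete was written for:
    nova.servers.force_delete(server)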

Revision history for this message
Stephen Gordon (sgordon) wrote :

Looking at the CLI Reference Guide [1][2], the documentation currently reads as follows (note that it is automatically generated from the strings in the code, so it needs to be fixed in the source):

"""
nova delete command

usage: nova delete <server> [<server> ...]

Immediately shut down and delete specified server(s).

Positional arguments

<server>

    Name or ID of server(s).
"""

"""
nova force-delete command

usage: nova force-delete <server>

Force delete a server.

Positional arguments

<server>

    Name or ID of server.
"""

I don't think this highlights the nuance you note very clearly.

[1] http://docs.openstack.org/cli-reference/content/novaclient_commands.html#novaclient_subcommand_delete
[2] http://docs.openstack.org/cli-reference/content/novaclient_commands.html#novaclient_subcommand_force-delete

Revision history for this message
Christopher Yeoh (cyeoh-0) wrote :

Yes, the public documentation is definitely inconsistent with what force-delete actually does. I think the original intent of force delete, from the API point of view, is visible in the code, e.g.:

in the extension itself (which is actually called deferred delete):

    def _force_delete(self, req, id, body):
        """Force delete of instance before deferred cleanup."""

and in the compute code:

    def force_delete(self, context, instance):
        """Force delete a previously deleted (but not reclaimed) instance."""
        self._delete_instance(context, instance)

So I think the intent when the Nova code was written (as opposed to python-novaclient and the associated documentation) was that force delete would be used where someone wanted to "hurry up" Nova's deferred delete process, rather than to delete a running instance.

However, having thought about it a bit, I think it does make sense that force delete could delete a running instance and not just one in the deferred-delete state. It is sort of an API change, so we'll see what others think - it might need to be deferred to the next version of the API.
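As a simplified, self-contained sketch of the kind of guard that produces the 409 above (modelled loosely on the check_instance_state decorator in nova.compute.api; the names and state strings here are illustrative rather than the exact Nova code):

    import functools


    class InstanceInvalidState(Exception):
        """Surfaces at the API layer as the HTTP 409 Conflict seen above."""


    def check_instance_state(vm_state=None):
        """Reject the call unless the instance is in an allowed vm_state."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(self, context, instance, *args, **kwargs):
                if vm_state is not None and instance['vm_state'] not in vm_state:
                    raise InstanceInvalidState(
                        "Cannot '%s' while instance is in vm_state %s"
                        % (fn.__name__, instance['vm_state']))
                return fn(self, context, instance, *args, **kwargs)
            return wrapper
        return decorator


    class ComputeAPI(object):
        # Pre-fix behaviour: only soft-deleted instances may be force deleted,
        # so an instance in vm_state "building" triggers the 409 in this report.
        # Relaxing (or dropping) this guard is the change being discussed.
        @check_instance_state(vm_state=['soft-delete'])
        def force_delete(self, context, instance):
            self._delete_instance(context, instance)

        def _delete_instance(self, context, instance):
            pass  # stand-in for the real deletion path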

Changed in nova:
status: New → Confirmed
assignee: nobody → Christopher Yeoh (cyeoh-0)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/111157

Changed in nova:
status: Confirmed → In Progress
Jay Pipes (jaypipes)
Changed in nova:
importance: Undecided → Medium
Changed in nova:
assignee: Christopher Yeoh (cyeoh-0) → Alex Xu (xuhj)
Changed in nova:
milestone: none → juno-rc1
assignee: Alex Xu (xuhj) → Christopher Yeoh (cyeoh-0)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/111157
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eb19c346b5fd8c2cb10f50a32f689e88a6a79d92
Submitter: Jenkins
Branch: master

commit eb19c346b5fd8c2cb10f50a32f689e88a6a79d92
Author: Chris Yeoh <email address hidden>
Date: Fri Aug 1 14:14:55 2014 +0930

    Allow forceDelete to delete running instances

    Currently attempting to force delete a running instance
    fails while a normal delete can succeed. This changes the
    behaviour of force_delete so any instance can be deleted
    and not just ones which have already been soft deleted.

    Also catch InstanceIsLocked exceptions at the API layer
    and now also don't need to catch InstanceInvalidState
    exceptions since they are no longer raised

    DocImpact: forceDelete will now delete any instance and not
      just ones which have already been soft deleted

    Change-Id: I06a476701ccb47d8f7fd6e43c394ccb3fac2b3aa
    Closes-Bug: 1351001
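An illustrative post-fix check with python-novaclient (placeholder endpoint and credentials), per the DocImpact note above:

    from novaclient import client

    nova = client.Client('2', 'admin', 'password', 'admin',
                         'http://keystone.example.com:5000/v2.0')
    server = nova.servers.get('3a83b712-4667-44c1-a83d-ada164ff78d1')

    # forceDelete no longer requires the instance to have been soft deleted
    # first, so this goes through even while vm_state is "building".
    nova.servers.force_delete(server)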

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2
Revision history for this message
Rupesh Chowdhary (rupeshrams) wrote :

That could be an issue with RabbitMQ connectivity. Check the nova-api logs and search for the error keyword for a better understanding. If the VM stays in the BUILD state, compute connectivity may keep dropping because of RabbitMQ, e.g.:

ERROR oslo_messaging._drivers.impl_rabbit [-] AMQP
