cleanup_running_deleted_instances periodic task failed with instance not found

Bug #1280140 reported by wangpan
This bug affects 1 person
Affects                    Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)   Fix Released  Undecided   wangpan
Havana                     Fix Released  Undecided   wangpan

Bug Description

This is because the db query does not include the deleted instance in
_delete_instance_files() in the libvirt driver.
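
Concretely, the lookup that blows up is the one shown in the trace further down; a minimal standalone sketch of it (assuming the Havana/Icehouse-era module paths nova.context and nova.objects.instance, and the uuid of an already deleted instance):

    from nova import context as nova_context
    from nova.objects import instance as instance_obj

    # uuid of an instance that was deleted via the API while nova-compute was down
    instance_uuid = 'c32db267-21a0-41e7-9d50-931d8396d8cb'

    # get_admin_context() defaults to read_deleted="no", so the underlying DB
    # query filters out the soft-deleted row and get_by_uuid() raises
    # InstanceNotFound even though the instance record still exists.
    admin_ctxt = nova_context.get_admin_context()
    inst_obj = instance_obj.Instance.get_by_uuid(admin_ctxt, instance_uuid)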

I can reproduce this bug on both master and stable/havana.
reproduce steps:
1. create an instance
2. stop nova-compute
3. wait until nova-manage service list shows nova-compute as down (XXX)
4. set running_deleted_instance_poll_interval=60 and
running_deleted_instance_action=reap (a config sketch follows the log line below),
then start nova-compute and wait for the cleanup periodic task to run
5. a warning is logged in the compute log:
2014-02-14 16:22:25.917 WARNING nova.compute.manager [-] [instance: c32db267-21a0-41e7-9d50-931d8396d8cb] Periodic cleanup failed to delete instance: Instance c32db267-21a0-41e7-9d50-931d8396d8cb could not be found.
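
For step 4, the relevant nova.conf entries on the compute node look roughly like this (a sketch; assuming both options are registered in the DEFAULT group, as on master and stable/havana):

    [DEFAULT]
    # run the cleanup_running_deleted_instances periodic task every 60 seconds
    running_deleted_instance_poll_interval = 60
    # reap: shut down the leftover instance and clean up its volumes and files
    running_deleted_instance_action = reap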

The debug trace is:
ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(1006)cleanup()
   1005 block_device_mapping = driver.block_device_info_get_mapping(
-> 1006 block_device_info)
   1007 for vol in block_device_mapping:

ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(1007)cleanup()
   1006 block_device_info)
-> 1007 for vol in block_device_mapping:
   1008 connection_info = vol['connection_info']

ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(1041)cleanup()
   1040
-> 1041 if destroy_disks:
   1042 self._delete_instance_files(instance)

ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(1042)cleanup()
   1041 if destroy_disks:
-> 1042 self._delete_instance_files(instance)
   1043

ipdb> s
--Call--
> /opt/stack/nova/nova/virt/libvirt/driver.py(4950)_delete_instance_files()
   4949
-> 4950 def _delete_instance_files(self, instance):
   4951 # NOTE(mikal): a shim to handle this file not using instance objects

ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(4953)_delete_instance_files()
   4952 # everywhere. Remove this when that conversion happens.

-> 4953 context = nova_context.get_admin_context()
   4954 inst_obj = instance_obj.Instance.get_by_uuid(context, instance['uuid'])

ipdb> n
> /opt/stack/nova/nova/virt/libvirt/driver.py(4954)_delete_instance_files()
   4953 context = nova_context.get_admin_context()
-> 4954 inst_obj = instance_obj.Instance.get_by_uuid(context, instance['uuid'])
   4955

ipdb> n
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/virt/libvirt/driver.py(4954)_delete_instance_files()
   4953 context = nova_context.get_admin_context()
-> 4954 inst_obj = instance_obj.Instance.get_by_uuid(context, instance['uuid'])
   4955

ipdb> n
--Return--
None
> /opt/stack/nova/nova/virt/libvirt/driver.py(4954)_delete_instance_files()
   4953 context = nova_context.get_admin_context()
-> 4954 inst_obj = instance_obj.Instance.get_by_uuid(context, instance['uuid'])
   4955

ipdb> n
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/virt/libvirt/driver.py(1042)cleanup()
   1041 if destroy_disks:
-> 1042 self._delete_instance_files(instance)
   1043

ipdb> n
--Return--
None
> /opt/stack/nova/nova/virt/libvirt/driver.py(1042)cleanup()
   1041 if destroy_disks:
-> 1042 self._delete_instance_files(instance)
   1043

ipdb> n
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/virt/libvirt/driver.py(931)destroy()
    930 self.cleanup(context, instance, network_info, block_device_info,
--> 931 destroy_disks)
    932

ipdb> n
--Return--
None
> /opt/stack/nova/nova/virt/libvirt/driver.py(931)destroy()
    930 self.cleanup(context, instance, network_info, block_device_info,
--> 931 destroy_disks)
    932

ipdb> n
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/compute/manager.py(1905)_shutdown_instance()
   1904 self.driver.destroy(context, instance, network_info,
-> 1905 block_device_info)
   1906 except exception.InstancePowerOffFailure:

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(1906)_shutdown_instance()
   1905 block_device_info)
-> 1906 except exception.InstancePowerOffFailure:
   1907 # if the instance can't power off, don't release the ip

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(1910)_shutdown_instance()
   1909 pass
-> 1910 except Exception:
   1911 with excutils.save_and_reraise_exception():

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(1911)_shutdown_instance()
   1910 except Exception:
-> 1911 with excutils.save_and_reraise_exception():
   1912 # deallocate ip and fail without proceeding to

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(1914)_shutdown_instance()
   1913 # volume api calls, preserving current behavior

-> 1914 self._try_deallocate_network(context, instance,
   1915 requested_networks)

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(1915)_shutdown_instance()
   1914 self._try_deallocate_network(context, instance,
-> 1915 requested_networks)
   1916

ipdb> n
2014-02-14 16:22:02.701 DEBUG nova.compute.manager [-] [instance: c32db267-21a0-41e7-9d50-931d8396d8cb] Deallocating network for instance from (pid=19192) _deallocate_network /opt/stack/nova/nova/compute/manager.py:1531
2014-02-14 16:22:02.704 DEBUG oslo.messaging._drivers.amqpdriver [-] MSG_ID is e529a4edb22b480cb0641a62718e9b04 from (pid=19192) _send /opt/stack/oslo.messaging/oslo/messaging/_drivers/amqpdriver.py:358
2014-02-14 16:22:02.705 DEBUG oslo.messaging._drivers.amqp [-] UNIQUE_ID is aed682f94730441aaa14e43a37c86227. from (pid=19192) _add_unique_id /opt/stack/oslo.messaging/oslo/messaging/_drivers/amqp.py:333
2014-02-14 16:22:02.718 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 11.632922 sec
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/compute/manager.py(1915)_shutdown_instance()
   1914 self._try_deallocate_network(context, instance,
-> 1915 requested_networks)
   1916

ipdb> n
--Return--
None
> /opt/stack/nova/nova/compute/manager.py(1915)_shutdown_instance()
   1914 self._try_deallocate_network(context, instance,
-> 1915 requested_networks)
   1916

ipdb> l
   1910 except Exception:
   1911 with excutils.save_and_reraise_exception():
   1912 # deallocate ip and fail without proceeding to

   1913 # volume api calls, preserving current behavior

   1914 self._try_deallocate_network(context, instance,
-> 1915 requested_networks)
   1916
   1917 self._try_deallocate_network(context, instance, requested_networks)
   1918
   1919 for bdm in vol_bdms:
   1920 try:

ipdb> n
InstanceNotFound: Instance...found.',)
> /opt/stack/nova/nova/compute/manager.py(5225)_cleanup_running_deleted_instances()
   5224 self._shutdown_instance(context, instance, bdms,
-> 5225 notify=False)
   5226 self._cleanup_volumes(context, instance['uuid'], bdms)

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(5227)_cleanup_running_deleted_instances()
   5226 self._cleanup_volumes(context, instance['uuid'], bdms)
-> 5227 except Exception as e:
   5228 LOG.warning(_("Periodic cleanup failed to delete "

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(5228)_cleanup_running_deleted_instances()
   5227 except Exception as e:
-> 5228 LOG.warning(_("Periodic cleanup failed to delete "
   5229 "instance: %s"),

ipdb> n
> /opt/stack/nova/nova/compute/manager.py(5230)_cleanup_running_deleted_instances()
   5229 "instance: %s"),
-> 5230 unicode(e), instance=instance)
   5231 else:

ipdb> n
2014-02-14 16:22:25.917 WARNING nova.compute.manager [-] [instance: c32db267-21a0-41e7-9d50-931d8396d8cb] Periodic cleanup failed to delete instance: Instance c32db267-21a0-41e7-9d50-931d8396d8cb could not be found.

wangpan (hzwangpan)
Changed in nova:
assignee: nobody → wangpan (hzwangpan)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/73540

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/73540
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d835163759527a651f3c4f2109ca0fdc3e968d37
Submitter: Jenkins
Branch: master

commit d835163759527a651f3c4f2109ca0fdc3e968d37
Author: Wangpan <email address hidden>
Date: Mon Feb 17 11:27:22 2014 +0800

    Fix InstanceNotFound error in _delete_instance_files

    Currently the cleanup_running_deleted_instances periodic task
    fails with an InstanceNotFound exception, because the db query
    does not include the deleted instance in _delete_instance_files()
    in the libvirt driver.

    Closes-bug: #1280140

    Change-Id: Ie65ff255ad3c582a71db93b07304a21d4976a193
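
The change itself is small; a sketch of the idea (assuming the patch simply asks for an admin context that includes soft-deleted rows via the read_deleted keyword of nova.context.get_admin_context(); see the review above for the exact diff):

    # nova/virt/libvirt/driver.py, inside _delete_instance_files(self, instance):
    # the original get_admin_context() call defaults to read_deleted="no";
    # asking for soft-deleted rows lets the lookup succeed for an already
    # deleted instance, so the leftover files on disk can still be removed.
    context = nova_context.get_admin_context(read_deleted='yes')
    inst_obj = instance_obj.Instance.get_by_uuid(context, instance['uuid'])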

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → icehouse-3
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/80180

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/havana)

Reviewed: https://review.openstack.org/80180
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c87a88f0a531effeff571c2aa014e629e8299564
Submitter: Jenkins
Branch: stable/havana

commit c87a88f0a531effeff571c2aa014e629e8299564
Author: Wangpan <email address hidden>
Date: Mon Feb 17 11:27:22 2014 +0800

    Fix InstanceNotFound error in _delete_instance_files

    Currently the cleanup_running_deleted_instances periodic task
    fails with an InstanceNotFound exception, because the db query
    does not include the deleted instance in _delete_instance_files()
    in the libvirt driver.

    Closes-bug: #1280140

    Change-Id: Ie65ff255ad3c582a71db93b07304a21d4976a193
    (cherry picked from d835163759527a651f3c4f2109ca0fdc3e968d37)

tags: added: in-stable-havana
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-3 → 2014.1