Deleting an instance after a failed resize does not clear its resources on the source node.

Bug #1586309 reported by Charlotte Han
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Charlotte Han

Bug Description

Environment
===========
stable/mitaka

Steps to reproduce
==================
* I booted an instance on compute node SBCJSlot5Rack2Centos7; the instance UUID was 00bc72d0-0778-4e69-bfee-b58b87dd1532.
* Then I resized the instance. The resize failed in the finish_resize function on the destination compute node, SBCJSlot3Rack2Centos7.

[stack@SBCJSlot5Rack2Centos7 ~]$ openstack server show 00bc72d0-0778-4e69-bfee-b58b87dd1532
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | SBCJSlot3Rack2Centos7 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | SBCJSlot3Rack2Centos7 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000014 |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2016-05-27T02:28:07.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | public=2001:db8::6, 10.43.239.76 |
| config_drive | True |
| created | 2016-05-27T02:27:56Z |
| fault | {u'message': u'Unexpected vif_type=binding_failed', u'code': 500, u'details': u' File "/opt/stack/nova/nova/compute/manager.py", line |
| | 375, in decorated_function\n return function(self, context, *args, **kwargs)\n File "/opt/stack/nova/nova/compute/manager.py", |
| | line 4054, in finish_resize\n self._set_instance_obj_error_state(context, instance)\n File "/usr/lib/python2.7/site- |
| | packages/oslo_utils/excutils.py", line 220, in __exit__\n self.force_reraise()\n File "/usr/lib/python2.7/site- |
| | packages/oslo_utils/excutils.py", line 196, in force_reraise\n six.reraise(self.type_, self.value, self.tb)\n File |
| | "/opt/stack/nova/nova/compute/manager.py", line 4042, in finish_resize\n disk_info, image_meta)\n File |
| | "/opt/stack/nova/nova/compute/manager.py", line 4007, in _finish_resize\n old_instance_type)\n File "/usr/lib/python2.7/site- |
| | packages/oslo_utils/excutils.py", line 220, in __exit__\n self.force_reraise()\n File "/usr/lib/python2.7/site- |
| | packages/oslo_utils/excutils.py", line 196, in force_reraise\n six.reraise(self.type_, self.value, self.tb)\n File |
| | "/opt/stack/nova/nova/compute/manager.py", line 4002, in _finish_resize\n block_device_info, power_on)\n File |
| | "/opt/stack/nova/nova/virt/libvirt/driver.py", line 7405, in finish_migration\n write_to_disk=True)\n File |
| | "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4718, in _get_guest_xml\n context)\n File |
| | "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4584, in _get_guest_config\n flavor, virt_type, self._host)\n File |
| | "/opt/stack/nova/nova/virt/libvirt/vif.py", line 447, in get_config\n _("Unexpected vif_type=%s") % vif_type)\n', u'created': |
| | u'2016-05-27T03:17:57Z'} |
| flavor | m1.tiny (1) |
| hostId | 7642c4dc55a8bc40c8a0fe2197a1b5333e5d0f028119be752c4098a0 |
| id | 00bc72d0-0778-4e69-bfee-b58b87dd1532 |
| image | cirros-0.3.4-x86_64-uec (f19c4e53-d6f6-43bb-b983-b77b0c68f509) |
| key_name | None |
| name | hanrong_az |
| os-extended-volumes:volumes_attached | [] |
| project_id | 4d6e4e79ea1f4ec392475308e11a895d |
| properties | |
| security_groups | [{u'name': u'default'}] |
| status | ERROR |
| updated | 2016-05-27T03:17:57Z |
| user_id | 2903b30f6d9f4c4eb99421e3b4e51796 |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+

MariaDB [nova]> select source_compute,source_compute,status,instance_uuid,instance_uuid from migrations where instance_uuid='00bc72d0-0778-4e69-bfee-b58b87dd1532';
+-----------------------+-----------------------+--------+--------------------------------------+--------------------------------------+
| source_compute | source_compute | status | instance_uuid | instance_uuid |
+-----------------------+-----------------------+--------+--------------------------------------+--------------------------------------+
| SBCJSlot5Rack2Centos7 | SBCJSlot5Rack2Centos7 | error | 00bc72d0-0778-4e69-bfee-b58b87dd1532 | 00bc72d0-0778-4e69-bfee-b58b87dd1532 |
+-----------------------+-----------------------+--------+--------------------------------------+--------------------------------------+
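
The query above selects source_compute twice, so the destination host never appears in the output. A minimal sketch of a variant that returns both hosts follows. It runs against an in-memory SQLite table that is only a hypothetical, heavily reduced stand-in for nova's migrations table; the row values are taken from this report, and the helper name migration_hosts is made up for illustration.

```python
import sqlite3

# Hypothetical, reduced stand-in for the nova "migrations" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE migrations ("
    "source_compute TEXT, dest_compute TEXT, status TEXT, instance_uuid TEXT)"
)
# Row values copied from the bug report above.
conn.execute(
    "INSERT INTO migrations VALUES (?, ?, ?, ?)",
    (
        "SBCJSlot5Rack2Centos7",
        "SBCJSlot3Rack2Centos7",
        "error",
        "00bc72d0-0778-4e69-bfee-b58b87dd1532",
    ),
)


def migration_hosts(conn, instance_uuid):
    """Return (source_compute, dest_compute, status) rows for one instance."""
    cur = conn.execute(
        "SELECT source_compute, dest_compute, status "
        "FROM migrations WHERE instance_uuid = ?",
        (instance_uuid,),
    )
    return cur.fetchall()


rows = migration_hosts(conn, "00bc72d0-0778-4e69-bfee-b58b87dd1532")
print(rows)
```

Seeing both hosts side by side makes it easier to tell which node is instance.host and which one still needs cleanup.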

* Then I deleted the instance successfully, but the instance's data was not cleared on the source compute node.
[stack@SBCJSlot5Rack2Centos7 instances]$ ll
drwxrwxr-x 2 stack libvirtd 4096 May 27 10:27 00bc72d0-0778-4e69-bfee-b58b87dd1532_resize

[stack@SBCJSlot5Rack2Centos7 instances]$ sudo virsh list --all
 Id Name State
----------------------------------------------------
 - instance-00000014 shut off
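
As a rough illustration of how leftovers like the directory above could be spotted, here is a small sketch that scans an instances directory for "<uuid>_resize" entries belonging to instances known to be deleted. The function name, the way deleted UUIDs are supplied, and the temporary directory used for the demo are all assumptions for illustration; this is not nova code.

```python
import os
import tempfile

RESIZE_SUFFIX = "_resize"


def find_leftover_resize_dirs(instances_path, deleted_uuids):
    """Return names of '<uuid>_resize' directories whose instance is deleted."""
    leftovers = []
    for name in sorted(os.listdir(instances_path)):
        full = os.path.join(instances_path, name)
        if not (os.path.isdir(full) and name.endswith(RESIZE_SUFFIX)):
            continue
        uuid = name[: -len(RESIZE_SUFFIX)]
        if uuid in deleted_uuids:
            leftovers.append(name)
    return leftovers


# Demo in a throwaway directory standing in for the instances path.
with tempfile.TemporaryDirectory() as tmp:
    uuid = "00bc72d0-0778-4e69-bfee-b58b87dd1532"
    os.mkdir(os.path.join(tmp, uuid + RESIZE_SUFFIX))
    os.mkdir(os.path.join(tmp, "some-other-dir"))
    found = find_leftover_resize_dirs(tmp, {uuid})
    print(found)
```

The sketch only reports candidates; it deliberately deletes nothing.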

Expected result
===============
The instance's data is cleared after the instance has been deleted.

Charlotte Han (hanrong)
Changed in nova:
assignee: nobody → Charlotte Han (hanrong)
Charlotte Han (hanrong) wrote :

The resize failed in finish_resize, which has no rollback process. I think we could fix the problem there.

Changed in nova:
assignee: Charlotte Han (hanrong) → nobody
Charlotte Han (hanrong)
Changed in nova:
assignee: nobody → Charlotte Han (hanrong)
Charlotte Han (hanrong)
Changed in nova:
status: New → In Progress
Charlotte Han (hanrong)
Changed in nova:
importance: Undecided → Medium
Rajesh Tailor (ratailor) wrote :

Hi Charlotte,

There is a periodic task, "_cleanup_incomplete_migrations", which takes care of instance files on the source or destination compute node (whichever is not instance.host).

This periodic task runs at the interval configured by the 'instance_delete_interval' parameter, whose default value is 300 seconds.

For your scenario, you can check the instance files on the destination compute node once this periodic task has run successfully there. IMO this periodic task should delete the instance files; if it does not, then this is a valid bug.

Charlotte Han (hanrong) wrote :

Hi, Rajesh
Thank you for reminding me.

_cleanup_incomplete_migrations deleted the files again on the destination compute node, and migration.status was set to failed.

But the files on the source node still exist.

[stack@SBCJSlot5Rack2Centos7 instances]$ ll
drwxrwxr-x 2 stack libvirtd 4096 May 27 10:27 00bc72d0-0778-4e69-bfee-b58b87dd1532_resize

[stack@SBCJSlot5Rack2Centos7 instances]$ sudo virsh list --all
 Id Name State
----------------------------------------------------
 - instance-00000014 shut off

Charlotte Han (hanrong) wrote :

I think the periodic task "_run_pending_deletes" can clean up the instance's resources on instance.host, which may be either the source or the destination compute node.

So the periodic task "_cleanup_incomplete_migrations" could clean up the instance's resources on whichever compute node, source or destination, is not instance.host.
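
The division of work proposed in this comment can be sketched as follows. This is only a toy model of the comment's reasoning, not nova's implementation: both function names are hypothetical, and the host names come from the report above.

```python
def cleanup_owner(host, instance_host):
    """Name the periodic task that, per the comment, should clean up `host`."""
    if host == instance_host:
        return "_run_pending_deletes"
    return "_cleanup_incomplete_migrations"


def hosts_for_migration_cleanup(source_compute, dest_compute, instance_host):
    """Hosts involved in the migration that are not instance.host."""
    return sorted({source_compute, dest_compute} - {instance_host})


# Values from this report: after the failed resize, instance.host is the
# destination node, so the source node is left to the migration cleanup.
source = "SBCJSlot5Rack2Centos7"
dest = "SBCJSlot3Rack2Centos7"
instance_host = dest

remaining = hosts_for_migration_cleanup(source, dest, instance_host)
print(remaining)
print(cleanup_owner(source, instance_host))
```

Under this model the source node, where the _resize directory was left behind, falls to the migration cleanup task rather than to the pending-deletes task.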

Charlotte Han (hanrong) wrote :

I looked at the comments in the unit tests for the periodic task "_cleanup_incomplete_migrations". The test cases are meant to cover deleting resources on the source or destination node that is not instance.host.

But the test scenario is not correct. I reported that here:
https://bugs.launchpad.net/nova/+bug/1589181

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/325635

Rajesh Tailor (ratailor) wrote :

Hi Charlotte,

IMO the periodic task "_cleanup_incomplete_migrations" takes care of deleting instance files from the host that is not the instance host. Since you issued a delete request for the instance, it is the job of the delete API to delete the instance files from the instance host.

So in that case, IMO the periodic task is doing its job correctly.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Rong Han (<email address hidden>) on branch: master
Review: https://review.openstack.org/325635
Reason: https://review.openstack.org/#/c/326262 is merged

Charlotte Han (hanrong) wrote :
Rajesh Tailor (ratailor)
Changed in nova:
status: In Progress → Fix Committed
Markus Zoeller (markus_z) (mzoeller) wrote :

As we use the "direct-release" model in Nova we don't use the
"Fix Committed" status for merged bug fixes anymore. I'm setting
this manually to "Fix Released" to be consistent.

[1] "[openstack-dev] [release][all] bugs will now close automatically
    when patches merge"; Doug Hellmann; 2015-12-07;
    http://lists.openstack.org/pipermail/openstack-dev/2015-December/081612.html

Changed in nova:
status: Fix Committed → Fix Released