instance on source host can not be cleaned after evacuating

Bug #1441950 reported by Eric Xie
This bug affects 4 people
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

1. Version
nova: 2014.1
hypervisor: rhel7 + libvirt + kvm

2. Description
After an instance was evacuated from hostA to hostB, I deleted the instance.
I then started the 'nova-compute' service on hostA and found this in nova-compute.log:
2015-04-09 10:39:52.201 1977 WARNING nova.compute.manager [-] Found 0 in the database and 1 on the hypervisor.

3. Reproduce steps:
* Launch one instance INST on hostA
* Stop the 'nova-compute' service on hostA and wait until it is reported as down (check with 'nova service-list')
* Evacuate INST to hostB
* After the evacuation succeeds, delete INST
* Start the 'nova-compute' service on hostA

Expected results:
* INST on hostA's hypervisor should be destroyed

Actual result:
* INST is still running on hostA's hypervisor.

4. Tips
I checked the source and found the following in nova/compute/manager.py:
def _destroy_evacuated_instances(self, context):
....
        filters = {'deleted': False} # This filter excludes deleted instances. Would it be more appropriate to check deleted instances as well?
        local_instances = self._get_instances_on_driver(context, filters)
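
To illustrate the suspicion above, here is a minimal, self-contained sketch of how a 'deleted': False filter could hide an instance that was evacuated and then deleted. It is a toy model only, not nova's actual code: InstanceRecord, instances_on_driver and evacuated_candidates are hypothetical names invented for this example, and the real lookup in nova may behave differently.

```python
from dataclasses import dataclass


@dataclass
class InstanceRecord:
    """Hypothetical stand-in for a row in the instances table."""
    uuid: str
    host: str       # host the database believes owns the instance
    deleted: bool   # True once the instance has been deleted


def instances_on_driver(db_rows, hypervisor_uuids, filters):
    """Return DB records for guests present on the local hypervisor,
    keeping only rows whose fields match every entry in filters."""
    return [
        row for row in db_rows
        if row.uuid in hypervisor_uuids
        and all(getattr(row, key) == value for key, value in filters.items())
    ]


def evacuated_candidates(db_rows, hypervisor_uuids, our_host, filters):
    """Guests still on this hypervisor that the database assigns to another host."""
    local = instances_on_driver(db_rows, hypervisor_uuids, filters)
    return [row.uuid for row in local if row.host != our_host]


# INST was evacuated from hostA to hostB and then deleted.
db_rows = [InstanceRecord(uuid="INST", host="hostB", deleted=True)]
on_hosta_hypervisor = {"INST"}  # the guest is still defined on hostA

with_filter = evacuated_candidates(
    db_rows, on_hosta_hypervisor, "hostA", {"deleted": False})
without_filter = evacuated_candidates(
    db_rows, on_hosta_hypervisor, "hostA", {})

print(with_filter)     # []
print(without_filter)  # ['INST']
```

In this model the filtered lookup returns nothing for INST, which matches the "Found 0 in the database and 1 on the hypervisor" warning, while the unfiltered lookup would flag INST for cleanup.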

Zhenyu Zheng (zhengzhenyu) wrote :

Bug reproduced exactly as mentioned above, with:
Version: Nova 2014.2.2
Hypervisor: Ubuntu 14.04 + libvirt + kvm

Changed in nova:
status: New → Confirmed
Changed in nova:
assignee: nobody → Zhenyu Zheng (zhengzhenyu)
John Garbutt (johngarbutt) wrote :

So evacuate should only be used on hosts that are never brought back to life, hence the mark host down API call we now have.

Given that evacuate was significantly reworked a few months ago, do we still see this error on master?

Changed in nova:
status: Confirmed → Incomplete
Adam Spiers (adam.spiers) wrote :

> So evacuate should only be used on hosts that are never brought back to
> life, hence the mark host down API call we now have.

This sounds like a big surprise to me, so I hope you don't mind if I ask for some clarification, given that this bug affects us too:

You are saying that if a machine running nova-compute has a hardware failure, and it is desired to bring that machine back to life after fixing the hardware failure, running "nova evacuate" is not an appropriate first step in the remediation?

Or is it an appropriate first step, but the machine should be reinstalled from scratch before resurrecting it?

Or should it be appropriate, but there is some bug in nova preventing it from working which needs fixing? (In which case, please let me know if there is a way we can help with that.)

Or something else?

Thanks a lot!

Sean Dague (sdague)
Changed in nova:
assignee: Zhenyu Zheng (zhengzhenyu) → nobody
Sean Dague (sdague) wrote :

Automatically discovered version icehouse in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.icehouse
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired