Failed to start nova-compute after evacuate

Bug #1385484 reported by Feilong Wang
This bug affects 1 person
Affects                   Status        Importance  Assigned to   Milestone
OpenStack Compute (nova)  Fix Released  High        Feilong Wang
Icehouse                  Fix Released  Undecided   Unassigned
Juno                      Fix Released  Undecided   Unassigned

Bug Description

After a successful evacuation, restarting the failed host to bring it back causes nova-compute to fail on startup with the error below.

<179>Sep 23 01:48:35 node-1 nova-compute 2014-09-23 01:48:35.346 13206 ERROR nova.openstack.common.threadgroup [-] error removing image
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 117, in wait
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup x.wait()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 49, in wait
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 168, in wait
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py", line 483, in run_service
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup service.start()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/service.py", line 163, in start
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup self.manager.init_host()
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1018, in init_host
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup self._destroy_evacuated_instances(context)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 712, in _destroy_evacuated_instances
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup bdi, destroy_disks)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 962, in destroy
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup destroy_disks, migrate_data)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1080, in cleanup
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup self._cleanup_rbd(instance)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1090, in _cleanup_rbd
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup LibvirtDriver._get_rbd_driver().cleanup_volumes(instance)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/rbd_utils.py", line 238, in cleanup_volumes
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup self.rbd.RBD().remove(client.ioctx, volume)
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/rbd.py", line 300, in remove
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup raise make_ex(ret, 'error removing image')
2014-09-23 01:48:35.346 13206 TRACE nova.openstack.common.threadgroup ImageBusy: error removing image

Feilong Wang (flwang) wrote :
Changed in nova:
assignee: nobody → Fei Long Wang (flwang)
importance: Undecided → High
Feilong Wang (flwang)
summary: - Failed to destroy evacuated disks on init_host
+ Failed to start nova-compute after evacuate
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/130905

Changed in nova:
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: juno-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/131626

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/130905
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=296d92bd44d1b8eb161f94f70cba5db4d17f8f65
Submitter: Jenkins
Branch: master

commit 296d92bd44d1b8eb161f94f70cba5db4d17f8f65
Author: Fei Long Wang <email address hidden>
Date: Sat Oct 25 10:05:57 2014 +1300

    Fix nova-compute start issue after evacuate

    After evacuated successfully, and restarting the failed
    host to get it back, Nova will call init_host() and then
    call method _destroy_evacuated_instances(). In method
    _destroy_evacuated_instances(), nova will check again if
    the storage is shared or not to decide if the storage
    should be destroyed. Now nova is using temp file to check
    if it's shared file system, but it's wrong for RBD case.
    So Nova will attempt to delete the shared block storage,
    which will fail since it's used by the new instance. This
    patch fixes this issue and adds test cases for that.

    Closes-Bug: 1385484

    Change-Id: I71bb818f3c2930b3a2ddf1817dfd4bb61fae7e98

Changed in nova:
status: In Progress → Fix Committed
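
The decision described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the merged patch: the function names `probe_shared_by_temp_file` and `should_destroy_disks`, the `remote_sees_file` callback, and the `backend_is_shared_block` flag are all hypothetical stand-ins for the corresponding nova internals.

```python
import os
import tempfile


def probe_shared_by_temp_file(instance_dir, remote_sees_file):
    """Temp-file probe (hypothetical helper).

    Creates a temporary file in the local instance directory and asks the
    other host, via the remote_sees_file callback, whether it can see a
    file with that name. Only meaningful for file-based shared storage.
    """
    fd, path = tempfile.mkstemp(dir=instance_dir)
    os.close(fd)
    try:
        return bool(remote_sees_file(os.path.basename(path)))
    finally:
        # Always clean up the probe file, whatever the answer was.
        os.unlink(path)


def should_destroy_disks(backend_is_shared_block, temp_file_probe):
    """Decide whether the returning host may delete an evacuated
    instance's disks (hypothetical helper).

    backend_is_shared_block: True when the disks live on shared block
        storage such as RBD, where the temp-file probe is not valid.
    temp_file_probe: zero-argument callable returning True when the
        instance directory is on a shared file system.
    """
    if backend_is_shared_block:
        # The disks are now used by the instance on the new host,
        # so they must never be deleted here.
        return False
    return not temp_file_probe()
```

With an RBD-backed instance, a temp-file probe on the local instance directory would report "not shared", so without the first branch the function would return True and the delete would be attempted. That is consistent with the `ImageBusy` error in the traceback above.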
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/131631

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/juno)

Reviewed: https://review.openstack.org/131626
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ce1d1dd3dda0097d97f0b23b01d18dc1e3a9569a
Submitter: Jenkins
Branch: stable/juno

commit ce1d1dd3dda0097d97f0b23b01d18dc1e3a9569a
Author: Fei Long Wang <email address hidden>
Date: Sat Oct 25 10:05:57 2014 +1300

    Fix nova-compute start issue after evacuate

    After evacuated successfully, and restarting the failed
    host to get it back, Nova will call init_host() and then
    call method _destroy_evacuated_instances(). In method
    _destroy_evacuated_instances(), nova will check again if
    the storage is shared or not to decide if the storage
    should be destroyed. Now nova is using temp file to check
    if it's shared file system, but it's wrong for RBD case.
    So Nova will attempt to delete the shared block storage,
    which will fail since it's used by the new instance. This
    patch fixes this issue and adds test cases for that.

    Closes-Bug: 1385484

    Change-Id: I71bb818f3c2930b3a2ddf1817dfd4bb61fae7e98
    (cherry picked from commit 296d92bd44d1b8eb161f94f70cba5db4d17f8f65)

tags: added: in-stable-juno
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/131631
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fe289fb9fd19001eae1f3c2636301b055c0c8b8d
Submitter: Jenkins
Branch: stable/icehouse

commit fe289fb9fd19001eae1f3c2636301b055c0c8b8d
Author: Fei Long Wang <email address hidden>
Date: Sat Oct 25 10:05:57 2014 +1300

    Fix nova-compute start issue after evacuate

    After evacuated successfully, and restarting the failed
    host to get it back, Nova will call init_host() and then
    call method _destroy_evacuated_instances(). In method
    _destroy_evacuated_instances(), nova will check again if
    the storage is shared or not to decide if the storage
    should be destroyed. Now nova is using temp file to check
    if it's shared file system, but it's wrong for RBD case.
    So Nova will attempt to delete the shared block storage,
    which will fail since it's used by the new instance. This
    patch fixes this issue and adds test cases for that.

    Closes-Bug: 1385484

    Conflicts:
            nova/tests/virt/libvirt/test_libvirt.py

    Change-Id: I71bb818f3c2930b3a2ddf1817dfd4bb61fae7e98
    (cherry picked from commit 296d92bd44d1b8eb161f94f70cba5db4d17f8f65)
    (cherry picked from commit ce1d1dd3dda0097d97f0b23b01d18dc1e3a9569a)

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-1 → 2015.1.0