XenAPI: nova-compute cannot run when manually delete a VM

Bug #1693147 reported by huan
Affects: OpenStack Compute (nova) · Status: Fix Released · Importance: Undecided · Assigned to: Bob Ball

Bug Description

I deployed a DevStack development environment with XenServer 7.0 and then performed the following steps:

1. Create an instance via Horizon or OpenStack CLI
2. Delete the instance manually (using XenCenter or xe command)
3. Stop nova-compute service and start nova-compute service again

After that, the nova-compute service fails to start with the following errors:

2017-05-24 05:54:09.373 DEBUG nova.compute.manager [req-97c683b3-cec8-46e6-bb04-d8f17a0b0be8 None None] [instance: 9d5ee1f9-ad88-48d1-b07e-730154ae8cfd] Checking state from (pid=24627) _get_power_state /opt/stack/nova/nova/compute/manager.py:1169
2017-05-24 05:54:09.380 DEBUG oslo_messaging._drivers.amqpdriver [req-97c683b3-cec8-46e6-bb04-d8f17a0b0be8 None None] CAST unique_id: 499446a84a51459aad4314c2301e7d08 FANOUT topic 'scheduler' from (pid=24627) _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:478
2017-05-24 05:54:09.383 ERROR oslo_service.service [req-97c683b3-cec8-46e6-bb04-d8f17a0b0be8 None None] Error starting thread.
2017-05-24 05:54:09.383 TRACE oslo_service.service Traceback (most recent call last):
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 721, in run_service
2017-05-24 05:54:09.383 TRACE oslo_service.service service.start()
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/service.py", line 143, in start
2017-05-24 05:54:09.383 TRACE oslo_service.service self.manager.init_host()
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1148, in init_host
2017-05-24 05:54:09.383 TRACE oslo_service.service self._init_instance(context, instance)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 945, in _init_instance
2017-05-24 05:54:09.383 TRACE oslo_service.service self.driver.plug_vifs(instance, net_info)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/xenapi/driver.py", line 309, in plug_vifs
2017-05-24 05:54:09.383 TRACE oslo_service.service self._vmops.plug_vifs(instance, network_info)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/xenapi/vmops.py", line 1961, in plug_vifs
2017-05-24 05:54:09.383 TRACE oslo_service.service self.vif_driver.plug(instance, vif, device=device)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/xenapi/vif.py", line 251, in plug
2017-05-24 05:54:09.383 TRACE oslo_service.service vif_ref = self._get_vif_ref(vif, vm_ref)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/xenapi/vif.py", line 43, in _get_vif_ref
2017-05-24 05:54:09.383 TRACE oslo_service.service vif_refs = self._session.call_xenapi("VM.get_VIFs", vm_ref)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/os-xenapi/os_xenapi/client/session.py", line 200, in call_xenapi
2017-05-24 05:54:09.383 TRACE oslo_service.service return session.xenapi_request(method, args)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/opt/stack/os-xenapi/os_xenapi/client/XenAPI.py", line 130, in xenapi_request
2017-05-24 05:54:09.383 TRACE oslo_service.service result = _parse_result(getattr(self, methodname)(*full_params))
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
2017-05-24 05:54:09.383 TRACE oslo_service.service return self._send(self._name, args)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 1596, in __request
2017-05-24 05:54:09.383 TRACE oslo_service.service allow_none=self.__allow_none)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 1094, in dumps
2017-05-24 05:54:09.383 TRACE oslo_service.service data = m.dumps(params)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 638, in dumps
2017-05-24 05:54:09.383 TRACE oslo_service.service dump(v, write)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 660, in __dump
2017-05-24 05:54:09.383 TRACE oslo_service.service f(self, value, write)
2017-05-24 05:54:09.383 TRACE oslo_service.service File "/usr/lib/python2.7/xmlrpclib.py", line 664, in dump_nil
2017-05-24 05:54:09.383 TRACE oslo_service.service raise TypeError, "cannot marshal None unless allow_none is enabled"
2017-05-24 05:54:09.383 TRACE oslo_service.service TypeError: cannot marshal None unless allow_none is enabled
2017-05-24 05:54:09.383 TRACE oslo_service.service
2017-05-24 05:54:09.495 DEBUG oslo_concurrency.lockutils [req-774ec28a-fdb4-4742-bcbe-f97a4769652a None None] Acquired semaphore "singleton_lock" from (pid=24627) lock /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:212
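
The final frames show the root cause: nova passes a `None` vm_ref into the `VM.get_VIFs` XML-RPC call, and Python's XML-RPC marshaller rejects `None` unless `allow_none` is enabled. The `TypeError` can be reproduced in isolation (shown here with Python 3's `xmlrpc.client`; the trace above uses the equivalent Python 2 `xmlrpclib`):

```python
import xmlrpc.client

# Marshalling None without allow_none raises the same TypeError
# seen in the nova-compute traceback above.
try:
    xmlrpc.client.dumps((None,), methodname="VM.get_VIFs")
except TypeError as exc:
    print(exc)  # cannot marshal None unless allow_none is enabled
```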

Note:
When I deleted the VM manually, OpenStack did not notice, which is expected. However, this should not block the nova-compute service: nova-compute should still be able to start and provide service even if some instances cannot be initialized during its startup process.
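
A minimal sketch of the behaviour the note asks for, assuming a per-instance init loop (all names here are illustrative, not nova's actual code): a failure to initialize one orphaned instance is trapped and logged rather than allowed to abort service startup.

```python
class InstanceNotFound(Exception):
    """Illustrative stand-in for nova.exception.InstanceNotFound."""

def init_host(instances, plug_vifs, log=print):
    """Initialize each instance; skip ones the hypervisor no longer has."""
    started = []
    for inst in instances:
        try:
            plug_vifs(inst)
        except InstanceNotFound:
            # The VM exists in the nova DB but not on the hypervisor;
            # log and continue so the service can still come up.
            log("skipping missing instance: %s" % inst)
            continue
        started.append(inst)
    return started
```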

huan (huan-xie)
Changed in nova:
assignee: nobody → huan (huan-xie)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/467926

Changed in nova:
status: New → In Progress
huan (huan-xie)
summary: - nXenAPI: ova-compute cannot run when manually delete a VM
+ XenAPI: nova-compute cannot run when manually delete a VM
Changed in nova:
assignee: huan (huan-xie) → Bob Ball (bob-ball)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/467926
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=92d8103a196c25157295789100117946dcf67304
Submitter: Jenkins
Branch: master

commit 92d8103a196c25157295789100117946dcf67304
Author: Huan Xie <email address hidden>
Date: Thu May 25 01:21:31 2017 -0700

    XenAPI: nova-compute cannot restart after manually delete VM

    When a VM is deleted accidentally (e.g. hardware problem or manually
    through the hypervisor rather than Nova), there is a mis-match of
    information where the VM is still in nova DB but not the hypervisor.
    If we start nova-compute service in this setup, it will fail due to an
    untrapped exception when plugging VIFs.

    Return an expected exception when Nova cannot find VM via xapi.

    Closes-bug: #1693147

    Change-Id: I937f5c202c9a4892e8aa56f74fad125791809f8c
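
Per the commit message, the fix converts the unhandled `None` vm_ref into an expected, typed exception before it can reach the XML-RPC layer. A hedged sketch of that pattern (helper and class names hypothetical, not the actual patched code):

```python
class InstanceNotFound(Exception):
    """Illustrative stand-in for nova.exception.InstanceNotFound."""

def get_vif_refs(session, vm_ref, instance_name):
    # A manually deleted VM yields no vm_ref; raise a typed exception
    # instead of letting None reach the XML-RPC marshaller, where it
    # would surface as an opaque TypeError.
    if vm_ref is None:
        raise InstanceNotFound(instance_name)
    return session.call_xenapi("VM.get_VIFs", vm_ref)
```

The caller (such as the init-host loop) can then catch `InstanceNotFound` explicitly, which is what allows nova-compute to start despite orphaned DB records.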

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b2

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.
