Uncaught 'libvirtError: Domain not found' errors during destroy

Bug #1368404 reported by Hans Lindgren
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Hans Lindgren

Bug Description

Some uncaught libvirt errors may result in instances being set to ERROR state and are causing sporadic gate failures. This can happen in any code path that uses _destroy(). Here is a recent example from a failed resize:

[req-06dd4908-382e-455e-854e-e4d42a4bf62b TestServerAdvancedOps-724416891 TestServerAdvancedOps-711228572] [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] Setting instance vm_state to ERROR
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] Traceback (most recent call last):
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/compute/manager.py", line 5902, in _error_out_instance_on_exception
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] yield
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/compute/manager.py", line 3658, in resize_instance
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] timeout, retry_interval)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5468, in migrate_disk_and_power_off
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] self.power_off(instance, timeout, retry_interval)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2400, in power_off
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] self._destroy(instance)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 998, in _destroy
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] timer.start(interval=0.5).wait()
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] return hubs.get_hub().switch()
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 293, in switch
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] return self.greenlet.switch()
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/openstack/common/loopingcall.py", line 81, in _inner
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] self.f(*self.args, **self.kw)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 971, in _wait_for_destroy
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] dom_info = self.get_info(instance)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3922, in get_info
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] dom_info = virt_dom.info()
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] result = proxy_call(self._autowrap, f, *args, **kwargs)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] rv = execute(f, *args, **kwargs)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] six.reraise(c, e, tb)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] rv = meth(*args, **kwargs)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1068, in info
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] if ret is None: raise libvirtError ('virDomainGetInfo() failed', dom=self)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d] libvirtError: Domain not found: no domain with matching uuid '525f4f95-f631-4fbb-a884-20c37711fb0d' (instance-00000097)
2014-09-05 01:08:37.123 26984 TRACE nova.compute.manager [instance: 525f4f95-f631-4fbb-a884-20c37711fb0d]

Tags: libvirt
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/120912

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/120912
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c860280a92c2d0e32897d71aa4c6f083ee13345c
Submitter: Jenkins
Branch: master

commit c860280a92c2d0e32897d71aa4c6f083ee13345c
Author: Hans Lindgren <email address hidden>
Date: Thu Sep 11 17:32:19 2014 +0200

    Make sure libvirt VIR_ERR_NO_DOMAIN errors are handled correctly

    Some libvirt VIR_ERR_NO_DOMAIN errors leak outside of the libvirt
    driver, which may result in instances being set to ERROR state.

    Make sure such errors are either ignored (during destroy) or
    otherwise turned into InstanceNotFound errors.

    Change-Id: I0ac2989062247a05162825760367488f4f90bd04
    Closes-Bug: #1368404
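
The handling described in the commit message can be sketched roughly as follows. This is not the merged patch, only a minimal illustration in Python 3 of the two behaviours it names: ignoring a missing domain during destroy, and turning it into InstanceNotFound elsewhere. FakeLibvirtError, GoneDomain, and the free functions get_info() and destroy() are hypothetical stand-ins so the example runs without libvirt installed. In the real bindings the exception is libvirt.libvirtError, which exposes get_error_code(), and the constant is libvirt.VIR_ERR_NO_DOMAIN.

```python
# Stand-ins so the sketch runs without libvirt installed. In the real
# bindings these are libvirt.libvirtError and libvirt.VIR_ERR_NO_DOMAIN.
VIR_ERR_NO_DOMAIN = 42


class FakeLibvirtError(Exception):
    """Hypothetical stand-in mimicking libvirtError.get_error_code()."""

    def __init__(self, message, code):
        super().__init__(message)
        self._code = code

    def get_error_code(self):
        return self._code


class InstanceNotFound(Exception):
    """Hypothetical stand-in for nova.exception.InstanceNotFound."""


class GoneDomain:
    """Hypothetical domain handle whose guest has already disappeared."""

    def __init__(self, code=VIR_ERR_NO_DOMAIN):
        self._code = code

    def info(self):
        raise FakeLibvirtError("Domain not found", self._code)

    def destroy(self):
        raise FakeLibvirtError("Domain not found", self._code)


def get_info(dom, instance_id):
    """Translate a missing domain into InstanceNotFound; re-raise the rest."""
    try:
        return dom.info()
    except FakeLibvirtError as exc:
        if exc.get_error_code() == VIR_ERR_NO_DOMAIN:
            raise InstanceNotFound(instance_id) from exc
        raise


def destroy(dom):
    """Destroy a domain, treating 'no such domain' as already destroyed.

    Returns True if this call destroyed it, False if it was already gone.
    """
    try:
        dom.destroy()
        return True
    except FakeLibvirtError as exc:
        if exc.get_error_code() == VIR_ERR_NO_DOMAIN:
            return False
        raise
```

With this shape, a polling loop such as _wait_for_destroy in the traceback above could catch InstanceNotFound and treat it as "destroy finished" instead of letting a raw libvirtError reach the compute manager. Errors with any other code still propagate unchanged.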

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2