live-migrate left in migrating as domain not found

Bug #1662626 reported by John Garbutt
This bug affects 7 people
Affects                    Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)   Fix Released   Medium      John Garbutt
  Newton                   Fix Committed  Medium      Shane Peters
  Ocata                    Fix Committed  Medium      Matt Riedemann
Ubuntu Cloud Archive       Fix Released   Medium      Shane Peters
  Mitaka                   Triaged        Medium      Unassigned
  Newton                   Fix Released   Medium      Unassigned
  Ocata                    Fix Released   Medium      Unassigned
  Pike                     Fix Released   Medium      Shane Peters
nova (Ubuntu)              Fix Released   Medium      Unassigned
  Xenial                   Triaged        Medium      Unassigned
  Zesty                    Fix Released   Medium      Unassigned
  Artful                   Fix Released   Medium      Unassigned

Bug Description

A live-migration stress test was working fine when suddenly a VM stopped migrating. It failed with this error:

ERROR nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Error from libvirt during undefine. Code=42 Error=Domain not found: no domain with matching uuid '62034d78-3144-4efd-9c2c-8a792aed3d6b' (instance-00000431)

The full stack trace:

2017-02-05 02:33:41.787 19770 INFO nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Migration running for 240 secs, memory 9% remaining; (bytes processed=15198240264, remaining=1680875520, total=17314955264)
2017-02-05 02:33:45.795 19770 INFO nova.compute.manager [req-abff9c69-5f82-4ed6-af8a-fd1dc81a72a6 - - - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] VM Paused (Lifecycle Event)
2017-02-05 02:33:45.870 19770 INFO nova.compute.manager [req-abff9c69-5f82-4ed6-af8a-fd1dc81a72a6 - - - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] During sync_power_state the instance has a pending task (migrating). Skip.
2017-02-05 02:33:45.883 19770 INFO nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Migration operation has completed
2017-02-05 02:33:45.884 19770 INFO nova.compute.manager [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] _post_live_migration() is started..
2017-02-05 02:33:46.156 19770 INFO os_vif [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] Successfully unplugged vif VIFBridge(active=True,address=fa:16:3e:a2:90:55,bridge_name='brq476ab6ba-b3',has_traffic_filtering=True,id=98d476b3-0ead-4adb-ad54-1dff63edcd65,network=Network(476ab6ba-b32e-409e-9711-9412e8475ea0),plugin='linux_bridge',port_profile=<?>,preserve_on_delete=True,vif_name='tap98d476b3-0e')
2017-02-05 02:33:46.189 19770 INFO nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Deleting instance files /var/lib/nova/instances/62034d78-3144-4efd-9c2c-8a792aed3d6b_del
2017-02-05 02:33:46.195 19770 INFO nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Deletion of /var/lib/nova/instances/62034d78-3144-4efd-9c2c-8a792aed3d6b_del complete

2017-02-05 02:33:46.334 19770 ERROR nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Error from libvirt during undefine. Code=42 Error=Domain not found: no domain with matching uuid '62034d78-3144-4efd-9c2c-8a792aed3d6b' (instance-00000431)

2017-02-05 02:33:46.363 19770 WARNING nova.virt.libvirt.driver [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Error monitoring migration: Domain not found: no domain with matching uuid '62034d78-3144-4efd-9c2c-8a792aed3d6b' (instance-00000431)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Traceback (most recent call last):
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6345, in _live_migration
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] finish_event, disk_paths)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6255, in _live_migration_monitor
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] migrate_data)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] function_name, call_dict, binary)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] return f(self, context, *args, **kw)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] kwargs['instance'], e, sys.exc_info())
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] return function(self, context, *args, **kwargs)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 5470, in _post_live_migration
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] destroy_vifs=destroy_vifs)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 976, in cleanup
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self._undefine_domain(instance)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 874, in _undefine_domain
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] {'errcode': errcode, 'e': e}, instance=instance)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 868, in _undefine_domain
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] guest.delete_configuration()
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 266, in delete_configuration
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self._domain.undefine()
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] result = proxy_call(self._autowrap, f, *args, **kwargs)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] rv = execute(f, *args, **kwargs)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(c, e, tb)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] rv = meth(*args, **kwargs)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/libvirt.py", line 2701, in undefine
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] if ret == -1: raise libvirtError ('virDomainUndefine() failed', dom=self)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] libvirtError: Domain not found: no domain with matching uuid '62034d78-3144-4efd-9c2c-8a792aed3d6b' (instance-00000431)
2017-02-05 02:33:46.363 19770 ERROR nova.virt.libvirt.driver [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b]
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [req-df91ac40-820f-4aa9-945b-b2fce73461f8 29c0371e35f84fdaa033f2dbfe2c042c 669472610b194bfa9bf03f50f86d725a - - -] [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Live migration failed.
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] Traceback (most recent call last):
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 5261, in _do_live_migration
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] block_migration, migrate_data)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5775, in live_migration
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] migrate_data)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6345, in _live_migration
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] finish_event, disk_paths)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6255, in _live_migration_monitor
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] migrate_data)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] function_name, call_dict, binary)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] return f(self, context, *args, **kw)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] kwargs['instance'], e, sys.exc_info())
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] return function(self, context, *args, **kwargs)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/compute/manager.py", line 5470, in _post_live_migration
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] destroy_vifs=destroy_vifs)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 976, in cleanup
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self._undefine_domain(instance)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 874, in _undefine_domain
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] {'errcode': errcode, 'e': e}, instance=instance)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self.force_reraise()
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(self.type_, self.value, self.tb)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 868, in _undefine_domain
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] guest.delete_configuration()
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 266, in delete_configuration
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] self._domain.undefine()
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] result = proxy_call(self._autowrap, f, *args, **kwargs)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] rv = execute(f, *args, **kwargs)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] six.reraise(c, e, tb)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] rv = meth(*args, **kwargs)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] File "/openstack/venvs/nova-14.0.4/lib/python2.7/site-packages/libvirt.py", line 2701, in undefine
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] if ret == -1: raise libvirtError ('virDomainUndefine() failed', dom=self)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b] libvirtError: Domain not found: no domain with matching uuid '62034d78-3144-4efd-9c2c-8a792aed3d6b' (instance-00000431)
2017-02-05 02:33:46.364 19770 ERROR nova.compute.manager [instance: 62034d78-3144-4efd-9c2c-8a792aed3d6b]

Revision history for this message
John Garbutt (johngarbutt) wrote :

Still digging into the details of this bug. Checking whether we keep seeing the same failure several times.

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/430400

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/430404

Revision history for this message
John Garbutt (johngarbutt) wrote :

There are really two problems here. First, a race while undefining the domain leads to occasional live-migration failures when you run lots and lots of live-migrations.

On top of that, when errors happen in that part of the code, we don't set the instance to the error state, so the instance just stays stuck in the migrating state.
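
As a rough illustration of the first fix (a minimal sketch, not the actual nova patch; the function name and the bare libvirt domain handle are simplified stand-ins for nova's driver internals), the undefine path can treat libvirt's "domain not found" error (code 42, VIR_ERR_NO_DOMAIN) as already done rather than letting it fail the migration:

    import logging

    import libvirt

    LOG = logging.getLogger(__name__)

    def undefine_domain_config(domain):
        """Remove the persistent domain definition, tolerating a missing domain.

        If something else has already cleaned the domain up (the race seen in
        this bug), libvirt raises error code 42 (VIR_ERR_NO_DOMAIN); treat
        that as success instead of failing an otherwise-completed migration.
        """
        try:
            domain.undefine()
        except libvirt.libvirtError as e:
            if e.get_error_code() == libvirt.VIR_ERR_NO_DOMAIN:
                LOG.debug("Domain already undefined or gone; ignoring.")
                return
            raise

The real change lives in _undefine_domain in nova/virt/libvirt/driver.py (the frame visible in the traceback above) and, per the commit message below, mirrors the error handling already used when deleting the domain.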

Changed in nova:
importance: Undecided → Medium
tags: added: ocata-rc-potential
Revision history for this message
Matt Riedemann (mriedem) wrote :

Is this really an ocata release candidate potential bug? It sounds pretty latent and something we can backport to stable/ocata after 15.0.0 is released. Or was this the result of a regression in Ocata itself?

Matt Riedemann (mriedem)
tags: added: ocata-backport-potential
removed: ocata-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/430400
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b706155888d740841d745c52aae543cf82fab0bc
Submitter: Jenkins
Branch: master

commit b706155888d740841d745c52aae543cf82fab0bc
Author: John Garbutt <email address hidden>
Date: Tue Feb 7 18:55:26 2017 +0000

    Stop _undefine_domain erroring if domain not found

    During live-migration stress testing we are seeing the following log:
    Error from libvirt during undefine. Code=42 Error=Domain not found

    There appears to be a race while trying to undefine the domain, and
    something else is possibly also doing some kind of clean up. While this
    does paper over that race, it stops otherwise completed live-migrations
    from failing. It also matches similar error handling done for when
    deleting the domain.

    The next part of the bug fix is to ensure if we have any similar
    unexpected errors during this later phase of the live-migration we don't
    leave the instance stuck in the migrating state, it should move to an
    ERROR state. This is covered in a follow on patch.

    Partial-Bug: #1662626

    Change-Id: I23ed9819061bfa436b12180110666c5b8c3e0f70

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/430404
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b56f8fc2d1392f4675a5baae0977e4817a362159
Submitter: Jenkins
Branch: master

commit b56f8fc2d1392f4675a5baae0977e4817a362159
Author: John Garbutt <email address hidden>
Date: Tue Feb 7 19:12:50 2017 +0000

    Stop failed live-migrates getting stuck migrating

    When there are failures in driver.cleanup, we are seeing live-migrations
    that get stuck in the live-migrating state. While there has been a patch
    to stop the cause listed in the bug this closes, there are other
    failures (such as a token timeout when talking to cinder or neutron)
    that could trigger this same failure mode.

    When we hit an error this late in live-migration, it should be a very
    rare event, so its best to just put the instance and migration into an
    error state, and help alert both the operator and API user to the
    failure that has occurred.

    Closes-Bug: #1662626

    Change-Id: Idfdce9e7dd8106af01db0358ada15737cb846395
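
As a rough sketch of that approach (hypothetical helper and argument names, not the actual nova diff), the late cleanup can be wrapped so any failure flips both the instance and the migration record into an error state before re-raising, instead of leaving the task state stuck at "migrating":

    import logging

    LOG = logging.getLogger(__name__)

    def run_post_live_migration_cleanup(instance, migration, cleanup_fn):
        """Run late live-migration cleanup without leaving the VM stuck.

        Any unexpected failure here (the undefine race above, an expired
        token when talking to cinder or neutron, ...) marks both the
        instance and the migration record as errored so the operator and
        API user can see that manual intervention is needed.
        """
        try:
            cleanup_fn(instance)
        except Exception:
            LOG.exception("Post-live-migration cleanup failed for %s",
                          instance.uuid)
            instance.vm_state = 'error'
            instance.task_state = None
            instance.save()
            migration.status = 'error'
            migration.save()
            raise

This way the API user stops polling a migration that will never finish, and the instance shows up in error listings instead of silently sitting in the migrating state.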

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b1

This issue was fixed in the openstack/nova 16.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/457023

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/457023
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=012fa9353ff18d3d52120fd42a2ec25dbc5a02b7
Submitter: Jenkins
Branch: stable/ocata

commit 012fa9353ff18d3d52120fd42a2ec25dbc5a02b7
Author: John Garbutt <email address hidden>
Date: Tue Feb 7 19:12:50 2017 +0000

    Stop failed live-migrates getting stuck migrating

    When there are failures in driver.cleanup, we are seeing live-migrations
    that get stuck in the live-migrating state. While there has been a patch
    to stop the cause listed in the bug this closes, there are other
    failures (such as a token timeout when talking to cinder or neutron)
    that could trigger this same failure mode.

    When we hit an error this late in live-migration, it should be a very
    rare event, so its best to just put the instance and migration into an
    error state, and help alert both the operator and API user to the
    failure that has occurred.

    Closes-Bug: #1662626

    Change-Id: Idfdce9e7dd8106af01db0358ada15737cb846395
    (cherry picked from commit b56f8fc2d1392f4675a5baae0977e4817a362159)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.4

This issue was fixed in the openstack/nova 15.0.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/470387

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/476278

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/mitaka)

Change abandoned by Joshua Hesketh (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/476278
Reason: This branch (stable/mitaka) is at End Of Life

Matt Riedemann (mriedem)
tags: added: libvirt
removed: ocata-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/470387
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=017e853b950ddc10dab4ebab37e64cece40f274f
Submitter: Jenkins
Branch: stable/newton

commit 017e853b950ddc10dab4ebab37e64cece40f274f
Author: John Garbutt <email address hidden>
Date: Tue Feb 7 19:12:50 2017 +0000

    Stop failed live-migrates getting stuck migrating

    When there are failures in driver.cleanup, we are seeing live-migrations
    that get stuck in the live-migrating state. While there has been a patch
    to stop the cause listed in the bug this closes, there are other
    failures (such as a token timeout when talking to cinder or neutron)
    that could trigger this same failure mode.

    When we hit an error this late in live-migration, it should be a very
    rare event, so its best to just put the instance and migration into an
    error state, and help alert both the operator and API user to the
    failure that has occurred.

    For backport into Newton, 'migrate_instance_start' had to be patched
    in the unit test (nova/tests/unit/compute/test_compute.py).

    Closes-Bug: #1662626

    Change-Id: Idfdce9e7dd8106af01db0358ada15737cb846395
    (cherry picked from commit b56f8fc2d1392f4675a5baae0977e4817a362159)
    (cherry picked from commit 012fa9353ff18d3d52120fd42a2ec25dbc5a02b7)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.8

This issue was fixed in the openstack/nova 14.0.8 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/508640

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/508641

Changed in cloud-archive:
importance: Undecided → Medium
assignee: nobody → Shane Peters (shaner)
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/ocata)

Change abandoned by Liping Mao (<email address hidden>) on branch: stable/ocata
Review: https://review.openstack.org/508640

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/newton)

Change abandoned by Liping Mao (<email address hidden>) on branch: stable/newton
Review: https://review.openstack.org/508641

Changed in cloud-archive:
status: Confirmed → Fix Released
Changed in nova (Ubuntu):
status: New → Fix Released
importance: Undecided → Medium
Changed in nova (Ubuntu Zesty):
status: New → Fix Released
Changed in nova (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → Medium
Changed in nova (Ubuntu Zesty):
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/508640
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a0525f650af1fa763eb7e58185644f9e7fe1a3dd
Submitter: Zuul
Branch: stable/ocata

commit a0525f650af1fa763eb7e58185644f9e7fe1a3dd
Author: John Garbutt <email address hidden>
Date: Tue Feb 7 18:55:26 2017 +0000

    Stop _undefine_domain erroring if domain not found

    During live-migration stress testing we are seeing the following log:
    Error from libvirt during undefine. Code=42 Error=Domain not found

    There appears to be a race while trying to undefine the domain, and
    something else is possibly also doing some kind of clean up. While this
    does paper over that race, it stops otherwise completed live-migrations
    from failing. It also matches similar error handling done for when
    deleting the domain.

    The next part of the bug fix is to ensure if we have any similar
    unexpected errors during this later phase of the live-migration we don't
    leave the instance stuck in the migrating state, it should move to an
    ERROR state. This is covered in a follow on patch.

    Partial-Bug: #1662626

    Change-Id: I23ed9819061bfa436b12180110666c5b8c3e0f70
    (cherry picked from commit b706155888d740841d745c52aae543cf82fab0bc)

tags: added: in-stable-ocata