Traceback from "detaching network adapter failed.: libvirtError: internal error: End of file from qemu monitor" when network-vif-delete races with server delete

Bug #1797966 reported by Matt Riedemann on 2018-10-15
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Medium
Matt Riedemann
Pike
Medium
Unassigned
Queens
Medium
Unassigned
Rocky
Medium
Unassigned

Bug Description

Seen here:

http://logs.openstack.org/98/610098/2/check/tempest-full/e51c5d8/testr_results.html.gz

ft1.1: setUpClass (tempest.api.compute.servers.test_create_server.ServersTestJSON)_StringException: Traceback (most recent call last):
  File "tempest/test.py", line 172, in setUpClass
    six.reraise(etype, value, trace)
  File "tempest/test.py", line 165, in setUpClass
    cls.resource_setup()
  File "tempest/api/compute/servers/test_create_server.py", line 64, in resource_setup
    volume_backed=cls.volume_backed)
  File "tempest/api/compute/base.py", line 246, in create_test_server
    **kwargs)
  File "tempest/common/compute.py", line 258, in create_test_server
    server['id'])
  File "/opt/stack/tempest/.tox/tempest/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/opt/stack/tempest/.tox/tempest/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "tempest/common/compute.py", line 229, in create_test_server
    clients.servers_client, server['id'], wait_until)
  File "tempest/common/waiters.py", line 96, in wait_for_server_status
    raise lib_exc.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: (ServersTestJSON:setUpClass) Server 16679c49-ba5b-4c39-a77b-1e3a45c628b5 failed to reach ACTIVE status and task state "None" within the required time (196 s). Current status: BUILD. Current task state: spawning.

The real failure is in the compute logs:

http://logs.openstack.org/98/610098/2/check/tempest-full/e51c5d8/controller/logs/screen-n-cpu.txt.gz#_Oct_15_17_15_17_661456

Oct 15 17:15:17.661456 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [None req-7cd1f34e-77cd-4e7c-a38f-927d183bcb2f service nova] [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] detaching network adapter failed.: libvirtError: internal error: End of file from qemu monitor
Oct 15 17:15:17.661716 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] Traceback (most recent call last):
Oct 15 17:15:17.661934 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1723, in detach_interface
Oct 15 17:15:17.662180 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] alternative_device_name=self.vif_driver.get_vif_devname(vif))
Oct 15 17:15:17.662401 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 430, in detach_device_with_retry
Oct 15 17:15:17.662623 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] _try_detach_device(conf, persistent, live)
Oct 15 17:15:17.662848 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 419, in _try_detach_device
Oct 15 17:15:17.663110 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] device=alternative_device_name)
Oct 15 17:15:17.663313 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Oct 15 17:15:17.663522 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] self.force_reraise()
Oct 15 17:15:17.663742 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Oct 15 17:15:17.663974 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] six.reraise(self.type_, self.value, self.tb)
Oct 15 17:15:17.664193 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 397, in _try_detach_device
Oct 15 17:15:17.664404 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] self.detach_device(conf, persistent=persistent, live=live)
Oct 15 17:15:17.664604 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 480, in detach_device
Oct 15 17:15:17.664828 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] self._domain.detachDeviceFlags(device_xml, flags=flags)
Oct 15 17:15:17.665050 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 190, in doit
Oct 15 17:15:17.665271 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] result = proxy_call(self._autowrap, f, *args, **kwargs)
Oct 15 17:15:17.665528 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 148, in proxy_call
Oct 15 17:15:17.665735 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] rv = execute(f, *args, **kwargs)
Oct 15 17:15:17.665950 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 129, in execute
Oct 15 17:15:17.666151 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] six.reraise(c, e, tb)
Oct 15 17:15:17.666288 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
Oct 15 17:15:17.666436 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] rv = meth(*args, **kwargs)
Oct 15 17:15:17.666647 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1223, in detachDeviceFlags
Oct 15 17:15:17.666773 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
Oct 15 17:15:17.666903 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] libvirtError: internal error: End of file from qemu monitor
Oct 15 17:15:17.667071 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: ERROR nova.virt.libvirt.driver [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5]
Oct 15 17:15:17.667193 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: WARNING nova.compute.manager [None req-7cd1f34e-77cd-4e7c-a38f-927d183bcb2f service nova] [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] Detach interface failed, port_id=2bee4fe2-2c28-4932-bc54-d4c6935ae9e7, reason: Failed to detach network adapter device from 16679c49-ba5b-4c39-a77b-1e3a45c628b5: InterfaceDetachFailed: Failed to detach network adapter device from 16679c49-ba5b-4c39-a77b-1e3a45c628b5
Oct 15 17:15:17.667322 ubuntu-xenial-ovh-bhs1-0003042445 nova-compute[14841]: INFO nova.virt.libvirt.driver [-] [instance: 16679c49-ba5b-4c39-a77b-1e3a45c628b5] Instance destroyed successfully.

What happens is the instance delete is racing with a network-vif-deleted event and when we're trying to detach the network interface the guest is gone. This is similar to bug 1536671 but must show up in a different way with newer libvirt (this is with libvirt 4.0.0 and qemu 2.11).

Looking at this change:

https://review.openstack.org/#/c/270891/4/nova/virt/libvirt/driver.py

The one issue there is we're using a stale guest object. We should try to get a new instance of the guest object from the host and if the guest is gone we should get an InstanceNotFound being raised, which is the appropriate thing in this case and would be handled and logged in the compute manager.

Fix proposed to branch: master
Review: https://review.openstack.org/610727

Changed in nova:
status: Triaged → In Progress
Matt Riedemann (mriedem) wrote :

I've corrected the title of the bug since it doesn't cause a failure in tempest as far as I can tell, it's just a race which generates a confusing traceback in the logs which makes it hard to debug real failures.

summary: - Tempest fails with "detaching network adapter failed.: libvirtError:
- internal error: End of file from qemu monitor"
+ Traceback from "detaching network adapter failed.: libvirtError:
+ internal error: End of file from qemu monitor" when network-vif-delete
+ races with server delete
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers