libvirt detach_interface logs errors for network device not found after neutron network-vif-deleted event

Bug #1536671 reported by Matt Riedemann
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Matt Riedemann
Milestone:

Bug Description

I've started noticing a lot of these in the neutron job logs:

http://logs.openstack.org/67/269867/4/check/gate-tempest-dsvm-neutron-src-os-brick/7617a9f/logs/screen-n-cpu.txt.gz#_2016-01-21_05_38_48_667

2016-01-21 05:38:48.667 ERROR nova.virt.libvirt.driver [req-c8971f87-303e-460b-894b-7e2ccde9944f nova service] [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] detaching network adapter failed.
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] Traceback (most recent call last):
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1354, in detach_interface
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] guest.detach_device(cfg, persistent=True, live=live)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 341, in detach_device
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] self._domain.detachDeviceFlags(conf.to_xml(), flags=flags)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] rv = execute(f, *args, **kwargs)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] six.reraise(c, e, tb)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] rv = meth(*args, **kwargs)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 985, in detachDeviceFlags
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] if ret == -1: raise libvirtError ('virDomainDetachDeviceFlags() failed', dom=self)
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d] libvirtError: operation failed: no matching network device was found
2016-01-21 05:38:48.667 12834 ERROR nova.virt.libvirt.driver [instance: d8d15c87-79cc-4b63-99bb-64dde4576b3d]

Following the request ID, it's coming from a neutron vif deleted event:

http://logs.openstack.org/67/269867/4/check/gate-tempest-dsvm-neutron-src-os-brick/7617a9f/logs/screen-n-cpu.txt.gz#_2016-01-21_05_38_48_361

It looks like we should handle the "network device not found" case rather than log it as an error, but libvirt doesn't provide a specific error code for that case:

http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_driver.c;h=e04a32841807851fd1386ab8a7fd91672f114dce;hb=e8684eb541f01df9b45e87e0a8ce446c7bc90a17#l6764

And the error message is translatable so we can't just parse the message either.

It seems there should be a way to verify that the network device exists before we try to detach it from the config, so we can avoid this error. The error doesn't actually result in job failures; it's probably happening asynchronously during an instance delete, where we're deleting the ports in neutron.

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Caught%20error%3A%20%3Ctype%20'exceptions.AttributeError'%3E%20'module'%20object%20has%20no%20attribute%20'ServiceList'%5C%22%20AND%20tags%3A%5C%22screen-c-api.txt%5C%22&from=7d

Matt Riedemann (mriedem)
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Daniel Berrange (berrange) wrote :

> It seems there should be a way to verify that the network device exists

You can query the guest XML to see if it exists, as that should reflect the current state.
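
As an illustration of that suggestion (not code from this thread or from nova), here is a minimal sketch that looks up an interface by MAC address in a guest's domain XML. The sample XML, the function name find_interface_by_mac, and the MAC values are assumptions made up for the example; in practice the XML would typically come from libvirt's XMLDesc() call on the domain.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down domain XML used only for illustration.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-00000001</name>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:aa:bb:cc'/>
      <source bridge='br-int'/>
      <target dev='tap1234'/>
    </interface>
  </devices>
</domain>
"""


def find_interface_by_mac(domain_xml, mac):
    """Return the <interface> element whose MAC matches, or None.

    The comparison is case-insensitive because MAC addresses may be
    written in either case.
    """
    root = ET.fromstring(domain_xml)
    wanted = mac.lower()
    for iface in root.findall("./devices/interface"):
        mac_elem = iface.find("mac")
        if mac_elem is not None and mac_elem.get("address", "").lower() == wanted:
            return iface
    return None


if __name__ == "__main__":
    found = find_interface_by_mac(SAMPLE_DOMAIN_XML, "fa:16:3e:aa:bb:cc")
    print("present" if found is not None else "absent")
```

Checking the XML this way sidesteps both the missing error code and the translatable error message, since it relies only on the guest's reported state.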

Matt Riedemann (mriedem) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/270891

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/270981

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/270981
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6e6a8816083c0172988a26e70da027e6f9501db6
Submitter: Jenkins
Branch: master

commit 6e6a8816083c0172988a26e70da027e6f9501db6
Author: Matt Riedemann <email address hidden>
Date: Thu Jan 21 10:53:06 2016 -0800

    libvirt: implement LibvirtConfigGuestInterface.parse_dom

    This is needed by a follow on change that needs to find an
    interface in the config by a given MAC address.

    Change-Id: I6b9ab20570e1db832cc2eec94cbf69dddf4ef3d6
    Partial-Bug: #1536671

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/270891
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b3624e00d0097dd7ccdf27e34d7351c0c97afea1
Submitter: Jenkins
Branch: master

commit b3624e00d0097dd7ccdf27e34d7351c0c97afea1
Author: Matt Riedemann <email address hidden>
Date: Thu Jan 21 08:15:27 2016 -0800

    libvirt: check for interface when detach_interface fails

    When using Neutron and deleting an instance, we race against
    deleting the domain and Neutron sending a vif-deleted event
    which triggers a call to detach_interface. If the network
    device is not found when we go to detach it from the config,
    libvirt raises an error like:

    libvirtError: operation failed: no matching network device was found

    Unfortunately libvirt does not have a unique error code for this
    and the error message is translatable, so we can't key off of it
    to check if the failure is just due to the device not being found.

    This change adds a method to the guest object to lookup the interface
    device config by MAC address and if not found, we simply log a warning
    rather than tracing an error for a case that we can expect when using
    Neutron.

    Closes-Bug: #1536671

    Change-Id: I8ae352ff3eeb760c97d1a6fa9d7a59e881d7aea1
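
The behaviour described in the commit message above could be sketched roughly as follows. This is not the merged nova code: FakeGuest, DetachError (standing in for libvirt's error type), and the method name get_interface_by_mac are hypothetical names chosen for the example, and the guest state is simulated in memory.

```python
import logging

LOG = logging.getLogger(__name__)


class DetachError(Exception):
    """Stand-in for libvirt's error type in this sketch."""


class FakeGuest:
    """Hypothetical guest wrapper that tracks attached MACs in memory."""

    def __init__(self, macs):
        self._macs = set(macs)

    def get_interface_by_mac(self, mac):
        # Assumed lookup helper: returns the MAC if attached, else None.
        return mac if mac in self._macs else None

    def detach_device(self, mac):
        if mac not in self._macs:
            raise DetachError(
                "operation failed: no matching network device was found")
        self._macs.remove(mac)


def detach_interface(guest, mac):
    """Detach an interface; return False if it was already gone."""
    try:
        guest.detach_device(mac)
        return True
    except DetachError:
        # The error has no distinguishing code and its message is
        # translatable, so inspect the guest state instead of the text.
        if guest.get_interface_by_mac(mac) is None:
            LOG.warning("Interface %s not found on guest; "
                        "assuming it was already detached.", mac)
            return False
        # The device is still present, so this is a genuine failure.
        raise
```

With this shape, a second detach of the same MAC (for example, one triggered by a late vif-deleted event) logs a warning instead of a traceback, while failures on a device that is still attached continue to propagate.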

Changed in nova:
status: In Progress → Fix Released
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/nova 13.0.0.0b3

This issue was fixed in the openstack/nova 13.0.0.0b3 development milestone.
