Deleting network namespaces sometimes fails in check/gate queue with ENOENT

Bug #1795482 reported by Brian Haley
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Brian Haley
Milestone: (none listed)

Bug Description

I have seen the fullstack tests sometimes fail, complaining that the namespace doesn't exist. An example is here:

http://logs.openstack.org/79/604179/1/check/neutron-fullstack-python36/9b41cd5/logs/testr_results.html.gz

End of stack trace for reference:

  File "/opt/stack/new/neutron/neutron/tests/fullstack/resources/process.py", line 354, in clean_dhcp_namespaces
    ip_lib.delete_network_namespace(namespace)
  File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1103, in delete_network_namespace
    privileged.remove_netns(namespace, **kwargs)
  File "/opt/stack/new/neutron/.tox/dsvm-fullstack-python35/lib/python3.5/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap
    return self.channel.remote_call(name, args, kwargs)
  File "/opt/stack/new/neutron/.tox/dsvm-fullstack-python35/lib/python3.5/site-packages/oslo_privsep/daemon.py", line 202, in remote_call
    raise exc_type(*result[2])
FileNotFoundError: [Errno 2] No such file or directory

While some callers check for RuntimeError, none handle an OSError with errno.ENOENT, which is what the FileNotFoundError above is.

In this case, I don't believe we should be raising an error at all. An asynchronous event could have deleted the namespace, and since it's no longer there, we are already in the desired state.
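
As a rough sketch of the behaviour proposed above (this is not the actual neutron patch; `delete_namespace_if_exists` and the injected `remove_netns` callable are hypothetical names used only for illustration), the delete call could swallow ENOENT and re-raise everything else:

```python
import errno


def delete_namespace_if_exists(remove_netns, namespace):
    """Delete `namespace`, treating "already gone" as success.

    `remove_netns` is a hypothetical stand-in for whatever privileged
    call actually removes the namespace. Returns True if the call
    deleted the namespace and False if it was already absent.
    """
    try:
        remove_netns(namespace)
    except OSError as e:
        # FileNotFoundError is the OSError subclass for errno.ENOENT.
        # Any other errno (EPERM, EBUSY, ...) is still a real failure.
        if e.errno != errno.ENOENT:
            raise
        return False
    return True
```

With this shape, cleanup code such as clean_dhcp_namespaces would not need its own existence check before deleting, which also avoids the race between checking and deleting.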

Fixing this should help with some of the recent issues we've had getting code merged.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/607009

Changed in neutron:
status: New → In Progress
tags: added: gate-failure

Matt Riedemann (mriedem) wrote :

Looks like this started ~Sept 26.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/607009
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=00de8f9a9e36bbca6ef7f0c17c2e6f74a144a358
Submitter: Zuul
Branch: master

commit 00de8f9a9e36bbca6ef7f0c17c2e6f74a144a358
Author: Brian Haley <email address hidden>
Date: Mon Oct 1 13:49:32 2018 -0400

    Do not fail deleting namespace if it does not exist

    Note: this is a squash of two changes since they are
    dependent on each other, and are currently blocking
    the gate queue.

    Sometimes cleanup methods are failing in the check and
    gate queues trying to delete non-existing namespaces.
    Since they could have been deleted asynchronously, don't
    raise if the failure is "No such file or directory" since
    the system is in the intended state.

    Cleaned up the DHCP agent to no longer check for existence
    first, and the tests to no longer mock out the namespace
    exists check.

    Fix test_legacy_router_lifecycle failures

    Multi-path routes returned via the pyroute2 library have
    their outgoing interfaces in the 'multipath' dictionary
    element, not in the route dictionary. In that case return
    all the multipath routes correctly.

    Change-Id: I5415cb3a88ff2640a19598a1fcb2278388815343
    Closes-bug: #1795482
    Closes-bug: #1795548
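
To illustrate the multipath point in the commit message, here is a minimal sketch using plain dictionaries. The dictionary shape (`oif` at the top level for a single-path route, a `multipath` list of next hops each carrying its own `oif`) is a simplified assumption for illustration and not pyroute2's actual message format, and `route_output_interfaces` is a hypothetical helper name:

```python
def route_output_interfaces(route):
    """Return the outgoing interface indexes for one route.

    Assumed, simplified shapes:
      single-path: {'dst': ..., 'oif': 3}
      multipath:   {'dst': ..., 'multipath': [{'oif': 3}, {'oif': 4}]}
    """
    multipath = route.get('multipath')
    if multipath:
        # For multipath routes the interfaces live on each next hop,
        # not on the route itself.
        return [hop['oif'] for hop in multipath]
    oif = route.get('oif')
    return [oif] if oif is not None else []
```

Code that only reads the top-level interface would miss every next hop of a multipath route, which matches the failure mode the commit message describes.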

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.0.0b1

This issue was fixed in the openstack/neutron 14.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/753255

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/753256

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.opendev.org/753256
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5ee6e5bb31179acf6e86357b0b0b34ef051af161
Submitter: Zuul
Branch: stable/queens

commit 5ee6e5bb31179acf6e86357b0b0b34ef051af161
Author: Brian Haley <email address hidden>
Date: Mon Oct 1 13:49:32 2018 -0400

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/753255
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7e90c5fb73612d73b393a7aa0517fb17d81faf77
Submitter: Zuul
Branch: stable/rocky

commit 7e90c5fb73612d73b393a7aa0517fb17d81faf77
Author: Brian Haley <email address hidden>
Date: Mon Oct 1 13:49:32 2018 -0400

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron queens-eol

This issue was fixed in the openstack/neutron queens-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol

This issue was fixed in the openstack/neutron rocky-eol release.
