os_vif error: [Errno 24] Too many open files

Bug #1807949 reported by sean mooney
This bug affects 1 person
Affects: os-vif
Status: Fix Released
Importance: High
Assigned to: Rodolfo Alonso

Bug Description

There is a possible file handle leak in pyroute2.

This is a generic tracking bug for this error.

pyroute2 has fixed several such leaks in the past, and they were mostly due to Python 2 issues.
It is possible that these leaks only affect some releases; as such, we should consider blacklisting
the affected releases until we move to Python 3 only. The root cause of these leaks has normally been
a defect in how the low-level Python 2 libraries that pyroute2 uses clean up OS sockets. That behaviour changed in Python 3 and no longer causes these issues, but that is not always the case.

Tags: ovs pyroute2
Revision history for this message
Nate Johnston (nate-johnston) wrote :

In case a concrete example is desired:

Dec 11 15:07:13.963832 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif [None req-6075af98-f76b-4f27-a69e-73a4f997885a tempest-AttachInterfacesTestJSON-492002961 tempest-AttachInterfacesTestJSON-492002961] Failed to plug vif VIFBridge(active=False,address=fa:16:3e:df:e9:11,bridge_name='qbr6d01fb20-16',has_traffic_filtering=True,id=6d01fb20-1652-4706-b8c3-d4d7de6a44cd,network=Network(9365e488-0327-45e7-baa3-594ba64535b4),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap6d01fb20-16'): OSError: [Errno 24] Too many open files
Dec 11 15:07:13.964416 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif Traceback (most recent call last):
Dec 11 15:07:13.964680 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/os_vif/__init__.py", line 77, in plug
Dec 11 15:07:13.964926 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif plugin.plug(vif, instance_info)
Dec 11 15:07:13.965169 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/vif_plug_ovs/ovs.py", line 263, in plug
Dec 11 15:07:13.965401 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif self._plug_bridge(vif, instance_info)
Dec 11 15:07:13.965647 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/vif_plug_ovs/ovs.py", line 210, in _plug_bridge
Dec 11 15:07:13.965897 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif linux_net.ensure_bridge(vif.bridge_name)
Dec 11 15:07:13.966138 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/oslo_privsep/priv_context.py", line 207, in _wrap
Dec 11 15:07:13.966379 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif return self.channel.remote_call(name, args, kwargs)
Dec 11 15:07:13.970795 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif File "/usr/local/lib/python2.7/dist-packages/oslo_privsep/daemon.py", line 202, in remote_call
Dec 11 15:07:13.971240 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif raise exc_type(*result[2])
Dec 11 15:07:13.971602 ubuntu-xenial-rax-iad-0001126348 nova-compute[31767]: ERROR os_vif OSError: [Errno 24] Too many open files

citation: http://logs.openstack.org/75/623275/10/check/neutron-tempest-iptables_hybrid/e1ae127/logs/screen-n-cpu.txt.gz?level=ERROR#_Dec_11_15_07_13_963832
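
For context, errno 24 on Linux is EMFILE: the process has reached its per-process limit on open file descriptors. A minimal sketch, using only the Python standard library, of how one might inspect that limit on an affected host (the helper name `describe_fd_limit` is made up for illustration and is not part of os-vif or pyroute2):

```python
import errno
import resource


def describe_fd_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


if __name__ == "__main__":
    soft, hard = describe_fd_limit()
    # errno.EMFILE is the symbolic name behind "Too many open files".
    print("EMFILE errno value:", errno.EMFILE)
    print("soft limit:", soft, "hard limit:", hard)
```

Raising the soft limit would only delay the error if descriptors are leaking; it would not remove the cause.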

Revision history for this message
Nate Johnston (nate-johnston) wrote :

Failures affecting the neutron project seem to be isolated to the neutron-tempest-iptables_hybrid CI job.

Revision history for this message
Lajos Katona (lajos-katona) wrote :

To find occurrences on logstash:
message:"OSError: [Errno 24] Too many open files" AND module:"os_vif" AND filename:"logs/screen-n-cpu.txt"

Revision history for this message
sean mooney (sean-k-mooney) wrote :

Yes, I wrote an elastic-recheck query for this yesterday, but it has not merged yet.
https://review.openstack.org/#/c/624412/
This is a resource leak, so there won't actually be any backtraces that show the root cause.

It is highly likely that it is caused by one of the functions in
https://github.com/openstack/os-vif/blob/master/os_vif/internal/command/ip/linux/impl_pyroute2.py
but I have not determined the root cause yet.
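
Since a leak like this leaves no traceback at the point where it happens, one way to narrow it down is to count open descriptors before and after calling a suspect function many times. The following is a rough sketch of that idea, not the method actually used on this bug. `count_open_fds`, `fd_growth`, `leaky` and `tidy` are hypothetical names, and the two sample functions open /dev/null as a stand-in for the real pyroute2 calls.

```python
import os


def count_open_fds():
    """Count descriptors open in this process (Linux /proc, else /dev/fd)."""
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    return len(os.listdir(fd_dir))


def fd_growth(func, iterations=100):
    """Call func repeatedly and return how many descriptors stayed open."""
    before = count_open_fds()
    for _ in range(iterations):
        func()
    return count_open_fds() - before


def leaky():
    # Stand-in for a call that opens a descriptor and never closes it.
    os.open(os.devnull, os.O_RDONLY)


def tidy():
    # Stand-in for a call that releases its descriptor before returning.
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)


if __name__ == "__main__":
    print("tidy growth:", fd_growth(tidy, 50))
    print("leaky growth:", fd_growth(leaky, 50))
```

A function whose growth scales with the iteration count is a candidate for the leak; one whose growth stays at zero can be ruled out.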

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hi Sean:

I tested several times, modifying the functional tests and the code and adding loops to generate more calls to pyroute2.

I found that the issue occurs when we call iproute.IPRoute(). This object should be created inside a context (a "with" block), so that its __exit__ method is called when the block ends:
    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

I'll submit a patch now.
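
A minimal sketch of the pattern described above, not the os-vif patch itself. To stay runnable without pyroute2 or root privileges, it uses a hypothetical stand-in class `FakeIPRoute` that owns a socket and closes it in `__exit__`, mirroring the `__exit__` quoted above. The function names are also made up for illustration.

```python
import socket


class FakeIPRoute:
    """Hypothetical stand-in for an object that owns an OS socket."""

    def __init__(self):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def close(self):
        self.sock.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even if the body of the "with" block raises,
        # so the socket is always released.
        self.close()


def lookup_without_context():
    # The socket stays open until someone remembers to call ip.close().
    ip = FakeIPRoute()
    return ip


def lookup_with_context():
    with FakeIPRoute() as ip:
        pass  # use ip here
    # The socket has already been closed at this point.
    return ip


if __name__ == "__main__":
    closed = lookup_with_context()
    still_open = lookup_without_context()
    print("with context, fileno:", closed.sock.fileno())
    print("without context, fileno:", still_open.sock.fileno())
    still_open.close()
```

With the real library, the equivalent shape would presumably be `with iproute.IPRoute() as ip:` around each group of calls, given the `__exit__` method shown above.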

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-vif (master)

Fix proposed to branch: master
Review: https://review.openstack.org/624831

Changed in os-vif:
assignee: sean mooney (sean-k-mooney) → Rodolfo Alonso (rodolfo-alonso-hernandez)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-vif (master)

Reviewed: https://review.openstack.org/624831
Committed: https://git.openstack.org/cgit/openstack/os-vif/commit/?id=db5216357b1be93d91aa48b2878599f2dfef02a8
Submitter: Zuul
Branch: master

commit db5216357b1be93d91aa48b2878599f2dfef02a8
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Dec 12 23:25:25 2018 +0000

    Create iproute.IPRoute() inside a context

    IPRoute() object must be closed once is no longer used. This is done
    automatically calling it inside its own context. If IPRoute() object
    is not closed, the following error is raised after many calls:
      OSError: [Errno 24] Too many open files

    Change-Id: I37ee291c22c2c3933ee229bfc939a4481898626c
    Closes-Bug: #1807949

Changed in os-vif:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-vif 1.13.1

This issue was fixed in the openstack/os-vif 1.13.1 release.
