The VIM is leaking FDs. The problem happens as follows:
- The VIM has worker processes that are used to communicate with
other processes through their REST APIs (e.g. sysinv, nova,
cinder). The VIM does not specify a timeout when sending REST API
requests.
- The VIM does have a timeout for how long a worker process takes to
process a request, which can vary depending on the request.
- If the worker process sends a REST API request and does not get a
response in time (e.g. because a message is lost or the target
process is down), the VIM terminates the worker process. This is
being done with a call to Process.terminate in the Python
multiprocessing library. The docs for this library clearly indicate
that Process.terminate should not be used for a process that uses
any shared resources (e.g. pipes). In this case, the worker
processes are using shared resources (pipes for one) and these
resources are not freed, leading to the FD leak.
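The failure pattern described above can be reproduced in miniature. The following is only an illustrative sketch, not VIM code: the names `stuck_worker` and `run_with_worker_timeout` are hypothetical, the sleep stands in for a REST call that never returns, and the "fork" start method is assumed (POSIX only). It shows that after Process.terminate the parent's pipe ends are still open until something closes them explicitly.

```python
import multiprocessing
import time


def stuck_worker(conn):
    """Simulate a worker blocked on a REST call that never returns."""
    time.sleep(60)
    conn.send("done")


def run_with_worker_timeout(worker_timeout_secs):
    """Start a worker, terminate it on timeout, and report the pipe state."""
    # Assumption: the "fork" start method, to keep the sketch self-contained.
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    worker = ctx.Process(target=stuck_worker, args=(child_conn,))
    worker.start()

    # poll() returns False if nothing arrives within the timeout.
    got_reply = parent_conn.poll(worker_timeout_secs)
    if not got_reply:
        worker.terminate()
    worker.join()

    # The process is gone, but terminate() did not close the pipe ends
    # held by the parent; each one still occupies a file descriptor.
    still_open = (not parent_conn.closed, not child_conn.closed)

    # Close them here so the sketch itself does not leak.
    parent_conn.close()
    child_conn.close()
    return got_reply, still_open
```

Calling `run_with_worker_timeout(0.2)` returns once the worker has been terminated. If the explicit `close()` calls were omitted and the connection objects stayed referenced, every terminated worker would leave its descriptors behind.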
The solution is to ensure that a timeout is set when sending REST API
requests. This timeout must be less than the worker timeout to ensure
that the workers do not time out (and leak FDs) except in the rarest
of cases.
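A minimal sketch of that idea follows, using only the Python standard library. It is not the VIM's actual implementation: `rest_timeout_for`, `rest_get`, `RestRequestError` and the two constants are hypothetical names and values chosen for illustration. The point is that the REST timeout is derived from the worker timeout, so the request gives up first and the worker can report the failure instead of being terminated.

```python
import urllib.request

# Hypothetical values for illustration; not taken from the VIM source.
WORKER_TIMEOUT_SECS = 20.0
REST_MARGIN_SECS = 5.0


class RestRequestError(Exception):
    """Raised when a REST request fails or times out."""


def rest_timeout_for(worker_timeout_secs, margin_secs=REST_MARGIN_SECS):
    """Return a REST timeout that is strictly less than the worker timeout."""
    if margin_secs <= 0 or worker_timeout_secs <= margin_secs:
        raise ValueError("margin must be positive and below the worker timeout")
    return worker_timeout_secs - margin_secs


def rest_get(url, worker_timeout_secs=WORKER_TIMEOUT_SECS,
             margin_secs=REST_MARGIN_SECS):
    """GET url, giving up before the worker itself would be terminated."""
    timeout = rest_timeout_for(worker_timeout_secs, margin_secs)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:
        # urllib.error.URLError and socket timeouts are both OSError
        # subclasses, so connection failures and timeouts end up here.
        raise RestRequestError("request to %s failed: %s" % (url, exc)) from exc
```

Note that the `timeout` passed to `urlopen` bounds each blocking socket operation rather than the total transfer time, so a peer that keeps sending data slowly could still exceed the worker timeout. That is consistent with the worker timeout remaining as a last resort for rare cases.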
Reviewed: https://review.opendev.org/733246
Committed: https://git.openstack.org/cgit/starlingx/nfv/commit/?id=ccd59a07116d676b645831047df3d0b77db4a0cc
Submitter: Zuul
Branch: master
commit ccd59a07116d676b645831047df3d0b77db4a0cc
Author: Bart Wensley <email address hidden>
Date: Wed Jun 3 11:37:56 2020 -0500
Handle REST API timeouts gracefully in the VIM
The VIM is leaking FDs. The problem happens as follows:
- The VIM has worker processes that are used to communicate with
other processes through their REST APIs (e.g. sysinv, nova,
cinder). The VIM does not specify a timeout when sending REST API
requests.
- The VIM does have a timeout for how long a worker process takes to
process a request, which can vary depending on the request.
- If the worker process sends a REST API request and does not get a
response in time (e.g. because a message is lost or the target
process is down), the VIM terminates the worker process. This is
being done with a call to Process.terminate in the Python
multiprocessing library. The docs for this library clearly indicate
that Process.terminate should not be used for a process that uses
any shared resources (e.g. pipes). In this case, the worker
processes are using shared resources (pipes for one) and these
resources are not freed, leading to the FD leak.
The solution is to ensure that a timeout is set when sending REST API
requests. This timeout must be less than the worker timeout to ensure
that the workers do not time out (and leak FDs) except in the rarest
of cases.
Change-Id: Iccff914e86224be96689738cdcc536a4d5acb861
Closes-Bug: 1862049
Signed-off-by: Bart Wensley <email address hidden>