Comment 8 for bug 1862049

OpenStack Infra (hudson-openstack) wrote : Fix merged to nfv (master)

Reviewed: https://review.opendev.org/733246
Committed: https://git.openstack.org/cgit/starlingx/nfv/commit/?id=ccd59a07116d676b645831047df3d0b77db4a0cc
Submitter: Zuul
Branch: master

commit ccd59a07116d676b645831047df3d0b77db4a0cc
Author: Bart Wensley <email address hidden>
Date: Wed Jun 3 11:37:56 2020 -0500

    Handle REST API timeouts gracefully in the VIM

    The VIM is leaking FDs. The problem happens as follows:
    - The VIM has worker processes that are used to communicate with
      other processes through their REST APIs (e.g. sysinv, nova,
      cinder). The VIM does not specify a timeout when sending REST API
      requests.
    - The VIM does have a timeout for how long a worker process takes to
      process a request, which can vary depending on the request.
    - If the worker process sends a REST API request and does not get a
      response in time (e.g. because a message is lost or the target
      process is down), the VIM terminates the worker process. It does
      so by calling Process.terminate from the Python multiprocessing
      library. The docs for this library clearly indicate that
      Process.terminate should not be used for a process that uses any
      shared resources (e.g. pipes). In this case, the worker processes
      do use shared resources (pipes, for one), and these resources are
      not freed when the process is terminated, leading to the FD leak.

    The solution is to ensure that a timeout is set when sending REST API
    requests. This timeout must be less than the worker timeout to ensure
    that the workers do not time out (and leak FDs) except in the rarest
    of cases.

    Change-Id: Iccff914e86224be96689738cdcc536a4d5acb861
    Closes-Bug: 1862049
    Signed-off-by: Bart Wensley <email address hidden>
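
The idea behind the fix can be sketched as follows. This is a minimal illustration, not the code from the commit: the names rest_api_timeout, rest_api_get, RestApiError and REST_TIMEOUT_MARGIN_SECS, as well as the 5 second margin, are assumptions made for the example, and it uses urllib from the standard library rather than whatever HTTP client the VIM actually uses. The point it shows is that the REST API timeout is derived from the worker timeout and is always strictly smaller, so the request fails inside the worker before the VIM has any reason to terminate it.

```python
import urllib.request

# Hypothetical margin: how long before the worker deadline the REST call
# should give up, leaving the worker time to report the failure itself.
REST_TIMEOUT_MARGIN_SECS = 5.0


class RestApiError(Exception):
    """Raised when a REST API request fails or times out (hypothetical)."""


def rest_api_timeout(worker_timeout_secs, margin_secs=REST_TIMEOUT_MARGIN_SECS):
    """Return a REST API timeout that is strictly less than the worker timeout."""
    if worker_timeout_secs <= 0:
        raise ValueError("worker timeout must be positive")
    timeout = worker_timeout_secs - margin_secs
    if timeout <= 0:
        # The margin does not fit into a very short worker timeout, so fall
        # back to half of it, which is still below the worker timeout.
        timeout = worker_timeout_secs / 2.0
    return timeout


def rest_api_get(url, worker_timeout_secs):
    """Send a GET request that gives up before the worker timeout expires."""
    timeout = rest_api_timeout(worker_timeout_secs)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:
        # socket.timeout, URLError and HTTPError are all OSError subclasses.
        # The worker returns an error through its normal path instead of
        # blocking until it is killed with Process.terminate.
        raise RestApiError("request to %s failed: %s" % (url, exc)) from exc
```

With a 30 second worker timeout, rest_api_timeout(30) yields a 25 second request timeout, so a lost message or a dead target process surfaces as a RestApiError in the worker rather than as a terminated worker process.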