Leaking file descriptors after a while

Bug #1872849 reported by Flavio Fernandes
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
mortoray.com
Fix Released
Undecided
Unassigned

Bug Description

I'm using https://pypi.org/project/shelljob/ to periodically spawn jobs.
The project code lives here:
https://github.com/flavio-fernandes/mqtt2cmd

In particular, this is the logic that kicks off jobs:
https://github.com/flavio-fernandes/mqtt2cmd/blob/37f8d0788b9a019eb33e7deb3753b2b94f472320/mqtt2cmd/dispatcher.py#L184-L199

Looking at the system (Raspberry PI) after a while clearly shows tons of open FDs for "pipe":

I suspect I'm either not using shelljob right, or there is a bug on how it handles closing Group.
For any other info I can provide, please let me know!

root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# systemctl status mqtt2cmd
● mqtt2cmd.service - MQTT based command dispatcher
   Loaded: loaded (/lib/systemd/system/mqtt2cmd.service; enabled)
   Active: active (running) since Sun 2020-04-12 21:52:14 EDT; 1 day 20h ago
 Main PID: 16101 (start_mqtt2cmd.)
   CGroup: /system.slice/mqtt2cmd.service
           ├─16101 /bin/bash /home/pi/mqtt2cmd.git/mqtt2cmd/bin/start_mqtt2cmd.sh /home/pi/mqtt2cmd.git/data/config.yaml.flavio...
           ├─16106 python ./main.py /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
           ├─16110 python ./main.py /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
           └─16111 python ./main.py /home/pi/mqtt2cmd.git/data/config.yaml.flaviof

Apr 12 21:52:14 oclock systemd[1]: Started MQTT based command dispatcher.
root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | wc -l
1060
root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | head -10
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 16111 pi cwd DIR 179,2 4096 252943 /home/pi/mqtt2cmd.git/mqtt2cmd
python 16111 pi rtd DIR 179,2 4096 2 /
python 16111 pi txt REG 179,2 3821024 253051 /home/pi/mqtt2cmd.git/env/bin/python
python 16111 pi mem REG 179,2 14440 139818 /usr/lib/python3.4/lib-dynload/_multiprocessing.cpython-34m-arm-linux-gnueabihf.so
python 16111 pi mem REG 179,2 76296 139829 /usr/lib/python3.4/lib-dynload/parser.cpython-34m-arm-linux-gnueabihf.so
python 16111 pi mem REG 179,2 124592 1920 /lib/arm-linux-gnueabihf/liblzma.so.5.0.0
python 16111 pi mem REG 179,2 26648 139816 /usr/lib/python3.4/lib-dynload/_lzma.cpython-34m-arm-linux-gnueabihf.so
python 16111 pi mem REG 179,2 13996 1974 /lib/arm-linux-gnueabihf/libuuid.so.1.3.0
python 16111 pi mem REG 179,2 66008 3747 /lib/arm-linux-gnueabihf/libbz2.so.1.0.4

...

root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | tail -10
python 16111 pi 1010w FIFO 0,10 0t0 3436193 pipe
python 16111 pi 1011w FIFO 0,10 0t0 3437223 pipe
python 16111 pi 1012w FIFO 0,10 0t0 3438246 pipe
python 16111 pi 1013w FIFO 0,10 0t0 3439286 pipe
python 16111 pi 1014w FIFO 0,10 0t0 3440319 pipe
python 16111 pi 1015w FIFO 0,10 0t0 3441347 pipe
python 16111 pi 1016w FIFO 0,10 0t0 3442390 pipe
python 16111 pi 1017w FIFO 0,10 0t0 3443426 pipe
python 16111 pi 1018w FIFO 0,10 0t0 3444457 pipe
python 16111 pi 1019w FIFO 0,10 0t0 3445501 pipe
root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin#

root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | grep -Ec " pipe$"
1013

Revision history for this message
Flavio Fernandes (ffernand) wrote :

I was able to narrow down to a change that makes this issue go away.

In subprocess.Popen:

file proc.py:

    def _run_impl(self, cmd, shell):
        cmd = _expand_cmd(cmd)

        handle = subprocess.Popen(cmd,
                                  shell=shell,
                                  bufsize=1,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT,
                                  # needed to detach from calling terminal (other wacky things
                                  # can happen)
                                  stdin=subprocess.DEVNULL, <===== THIS MATTERS!!!!
                                  close_fds=True, <===== sholud do this too
                                  )

The close_fds is also a good idea to use. See:

https://docs.python.org/3.4/library/subprocess.html#subprocess.Popen

popen2 closes all file descriptors by default, but you have to specify close_fds=True with Popen to guarantee this behavior on all platforms or past Python versions.

Revision history for this message
edA-qa (eda-qa) wrote :

Hmm, there are projects that need the stdin to not be null, so this would need to be configurable.

I've never tried this for long-running processes. Let me look if I'm not cleaning up correctly before forcing the stdin to DEVNULL. (Though if you'd like an option for that, okay).

Revision history for this message
edA-qa (eda-qa) wrote :

It's fairly easy to reproduce the problem, so let me find a solution.

Revision history for this message
edA-qa (eda-qa) wrote :

Adding a handle.stdin.close() seems to fix the problem. I'd like to leave it at this minimal fix for now. I'll try to get a new version uploaded.

Revision history for this message
Flavio Fernandes (ffernand) wrote : Re: [Bug 1872849] Re: Leaking file descriptors after a while
Download full text (4.1 KiB)

👍

On Mon, Apr 20, 2020 at 3:11 PM edA-qa <email address hidden> wrote:

> It's fairly easy to reproduce the problem, so let me find a solution.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1872849
>
> Title:
> Leaking file descriptors after a while
>
> Status in mortoray.com:
> New
>
> Bug description:
>
> I'm using https://pypi.org/project/shelljob/ to periodically spawn jobs.
> The project code lives here:
> https://github.com/flavio-fernandes/mqtt2cmd
>
>
> In particular, this is the logic that kicks off jobs:
>
> https://github.com/flavio-fernandes/mqtt2cmd/blob/37f8d0788b9a019eb33e7deb3753b2b94f472320/mqtt2cmd/dispatcher.py#L184-L199
>
> Looking at the system (Raspberry PI) after a while clearly shows tons
> of open FDs for "pipe":
>
> I suspect I'm either not using shelljob right, or there is a bug on how
> it handles closing Group.
> For any other info I can provide, please let me know!
>
>
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# systemctl status
> mqtt2cmd
> ● mqtt2cmd.service - MQTT based command dispatcher
> Loaded: loaded (/lib/systemd/system/mqtt2cmd.service; enabled)
> Active: active (running) since Sun 2020-04-12 21:52:14 EDT; 1 day 20h
> ago
> Main PID: 16101 (start_mqtt2cmd.)
> CGroup: /system.slice/mqtt2cmd.service
> ├─16101 /bin/bash
> /home/pi/mqtt2cmd.git/mqtt2cmd/bin/start_mqtt2cmd.sh
> /home/pi/mqtt2cmd.git/data/config.yaml.flavio...
> ├─16106 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
> ├─16110 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
> └─16111 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
>
> Apr 12 21:52:14 oclock systemd[1]: Started MQTT based command dispatcher.
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | wc -l
> 1060
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | head -10
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> python 16111 pi cwd DIR 179,2 4096 252943
> /home/pi/mqtt2cmd.git/mqtt2cmd
> python 16111 pi rtd DIR 179,2 4096 2 /
> python 16111 pi txt REG 179,2 3821024 253051
> /home/pi/mqtt2cmd.git/env/bin/python
> python 16111 pi mem REG 179,2 14440 139818
> /usr/lib/python3.4/lib-dynload/_
> multiprocessing.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 76296 139829
> /usr/lib/python3.4/lib-dynload/parser.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 124592 1920
> /lib/arm-linux-gnueabihf/liblzma.so.5.0.0
> python 16111 pi mem REG 179,2 26648 139816
> /usr/lib/python3.4/lib-dynload/_lzma.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 13996 1974
> /lib/arm-linux-gnueabihf/libuuid.so.1.3.0
> python 16111 pi mem REG 179,2 66008 3747
> /lib/arm-linux-gnueabihf/libbz2.so.1.0.4
>
> ...
>
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | tail -10
> ...

Read more...

Revision history for this message
edA-qa (eda-qa) wrote :

I've uploaded a new version. See if it fixes the problem.

Unfortunately I can't get the sphinx docs and readme building a again. :/

edA-qa (eda-qa)
Changed in mortoray.com:
status: New → Fix Released
Revision history for this message
Flavio Fernandes (ffernand) wrote :
Download full text (4.2 KiB)

Awesome!

Will do. Please give me till the weekend as I may not find time to
play with it before then.

Thanks,

-- flaviof

On Mon, Apr 20, 2020 at 4:00 PM edA-qa <email address hidden> wrote:

> I've uploaded a new version. See if it fixes the problem.
>
> Unfortunately I can't get the sphinx docs and readme building a again.
> :/
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1872849
>
> Title:
> Leaking file descriptors after a while
>
> Status in mortoray.com:
> New
>
> Bug description:
>
> I'm using https://pypi.org/project/shelljob/ to periodically spawn jobs.
> The project code lives here:
> https://github.com/flavio-fernandes/mqtt2cmd
>
>
> In particular, this is the logic that kicks off jobs:
>
> https://github.com/flavio-fernandes/mqtt2cmd/blob/37f8d0788b9a019eb33e7deb3753b2b94f472320/mqtt2cmd/dispatcher.py#L184-L199
>
> Looking at the system (Raspberry PI) after a while clearly shows tons
> of open FDs for "pipe":
>
> I suspect I'm either not using shelljob right, or there is a bug on how
> it handles closing Group.
> For any other info I can provide, please let me know!
>
>
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# systemctl status
> mqtt2cmd
> ● mqtt2cmd.service - MQTT based command dispatcher
> Loaded: loaded (/lib/systemd/system/mqtt2cmd.service; enabled)
> Active: active (running) since Sun 2020-04-12 21:52:14 EDT; 1 day 20h
> ago
> Main PID: 16101 (start_mqtt2cmd.)
> CGroup: /system.slice/mqtt2cmd.service
> ├─16101 /bin/bash
> /home/pi/mqtt2cmd.git/mqtt2cmd/bin/start_mqtt2cmd.sh
> /home/pi/mqtt2cmd.git/data/config.yaml.flavio...
> ├─16106 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
> ├─16110 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
> └─16111 python ./main.py
> /home/pi/mqtt2cmd.git/data/config.yaml.flaviof
>
> Apr 12 21:52:14 oclock systemd[1]: Started MQTT based command dispatcher.
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | wc -l
> 1060
> root@oclock:/home/pi/mqtt2cmd.git/mqtt2cmd/bin# lsof -p 16111 | head -10
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> python 16111 pi cwd DIR 179,2 4096 252943
> /home/pi/mqtt2cmd.git/mqtt2cmd
> python 16111 pi rtd DIR 179,2 4096 2 /
> python 16111 pi txt REG 179,2 3821024 253051
> /home/pi/mqtt2cmd.git/env/bin/python
> python 16111 pi mem REG 179,2 14440 139818
> /usr/lib/python3.4/lib-dynload/_
> multiprocessing.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 76296 139829
> /usr/lib/python3.4/lib-dynload/parser.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 124592 1920
> /lib/arm-linux-gnueabihf/liblzma.so.5.0.0
> python 16111 pi mem REG 179,2 26648 139816
> /usr/lib/python3.4/lib-dynload/_lzma.cpython-34m-arm-linux-gnueabihf.so
> python 16111 pi mem REG 179,2 13996 1974
> /lib/arm-linux-gnueabihf/libuuid.so.1.3.0
> pyth...

Read more...

Revision history for this message
Flavio Fernandes (ffernand) wrote :

I am not able to install the latest shelljob due to the following error:
ModuleNotFoundError: No module named 'pypandoc'

The full output is here:
     https://gist.github.com/flavio-fernandes/e3ef13a95bb159067beed00e0c31b85c

Can you point me to the commit you did in shelljob to fix the bug? I cannot locate it in Github.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.