Servicemon/SmbChecker runs out of available file descriptors

Bug #545029 reported by Morten Brekkevold
This bug affects 1 person
Affects: Network Administration Visualized
Status: Fix Released
Importance: Medium
Assigned to: Kai Bjørnenak

Bug Description

A bug in the servicemon checker plugin SmbChecker causes the servicemon daemon to keep opening new UNIX pipes, until it hits the system limit on the number of open files. This may also affect the rest of the processes running as the navcron user, as the limit is often per-user.

This bug only appears when an smb service handler has been added.
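
To confirm the symptom on a running system, the number of open descriptors can be compared against the open-files limit. The following is only a diagnostic sketch, not part of NAV: it assumes Linux (it reads /proc), is written for current Python 3, and the helper names open_fd_count and fd_soft_limit are hypothetical.

```python
import os
import resource


def open_fd_count(pid="self"):
    """Count the open file descriptors of a process by listing /proc/<pid>/fd.

    Linux only. Reading another user's process requires sufficient privileges.
    """
    return len(os.listdir("/proc/%s/fd" % pid))


def fd_soft_limit():
    """Return the soft limit on open files that applies to the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    print("open descriptors: %d, soft limit: %d" % (open_fd_count(), fd_soft_limit()))
```

Passing the process ID of the servicemon daemon instead of "self" and sampling the count repeatedly should show whether it keeps growing between check rounds.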

Tags: servicemon
Morten Brekkevold (mbrekkevold) wrote :

I propose the attached patch as a fix for this, but I will need to test it some more first.

Changed in nav:
assignee: nobody → Morten Brekkevold (mvold)
status: New → Confirmed
status: Confirmed → In Progress
Morten Brekkevold (mbrekkevold) wrote :

I have investigated this further. This actually consists of two issues. The minor issue is the typo in the SmbChecker code, but it is actually not this typo that is causing the number of open pipes to increase indefinitely.

This seems to be a known bug in Python, see http://bugs.python.org/issue5179 . Unfortunately for us, it has only been fixed in Python 2.6.

IOW, this bug will only affect you if you are running Python < 2.6 and enable the Samba service checker without having smbclient installed. Apart from fixing the typo, it seems the best we can do is write a workaround for the Python bug - something akin to verifying that there is actually an executable smbclient present before attempting to run it.
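
A minimal sketch of such a pre-flight check, assuming a plain search of the directories in PATH is sufficient. The helper name find_executable is hypothetical and this is not the code that was eventually merged. It is written for current Python 3, where shutil.which() provides the same lookup out of the box.

```python
import os


def find_executable(name, path=None):
    """Return the full path of an executable called `name`, or None if absent.

    `path` is a search path string in the style of the PATH environment
    variable; it defaults to the PATH of the current process.
    """
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        if not directory:
            # Skip empty PATH entries rather than searching the current directory
            continue
        candidate = os.path.join(directory, name)
        # Must be a regular file (or a symlink to one) that we may execute
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

A checker could call find_executable("smbclient") first and report the service check as failed when it returns None, so that no subprocess (and no pipes) are ever created for a missing program.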

Morten Brekkevold (mbrekkevold) wrote :

Re-assigning this to Ole Martin, since he's the one maintaining the service monitor nowadays :)

Changed in nav:
assignee: Morten Brekkevold (mvold) → Ole Martin Bjørndalen (olemb)
Changed in nav:
milestone: 3.5.6 → 3.5.7
Ole Martin Bjørndalen (olemb) wrote :

I've written a find_cmd() for the new DHCP checker which can be moved to a library and used to fix this bug:

https://heimdal.uit.no/hg/olemb/dhcpchecker/file/tip/subsystem/statemon/nav/statemon/checker/DhcpChecker.py

Ole Martin Bjørndalen (olemb) wrote :

Just not sure where to put it.

Changed in nav:
status: In Progress → Fix Committed
Morten Brekkevold (mbrekkevold) wrote :

Ole Martin, I am unable to find your bugfix for the 3.5.x branch? The link to your repository gives me a 500 internal server error...

Changed in nav:
status: Fix Committed → In Progress
Changed in nav:
assignee: Ole Martin Bjørndalen (olemb) → Kai Bjørnenak (kai-bjornenak)
Kai Bjørnenak (kai-bjornenak) wrote :
Changed in nav:
milestone: 3.5.7 → 3.7.2
Morten Brekkevold (mbrekkevold) wrote :

This fix seems to depend on a whole range of changes Ole Martin has made to clean up the statemon code. I'm therefore hesitant to make this part of NAV 3.7. I've merged the branch to the default branch, and am therefore retargeting this to NAV 3.8.0.

Merged changeset here: http://metanav.uninett.no/hg/default/rev/e3fabef80e8b

Changed in nav:
milestone: 3.7.2 → 3.8.0
status: In Progress → Fix Committed
Changed in nav:
status: Fix Committed → Fix Released