Comment 5 for bug 1461863

Ryan Beisner (1chb1n) wrote :

We are periodically seeing this occur in 1.8.0~beta8+bzr3951-0ubuntu1~trusty1. Restarting the clusterd service restores operation for a number of days, and then the problem recurs.

The impact we observe is the inability to release or acquire nodes ('Releasing Failed' at the moment).

http://paste.ubuntu.com/11564921/

2015-06-04 12:31:51+0000 [-] Region not available: Couldn't bind: 24: Too many open files. (While requesting RPC info at http://10.245.168.2/MAAS/rpc/).
...
exceptions.OSError: [Errno 24] Too many open files: '/tmp/tmpvQtPf2'

# Count the lsof entries that mention tgt
sudo lsof > lsof.txt

cat lsof.txt | grep tgt | wc -l
72109

cat lsof.txt | grep tgt | awk '{ print $3 }' | uniq | wc -l
802
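
A lighter-weight check than a full lsof dump might be to read the descriptor count and limit for a single process straight from /proc. This is only a sketch, assuming a Linux host; `fd_usage` is a hypothetical helper name, not something shipped with MAAS or tgt.

```shell
# Hypothetical helper (not part of MAAS or tgt): report how many file
# descriptors one process has open, next to its soft "Max open files" limit.
# Assumes Linux, where /proc/<pid>/fd and /proc/<pid>/limits exist.
fd_usage() {
    pid="$1"
    # One entry per open descriptor; strip padding that some wc builds add.
    count=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # Fourth field of the "Max open files" row is the soft limit.
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits" 2>/dev/null)
    echo "pid=$pid open_fds=$count soft_limit=$limit"
}

# Example: inspect the current shell.
fd_usage "$$"
```

Reading another user's process this way generally needs root. To point it at the process behind the counts above, pass that process's PID instead of `$$`.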