keystone shuts down its own listening socket with 'too many open files'

Bug #1179709 reported by Robert Collins
This bug affects 5 people
Affects                          Status  Importance  Assigned to  Milestone
OpenStack Dashboard (Horizon)    New     Undecided   Unassigned
OpenStack Identity (keystone)    New     High        Unassigned
python-keystoneclient            New     Undecided   Unassigned

Bug Description

We're running with mysql - I can grab a rendered config file for you if needed.

Here is what I see in the log:
Traceback (most recent call last):
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 97, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/eventlet/greenthread.py", line 194, in main
    result = function(*args, **kwargs)
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/keystone/common/wsgi.py", line 135, in _run
    log=WritableLogger(log))
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/eventlet/wsgi.py", line 663, in server
    client_socket = sock.accept()
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/eventlet/greenio.py", line 166, in accept
    res = socket_accept(fd)
  File "/opt/stack/venvs/keystone/local/lib/python2.7/site-packages/eventlet/greenio.py", line 56, in socket_accept
    return descriptor.accept()
  File "/usr/lib/python2.7/socket.py", line 202, in accept
    sock, addr = self._sock.accept()
error: [Errno 24] Too many open files
Removing descriptor: 5

keystone-all is still running, netstat -anp shows:
tcp 0 0 0.0.0.0:35357 0.0.0.0:* LISTEN 25530/python
unix 3 [ ] STREAM CONNECTED 44834812 25530/python

restarting it gets me:
root@ubuntu:~# netstat -anp | grep 25267
tcp 0 0 0.0.0.0:35357 0.0.0.0:* LISTEN 25267/python
tcp 0 0 0.0.0.0:5000 0.0.0.0:* LISTEN 25267/python
unix 3 [ ] STREAM CONNECTED 44848565 25267/python

Which is rather more useful.

Revision history for this message
Robert Collins (lifeless) wrote :

Oh, and this happened on an idle cloud sitting there with no requests being sent to it...

Dolph Mathews (dolph)
Changed in keystone:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Matthieu Huin (mhu-s) wrote :

Hello,

What is the output of the following commands:

cat /proc/sys/fs/file-max
ulimit -Sn
ulimit -Hn

Your system might be undersized in terms of file descriptors that can be opened.
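
For reference, the same limits can also be read from inside the keystone process itself with Python's standard resource module; a quick sketch, not part of the original report:

import resource

# Per-process soft and hard limits on open file descriptors, matching
# what `ulimit -Sn` / `ulimit -Hn` report for that process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft=%d hard=%d" % (soft, hard))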

Revision history for this message
Robert Collins (lifeless) wrote :

Well, possibly, but even so - closing *other* FDs should happen first, no?
cat /proc/sys/fs/file-max
8196860
root@foo:~# ulimit -Sn
1024
root@foo:~# ulimit -Hn
4096

Seems like oodles of fd's for a one-node bare metal cloud with approximately no load.

Dolph Mathews (dolph)
tags: added: grizzly-backport-potential
Revision history for this message
Matthieu Huin (mhu-s) wrote :

Robert, you might want to manually increase the file-max parameter; see for example http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/

Revision history for this message
David Stanek (dstanek) wrote :

Can you use lsof to get a list of the open file handles? That will help us to understand if there is a file descriptor leak.
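
A rough way to watch for a leak over time, assuming a Linux host and the keystone-all PID seen in the netstat output above, is to count the entries under /proc/<pid>/fd; a sketch only, lsof still gives the most detail:

import os
import time

PID = 25530  # keystone-all PID from the netstat output above; adjust for your host

# Print the number of open file descriptors once a minute; a count that grows
# steadily on an idle cloud points at a descriptor leak rather than at an
# undersized ulimit.
while True:
    count = len(os.listdir('/proc/%d/fd' % PID))
    print('%s %d' % (time.strftime('%H:%M:%S'), count))
    time.sleep(60)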

Revision history for this message
David Stanek (dstanek) wrote :

In chat on 10/3 Robert mentioned that he hasn't hit this issue in months.

Revision history for this message
Adam Young (ayoung) wrote :

Leaving in the Incomplete state until it reappears or we decide to close it.

Changed in keystone:
status: Triaged → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Keystone because there has been no activity for 60 days.]

Changed in keystone:
status: Incomplete → Expired
Revision history for this message
John Griffith (john-griffith) wrote :

Seems we hit this today on a Havana build. Nothing really going on with the system... went to dinner, a user came back and couldn't log in. I logged on to the controller and noticed tracebacks in keystone; restarting the services got us back in business, but I'm not sure what's actually causing this.

Changed in keystone:
status: Expired → New
Revision history for this message
Florent Flament (florentflament) wrote :

It seems that there may be some socket leakage with Apache / Horizon. After some investigation playing with lsof, I found out that apache2 keeps establishing new connections towards Keystone and other services without closing them.

On my devstack, there aren't any spawned VMs, nor any other resources in use. lsof tells me that there are 25 TCP connections established from apache2 to Keystone and 109 connections from apache2 to nova-api.

$ sudo lsof -a -c apache2 -i :5000 | wc -l
25
$ sudo lsof -a -c apache2 -i :8774 | wc -l
109

After launching a VM and destroying it through the Horizon dashboard, lsof reports 29 TCP connections established between apache2 and Keystone and 128 between apache2 and nova-api.

$ sudo lsof -a -c apache2 -i :5000 | wc -l
29
$ sudo lsof -a -c apache2 -i :8774 | wc -l
128

It looks like Horizon opens sockets that it doesn't close. At some point, there may be too many sockets open on the Horizon server and/or Keystone and/or any other OpenStack service.

I didn't observe the issue by using the CLI.

I guess that further investigation would be required into how Horizon manages open connections.

Revision history for this message
mouadino (mouadino) wrote :

Basically what happens is the following:

For each request sent by keystoneclient to the server, the client creates a new requests.Session (python-requests). Because python-requests Sessions use a connection pool under the hood, together with HTTP 1.1 keep-alive, these connections don't get closed after each request; instead they are kept open for re-use (which is great). The problem is that they will never be re-used, because, as I said, keystoneclient creates a requests.Session (a connection pool) for each request. So you end up with a lot of open sockets that neither the server will close (because the keystone server has no keep-alive timeout, which is bad) nor the client (at least until the Python GC reclaims them).

So at some point you reach the server's open-file limit and it just hangs there until restarted.

I did propose a fix for keystoneclient that no one liked, but for people who use apache/nginx as a reverse proxy in front of the keystone server, I was thinking they should be able to configure the keep-alive timeout so that these idle connections get closed. That's a hack, though; the real fix IMHO is to fix keystoneclient, not horizon nor nova ... .
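
A minimal sketch of the pattern described above (hypothetical code, not the actual keystoneclient source): building a fresh requests.Session per call strands that session's pooled keep-alive socket, while re-using a single Session keeps the descriptor count bounded.

import requests

KEYSTONE_URL = 'http://localhost:35357/v2.0'  # example endpoint, adjust as needed

def leaky_get(path):
    # A new Session (and thus a new connection pool) per request: the
    # keep-alive socket it opens is never re-used and only goes away once
    # the garbage collector reclaims the pool.
    session = requests.Session()
    return session.get(KEYSTONE_URL + path)

# Re-using one Session lets the pool recycle its keep-alive sockets, so the
# number of open descriptors on both client and server stays bounded.
shared = requests.Session()

def pooled_get(path):
    return shared.get(KEYSTONE_URL + path)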

HTH,

Revision history for this message
Julie Pichon (jpichon) wrote :

Bug 1247056, which handled a similar issue with the nova client, may also be of interest.

Revision history for this message
Florent Flament (florentflament) wrote :

mouadino, julie, nice catch.

Looks like this bug is a duplicate of bug 1232089 ( https://bugs.launchpad.net/python-keystoneclient/+bug/1282089 ).

Revision history for this message
Florent Flament (florentflament) wrote :

Sorry, wanted to say bug 1282089.
