httpd leaks open files

Bug #1780164 reported by Mohamed Aly
This bug affects 7 people
Affects                        Status     Importance  Assigned to  Milestone
OpenStack Dashboard (Horizon)  Confirmed  Undecided   Unassigned
keystoneauth                   Won't Fix  Undecided   Unassigned
python-keystoneclient          Confirmed  Medium      Unassigned

Bug Description

Horizon version 13.0.0-1.el7 (Queens) on CentOS 7.4.1708

After some time working on the dashboard, it stops working and throws this error in the error log:

[Wed Jul 04 22:49:33.744241 2018] [:error] [pid 23924] [remote 10.144.187.237:52] IOError: [Errno 24] Too many open files: '/usr/share/openstack-dashboard/openstack_dashboard/templates/500.html'

If we check the open files of this process (23924):

 ls -l /proc/23924/fd | wc -l
1023

Even if we increase the nofile limit of this process, it doesn't help, as the number of open files keeps increasing.
The problem clears if we restart the httpd process, but then the open file count starts climbing again.

How to reproduce:

1- Log in to the dashboard.
2- Get the PID from the error_log file:
[Wed Jul 04 22:50:58.620832 2018] [:error] [pid 23924] INFO openstack_auth.forms Login successful for user "demo" using domain "default"
3- Browse the dashboard's different menus, especially the network topology tab.
4- Monitor the number of open files with ls -l /proc/<pid>/fd | wc -l (see the sketch after the observation below).

Observation:
The number of open files keeps increasing.
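
As an alternative to running ls by hand, here is a minimal Python sketch for watching the fd count of a worker over time (the PID argument and the 30-second interval are assumptions for illustration, not part of the original report):

import os
import sys
import time

pid = sys.argv[1]  # httpd worker PID taken from the error_log

while True:
    # Each entry under /proc/<pid>/fd is one open file descriptor.
    count = len(os.listdir("/proc/%s/fd" % pid))
    print("%s open fds: %d" % (time.strftime("%H:%M:%S"), count))
    time.sleep(30)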

Martin Chlumsky (martin-chlumsky) wrote:

We have also noticed this problem.
We did an strace on horizon and found that the connections made by horizon to keystone were not getting closed.

Digging deeper, we found 2 very similar commits that fix connection closure issues related to keystone.

https://github.com/openstack/python-keystoneclient/commit/8fcacdc7c74f5ac68e8e55ea8c15918c452411fe

and

https://github.com/openstack/keystoneauth/commit/dbcbf414ac8423e97d77d0bda8157be5350530f0

I think the commit in keystoneauth is incomplete: it is missing the destination of the move of the _FakeRequestSession class, so it amounts to just a removal.

In Newton, horizon uses keystoneclient to get keystone sessions, and we don't see the connection leaks there.
Somewhere between Newton and Queens, horizon switched to keystoneauth and lost the _FakeRequestSession hack, so we are now seeing a regression where horizon leaks connections again.

I am attaching a patch that we are currently testing; it re-introduces the _FakeRequestSession class into keystoneauth.
The patch is a little naive: I don't know what to do with the "adapters" instance attribute of _FakeRequestSession, so I just set it to an empty list because it is referenced by code elsewhere. It seems to be working for now.
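
For reference, here is a minimal sketch of the idea behind _FakeRequestSession, assuming its purpose is simply to open a throwaway requests.Session per request and close it so pooled sockets are released immediately; the attached patch (not reproduced here) is the authoritative version, and the empty "adapters" list mirrors the workaround described above:

import requests

class _FakeRequestSession(object):
    """Sketch of a stand-in for a long-lived requests.Session.

    A fresh Session is opened per request and closed right away,
    so no pooled sockets (file descriptors) outlive the call.
    """

    def __init__(self):
        # Referenced by other code that expects a real Session;
        # left as an empty list here, as described above.
        self.adapters = []

    def request(self, *args, **kwargs):
        session = requests.Session()
        try:
            return session.request(*args, **kwargs)
        finally:
            # Release the connection pool and its sockets immediately.
            session.close()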

Changed in horizon:
status: New → Confirmed
Changed in keystoneauth:
status: New → Confirmed
Morgan Fainberg (mdrnstm) wrote:

We should fix keystoneclient. KeystoneAuth is not doing anything wrong here. I am against a "temp hack" like this. Secondly, please submit this patch to Gerrit at review.openstack.org so that it can be considered; patches posted here are less likely to be seen.

Changed in keystoneauth:
status: Confirmed → Won't Fix
Changed in python-keystoneclient:
status: New → Confirmed
importance: Undecided → Medium
Martin Chlumsky (martin-chlumsky) wrote:

Thanks for looking at this.

For the record, I don't like this hack either. However, horizon is using keystoneauth and not keystoneclient, so I don't see how fixing keystoneclient will help here. I still think it's keystoneauth that needs to be fixed.

Also, I hesitate to submit the patch to gerrit as it makes the tests fail when I run tox.

John-Paul Robinson (jprorama) wrote:

@mchlumsky, thanks for including the patch for session.py. We were able to use it in a packaged Rocky deployment and overcome the file descriptor leak associated with the keystoneauth connections. It resolved our "too many open files" errors, which were caused by unbounded growth of open files due to the lack of close() after connect()s to keystone.

Guang Yee (guang-yee) wrote:

I am also seeing the same issue: the /proc/<memcached pid>/fd/ count increases with each Horizon login. I agree we need to fix keystoneauth.

Jordan Callicoat (jcallicoat) wrote:

I've seen this issue in multiple production Rocky environments.

I found a commit in keystoneauth that properly closes requests sessions (which closes the socket/file handle):

https://opendev.org/openstack/keystoneauth/commit/b2b5ad3cb1ff05e08c22973bb079125214ba7bcf

This commit is included in keystoneauth1==3.18.0. I have tested upgrading to 3.18.0 and the initial outlook is good: I'm seeing Apache workers open new sockets/file handles and then close them as expected.

Will report back once I have deployed 3.18.0 in more environments and it has been running for a while.
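
For anyone backporting rather than upgrading, the essence of the fix is making sure the underlying requests.Session gets closed when it is no longer needed; a minimal illustration of the leaky versus fixed pattern (the function names here are illustrative only, not keystoneauth's actual API):

import requests

# Leaky pattern: a Session is created and then dropped without
# close(); its connection pool keeps sockets (file descriptors)
# open until garbage collection eventually runs.
def leaky_call(url):
    s = requests.Session()
    return s.get(url).status_code

# Fixed pattern: close the Session (here via the context manager)
# so the pooled connections and their file descriptors are released.
def clean_call(url):
    with requests.Session() as s:
        return s.get(url).status_code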
