Brief Description
-----------------
After many application operations (upload, apply, reapply, remove, delete, repeated about 50 times), the sysinv conductor runs out of file descriptors for new sockets. The sysinv log fills with exceptions and many sysinv commands start failing.
Severity
--------
Minor: System/Feature is usable with minor issue
Workaround: locking and then unlocking the host should clear the condition.
Steps to Reproduce
------------------
Repeatedly cycle an application through upload, apply, reapply, remove and delete (about 50 iterations).
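The cycle above could be driven by a small harness that also samples the conductor's open file descriptor count after each iteration, so that steady growth shows up before the limit is hit. This is only a sketch: `cycle_and_sample`, `looks_like_leak` and their parameters are hypothetical names, and the actual command lines for each application operation are deliberately left for the caller to supply.

```python
import subprocess
from typing import Callable, List, Sequence


def run_checked(argv: Sequence[str]) -> None:
    """Run one command and raise if it exits non-zero."""
    subprocess.run(list(argv), check=True)


def cycle_and_sample(
    commands: Sequence[Sequence[str]],
    sample_fd_count: Callable[[], int],
    iterations: int = 50,
    run: Callable[[Sequence[str]], None] = run_checked,
) -> List[int]:
    """Run every command once per iteration; record the fd count after each one.

    `commands` is the caller-supplied list of argv lists for one full cycle
    (upload, apply, reapply, remove, delete). `sample_fd_count` returns the
    current number of open fds of the process under observation.
    """
    samples = [sample_fd_count()]  # baseline before the first cycle
    for _ in range(iterations):
        for argv in commands:
            run(argv)
        samples.append(sample_fd_count())
    return samples


def looks_like_leak(samples: Sequence[int], min_growth_per_iteration: int = 1) -> bool:
    """True if the fd count grew by at least the given amount per iteration on average."""
    iterations = len(samples) - 1
    if iterations < 1:
        return False
    return samples[-1] - samples[0] >= min_growth_per_iteration * iterations
```

On Linux, `sample_fd_count` could be as simple as `lambda: len(os.listdir("/proc/<pid>/fd"))` for the sysinv-conductor PID, run with sufficient privileges.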
Expected Behavior
------------------
Sysinv-conductor should remain operational
Actual Behavior
----------------
System commands start being rejected and sysinv.log fills with exceptions (see Timestamp/Logs below).
Reproducibility
---------------
Seen once so far
System Configuration
--------------------
Multi-node system, Dedicated storage, IPv4
Branch/Pull Time/Commit
-----------------------
###
### StarlingX
### Built from master
###
OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190720T013000Z"
JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="186"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-07-20 01:30:00 +0000"
app tarball version:
stx-openstack-1.0-17-centos-stable-versioned.tgz
Timestamp/Logs
--------------
2019-07-26 17:32:50.587 524584 ERROR sysinv.openstack.common.rpc.amqp [req-c79048cd-8642-4823-9f79-eaab631a6e48 None None] Exception during message handling
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp Traceback (most recent call last):
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/amqp.py", line 438, in _process_data
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp **args)
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 8205, in ilvg_get_nova_ilvg_by_ihost
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp ilvgs = self.dbapi.ilvg_get_by_ihost(ihost_uuid)
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/objects/__init__.py", line 102, in wrapper
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp result = fn(*args, **kwargs)
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/db/sqlalchemy/api.py", line 3128, in ilvg_get_by_ihost
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp sort_key, sort_dir, query)
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/db/sqlalchemy/api.py", line 119, in _paginate_query
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp return query.all()
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2703, in all
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2855, in __iter__
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2876, in _execute_and_instances
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2885, in _get_bind_args
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2867, in _connection_from_session
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 998, in connection
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1005, in _connection_for_bind
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2112, in contextual_connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2151, in _wrap_pool_connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1461, in _handle_dbapi_exception_noconnection
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2147, in _wrap_pool_connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 387, in connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 766, in _checkout
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 516, in checkout
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1138, in _do_get
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 1135, in _do_get
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 333, in _create_connection
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 461, in __init__
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 651, in __connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 105, in connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 393, in connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/psycopg2/__init__.py", line 164, in connect
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp OperationalError: (psycopg2.OperationalError) could not create socket: Too many open files
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp
2019-07-26 17:32:50.587 524584 TRACE sysinv.openstack.common.rpc.amqp
2019-07-26 17:32:50.589 524584 ERROR sysinv.openstack.common.rpc.common [req-c79048cd-8642-4823-9f79-eaab631a6e48 None None] Returning exception (psycopg2.OperationalError) could not create socket: Too many open files
to caller
The soft limit for max open files appears to be set to 1024:
cat /proc/1914392/limits
Limit                     Soft Limit           Hard Limit           Units
...
Max open files            1024                 4096                 files
and sysinv-conductor has hit that limit:
ls -l /proc/524584/fd | wc -l
1025
The vast majority of these are sockets:
...
lrwx------ 1 root root 64 Jul 26 15:02 974 -> socket:[106351332]
lrwx------ 1 root root 64 Jul 26 15:02 975 -> socket:[106351333]
lrwx------ 1 root root 64 Jul 26 15:02 976 -> socket:[106406580]
...
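The manual checks above (reading the limits file, counting entries under /proc/<pid>/fd, and eyeballing the link targets) can be combined into one report. The following is a minimal sketch for Linux, not part of sysinv; all function names are hypothetical. Note that `ls -l` prints a leading "total" line, so a `wc -l` result of 1025 corresponds to 1024 descriptors, which matches the soft limit.

```python
import os
import re
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def classify_fd_targets(targets: Iterable[str]) -> Counter:
    """Bucket readlink() targets from /proc/<pid>/fd by kind.

    Targets such as 'socket:[106351332]' or 'pipe:[123]' are counted under
    their prefix; anything else is counted as 'file'.
    """
    kinds: Counter = Counter()
    for target in targets:
        match = re.match(r"(socket|pipe|anon_inode):", target)
        kinds[match.group(1) if match else "file"] += 1
    return kinds


def parse_max_open_files(limits_text: str) -> Optional[Tuple[int, int]]:
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Returns None if the row is missing or either value is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            if "unlimited" in (soft, hard):
                return None
            return int(soft), int(hard)
    return None


def read_fd_targets(pid: int) -> List[str]:
    """Read the symlink target of every open fd of `pid` (needs same user or root)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            pass  # the fd was closed between listdir() and readlink()
    return targets


def fd_report(pid: int) -> str:
    """Summarize open fds of `pid` against its soft limit."""
    targets = read_fd_targets(pid)
    with open(f"/proc/{pid}/limits") as handle:
        limits = parse_max_open_files(handle.read())
    soft = limits[0] if limits else "unknown"
    kinds = dict(classify_fd_targets(targets))
    return f"pid {pid}: {len(targets)} open fds (soft limit {soft}), by kind: {kinds}"
```

Calling `print(fd_report(524584))` as root on the affected controller would give the same information as the commands above in a single line.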
Test Activity
-------------
Developer Testing
Marking as medium priority / stx.2.0: the issue occurs as a result of repetitive/stress testing and is recoverable by manual action.