This is what I see when I check whether the issue is happening, using `docker logs memcached`. I see some variation of this across all of the Docker containers.
Just to make sure: did you happen to deploy on CentOS 8? I'm wondering whether it is related to that.
Too many open connections
accept4(): Too many open files
Too many open connections
accept4(): Too many open files
Too many open connections
accept4(): Too many open files
Too many open connections
accept4(): Too many open files
Too many open connections
getpeername: Transport endpoint is not connected
Failed to write, and not due to blocking: Broken pipe
accept4(): Too many open files
Too many open connections
accept4(): Too many open files
Too many open connections
accept4(): Too many open files
Too many open connections
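To see at a glance which message dominates, the output above can be tallied. This is only a minimal sketch, assuming the container output has been captured as text (for example with `docker logs memcached 2>&1`); `tally_messages` and `sample` are hypothetical names used for illustration, not part of any tool.

```python
from collections import Counter


def tally_messages(log_text):
    """Count how often each distinct non-empty line appears in the log text."""
    lines = (line.strip() for line in log_text.splitlines())
    return Counter(line for line in lines if line)


# Hypothetical sample: a few lines copied from the memcached output above.
sample = """\
Too many open connections
accept4(): Too many open files
Too many open connections
getpeername: Transport endpoint is not connected
"""

# Print the messages from most to least frequent.
for message, count in tally_messages(sample).most_common():
    print(f"{count:4d}  {message}")
```

A tally like this only summarizes the symptom; it does not say which client is holding the connections open.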
Here is a Keystone example from the moment that starts to happen:
keystone-apache-public-error.log:2020-11-05 11:25:52.261790 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
keystone-apache-public-error.log:2020-11-05 11:25:52.261793 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')
keystone-apache-public-error.log:2020-11-05 11:25:52.261794 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines
keystone-apache-public-error.log:2020-11-05 11:25:52.261796 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines The above exception was the direct cause of the following exception:
keystone-apache-public-error.log:2020-11-05 11:25:52.261798 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines
keystone-apache-public-error.log:2020-11-05 11:25:52.261799 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
keystone-apache-public-error.log:2020-11-05 11:25:52.261801 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_db/sqlalchemy/engines.py", line 73, in _connect_ping_listener
keystone-apache-public-error.log:2020-11-05 11:25:52.261803 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines connection.scalar(select([1]))
keystone-apache-public-error.log:2020-11-05 11:25:52.261805 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 914, in scalar
keystone-apache-public-error.log:2020-11-05 11:25:52.261807 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines return self.execute(object_, *multiparams, **params).scalar()
keystone-apache-public-error.log:2020-11-05 11:25:52.261809 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 984, in execute
keystone-apache-public-error.log:2020-11-05 11:25:52.261813 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines return meth(self, multiparams, params)
keystone-apache-public-error.log:2020-11-05 11:25:52.261814 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
keystone-apache-public-error.log:2020-11-05 11:25:52.261816 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines return connection._execute_clauseelement(self, multiparams, params)
keystone-apache-public-error.log:2020-11-05 11:25:52.261818 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1103, in _execute_clauseelement
keystone-apache-public-error.log:2020-11-05 11:25:52.261820 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines distilled_params,
keystone-apache-public-error.log:2020-11-05 11:25:52.261822 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1288, in _execute_context
keystone-apache-public-error.log:2020-11-05 11:25:52.261824 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines e, statement, parameters, cursor, context
keystone-apache-public-error.log:2020-11-05 11:25:52.261826 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1479, in _handle_dbapi_exception
keystone-apache-public-error.log:2020-11-05 11:25:52.261828 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines util.raise_(newraise, with_traceback=exc_info[2], from_=e)
keystone-apache-public-error.log:2020-11-05 11:25:52.261830 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
keystone-apache-public-error.log:2020-11-05 11:25:52.261832 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines raise exception
keystone-apache-public-error.log:2020-11-05 11:25:52.261834 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
keystone-apache-public-error.log:2020-11-05 11:25:52.261836 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines cursor, statement, parameters, context
keystone-apache-public-error.log:2020-11-05 11:25:52.261837 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
keystone-apache-public-error.log:2020-11-05 11:25:52.261840 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines cursor.execute(statement, parameters)
keystone-apache-public-error.log:2020-11-05 11:25:52.261841 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/cursors.py", line 170, in execute
keystone-apache-public-error.log:2020-11-05 11:25:52.261843 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines result = self._query(query)
keystone-apache-public-error.log:2020-11-05 11:25:52.261845 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/cursors.py", line 328, in _query
keystone-apache-public-error.log:2020-11-05 11:25:52.261847 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines conn.query(q)
keystone-apache-public-error.log:2020-11-05 11:25:52.261849 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/connections.py", line 517, in query
keystone-apache-public-error.log:2020-11-05 11:25:52.261851 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines self._affected_rows = self._read_query_result(unbuffered=unbuffered)
keystone-apache-public-error.log:2020-11-05 11:25:52.261853 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/connections.py", line 732, in _read_query_result
keystone-apache-public-error.log:2020-11-05 11:25:52.261855 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines result.read()
keystone-apache-public-error.log:2020-11-05 11:25:52.261856 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/connections.py", line 1075, in read
keystone-apache-public-error.log:2020-11-05 11:25:52.261860 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines first_packet = self.connection._read_packet()
keystone-apache-public-error.log:2020-11-05 11:25:52.261861 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/connections.py", line 657, in _read_packet
keystone-apache-public-error.log:2020-11-05 11:25:52.261863 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines packet_header = self._read_bytes(4)
keystone-apache-public-error.log:2020-11-05 11:25:52.261865 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines File "/var/lib/kolla/venv/lib/python3.6/site-packages/pymysql/connections.py", line 707, in _read_bytes
keystone-apache-public-error.log:2020-11-05 11:25:52.261867 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
keystone-apache-public-error.log:2020-11-05 11:25:52.261869 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
keystone-apache-public-error.log:2020-11-05 11:25:52.261871 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines [SQL: SELECT 1]
keystone-apache-public-error.log:2020-11-05 11:25:52.261873 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines (Background on this error at: http://sqlalche.me/e/e3q8)
keystone-apache-public-error.log:2020-11-05 11:25:52.261877 2020-11-05 11:25:52.259 40 ERROR oslo_db.sqlalchemy.engines \x1b[00
I see this in the MariaDB logs:
2020-11-05 23:59:55 102752 [Warning] Aborted connection 102752 to db: 'cinder' user: 'cinder' host: 'ctl-os2.hpc.siue.edu' (Got an error reading communication packets)
2020-11-05 23:59:57 102756 [Warning] Aborted connection 102756 to db: 'cinder' user: 'cinder' host: 'ctl-os2.hpc.siue.edu' (Got an error reading communication packets)
2020-11-05 23:59:59 102760 [Warning] Aborted connection 102760 to db: 'cinder' user: 'cinder' host: 'ctl-os2.hpc.siue.edu' (Got an error reading communication packets)
As I'm sure you can imagine, once Keystone hangs because of the open-file issue, the cluster pretty much dies.