With 500 compute host and 3500 instances could see
nova-api,schedular and conductor logs with ~3500 DBConnectionError .
From debuggability point of view this need to be handled.
root@photon-machine [ /var/log/vio-support_8-6-2019_8-39-10 ]# grep -r CR.CR_SERVER_LOST | wc
3441 48174 877412
Traceback:
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines [req-65e0c9fe-ffc3-470b-b5f9-a82089a43ee8 3c34bf881a264fd1bc362598259ea813 b34079bf2ee9448780d0630f98759a76 - default default] Database connection was found disconnected; reconnecting: DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 73, in _connect_ping_listener
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines connection.scalar(select([1]))
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 912, in scalar
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines return self.execute(object_, *multiparams, **params).scalar()
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 980, in execute
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines return meth(self, multiparams, params)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 273, in _execute_on_connection
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines return connection._execute_clauseelement(self, multiparams, params)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1099, in _execute_clauseelement
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines distilled_params,
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines e, statement, parameters, cursor, context
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1456, in _handle_dbapi_exception
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines util.raise_from_cause(newraise, exc_info)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines reraise(type(exception), exception, tb=exc_tb, cause=cause)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines cursor, statement, parameters, context
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines cursor.execute(statement, parameters)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines result = self._query(query)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines conn.query(q)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 517, in query
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 732, in _read_query_result
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines result.read()
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1075, in read
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines first_packet = self.connection._read_packet()
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 657, in _read_packet
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines packet_header = self._read_bytes(4)
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 707, in _read_bytes
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
2019-08-05T08:00:11Z 2019-08-05 08:00:11.282 13 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)
Have you tried to set connection_ recycle_ time on client side(nova_api, scheduler, conductor) to value lower then wait_timeout on mariadb side?