Comment 13 for bug 1761536

David Ames (thedac) wrote :

The root cause of the instance boot failures is AppArmor on the neutron-gateway blocking the neutron agents from creating temporary directories:

1vlRG/" pid=1412869 comm="neutron-dhcp-ag" requested_mask="c" denied_mask="c" fsuid=115 ouid=115
[76035.437502] audit: type=1400 audit(1524677252.781:36019): apparmor="DENIED" operation="mkdir" profile="/usr/bin/neutron-dhcp-agent" name="/tmp/tmp4AIVtB/" pid=1412869 comm="neutron-dhcp-ag" requested_mask="c" denied_mask="c" fsuid=115 ouid=115

Both the dhcp-agent and the l3-agent show the problem.
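
To check which agents are affected, the denials in the kernel log can be tallied by profile. This is a minimal sketch, assuming the audit lines have the key=value layout shown above; the helper names parse_denial and count_by_profile are hypothetical and not part of any charm or AppArmor tool.

```python
import re
from collections import Counter

# Matches key="value" and key=value pairs in an audit record.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_denial(line):
    """Return the fields of an AppArmor DENIED line as a dict, else None."""
    if 'apparmor="DENIED"' not in line:
        return None
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        # Exactly one of the two value groups is populated per match.
        fields[key] = quoted if quoted else bare
    return fields


def count_by_profile(lines):
    """Count denials per (profile, operation) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields is not None:
            counts[(fields.get("profile"), fields.get("operation"))] += 1
    return counts
```

Feeding it the output of dmesg or the contents of the kernel log should list one entry per agent profile and denied operation, for example the mkdir denials for /usr/bin/neutron-dhcp-agent.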

Assigning this bug to neutron-gateway for the AppArmor bug.

A secondary issue, which has not yet been root-caused, is DBConnectionError errors from all the API charms connecting to the percona cluster. After changing the neutron-gateway aa-profile-mode to complain, we saw these errors much less frequently, but they did not go away entirely.

2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines [req-dcf87632-8b6e-4071-a336-64b1442dc7fe - - - - -] Database connection was found disconnected; reconnecting: DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1']
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/oslo_db/sqlalchemy/engines.py", line 73, in _connect_ping_listener
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines connection.scalar(select([1]))
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 877, in scalar
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines return self.execute(object, *multiparams, **params).scalar()
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 945, in execute
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines return meth(self, multiparams, params)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines return connection._execute_clauseelement(self, multiparams, params)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines compiled_sql, distilled_params
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines context)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1398, in _handle_dbapi_exception
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines util.raise_from_cause(newraise, exc_info)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines reraise(type(exception), exception, tb=exc_tb, cause=cause)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines context)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 470, in do_execute
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines cursor.execute(statement, parameters)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 165, in execute
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines result = self._query(query)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 321, in _query
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines conn.query(q)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 860, in query
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1061, in _read_query_result
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines result.read()
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1349, in read
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines first_packet = self.connection._read_packet()
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 991, in _read_packet
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines packet_header = self._read_bytes(4)
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1037, in _read_bytes
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1']
2018-04-25 17:49:33.562 617800 ERROR oslo_db.sqlalchemy.engines
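
To put a number on "much less frequently", the reconnect events can be counted per hour before and after the profile change. This is a rough sketch, assuming the timestamp prefix and message text shown in the log above; disconnects_per_hour is a hypothetical helper, not an oslo.db function.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the oslo log lines above;
# the capture group keeps the date and the hour only.
TS_RE = re.compile(r'^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}\.\d+ ')

# Only the first line of each event carries this text, so the
# traceback lines that follow are not counted again.
MARKER = "Database connection was found disconnected"


def disconnects_per_hour(lines):
    """Count reconnect events per 'YYYY-MM-DD HH' bucket."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TS_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Running it over each API service log and comparing the buckets around the time of the aa-profile-mode change would show whether the drop is real or anecdotal.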

Next steps:
Please redeploy with neutron-gateway aa-profile-mode=complain while we fix the AppArmor profile bug. The current deploy has been tainted by all of our debugging attempts. Though we are able to launch instances, it no longer represents a valid test.