When controller-1 is rebooted at 15:01:52, nova-api on controller-0 loses access to mariadb, which prevents any of the nova-related CLI commands (e.g. server list) from running. For example:
{"log":"2019-05-21 15:02:18.491 1 ERROR oslo_db.sqlalchemy.exc_filters [req-2c25a5f7-22c7-4736-8c06-5d8c10e5c159 53347a9c6a7340f5be77da8cd6f2b3d8 efe5dd8fd94c428fa1eac07d31dc4184 - default default] DBAPIError exception wrapped from (pymysql.err.InternalError) (1047, u'WSREP has not yet prepared node for application use') [SQL: u'SELECT instance_mappings.created_at AS instance_mappings_created_at, instance_mappings.updated_at AS instance_mappings_updated_at, instance_mappings.id AS instance_mappings_id, instance_mappings.instance_uuid AS instance_mappings_instance_uuid, instance_mappings.cell_id AS instance_mappings_cell_id, instance_mappings.project_id AS instance_mappings_project_id, instance_mappings.user_id AS instance_mappings_user_id, instance_mappings.queued_for_delete AS instance_mappings_queued_for_delete, cell_mappings_1.created_at AS cell_mappings_1_created_at, cell_mappings_1.updated_at AS cell_mappings_1_updated_at, cell_mappings_1.id AS cell_mappings_1_id, cell_mappings_1.uuid AS cell_mappings_1_uuid, cell_mappings_1.name AS cell_mappings_1_name, cell_mappings_1.transport_url AS cell_mappings_1_transport_url, cell_mappings_1.database_connection AS cell_mappings_1_database_connection, cell_mappings_1.disabled AS cell_mappings_1_disabled \\nFROM instance_mappings LEFT OUTER JOIN cell_mappings AS cell_mappings_1 ON instance_mappings.cell_id = cell_mappings_1.id \\nWHERE instance_mappings.instance_uuid = %(instance_uuid_1)s \\n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'instance_uuid_1': u'6e8a1aea-b316-4c5d-af48-5d32ffe0a443'}] (Background on this error at: http://sqlalche.me/e/2j85): InternalError: (1047, u'WSREP has not yet prepared node for application use')\n","stream":"stdout","time":"2019-05-21T15:02:18.492940446Z"}
The mariadb cluster is down to a single node, and that node is non-Primary, so the database is not functional:
{"log":"2019-05-21 15:01:58,543 - OpenStack-Helm Mariadb - INFO - 2019-05-21 15:01:58 139633481164544 [Note] WSREP: New cluster view: global state: 09649517-7ba1-11e9-8a1c-6e7cd8f18acb:42492, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3\n","stream":"stderr","time":"2019-05-21T15:01:58.543232866Z"}
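When scanning the container logs for this condition, the cluster view message can be picked out programmatically. The following is only a minimal sketch, assuming the JSON-per-line log format shown above; the function name parse_cluster_view and the regular expression are illustrative and not part of any StarlingX or OpenStack-Helm tooling.

```python
import json
import re

# Matches the Galera "New cluster view" message as it appears in the log above.
# The pattern is an assumption based on that single sample line.
VIEW_RE = re.compile(
    r"WSREP: New cluster view: .*?view# -?\d+: (?P<status>[\w-]+), "
    r"number of nodes: (?P<nodes>\d+)"
)


def parse_cluster_view(json_line):
    """Return (status, node_count) for a cluster view log line, else None."""
    record = json.loads(json_line)
    match = VIEW_RE.search(record["log"])
    if match is None:
        return None
    return match.group("status"), int(match.group("nodes"))


# Sample built from the mariadb log line quoted above.
sample = json.dumps({
    "log": "2019-05-21 15:01:58 139633481164544 [Note] WSREP: New cluster view: "
           "global state: 09649517-7ba1-11e9-8a1c-6e7cd8f18acb:42492, "
           "view# -1: non-Primary, number of nodes: 1, my index: 0, "
           "protocol version 3\n",
    "stream": "stderr",
    "time": "2019-05-21T15:01:58.543232866Z",
})

print(parse_cluster_view(sample))  # ('non-Primary', 1)
```

A result of non-Primary with one node corresponds to the state described above, in which queries are rejected with error 1047.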
There are other issues after that, but the trigger was the mariadb problem. I can see this log constantly coming out on both controllers:
2019-05-21T15:00:13.530 controller-0 OCF_dbmon(dbmon)[2903462]: info INFO: Unable to get mariadb password. Exiting.
This is due to a bug in the dbmon ocf script: https://bugs.launchpad.net/starlingx/+bug/1826891
I am marking this LP as a duplicate of the dbmon issue (LP1826891). Once that issue is fixed, please re-test.