From http://logs.openstack.org/65/544065/2/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/da0522d/logs
The job failed because the deployment process got a 500 from keystone when registering endpoints. In digging into this failure, it was because the mysql connection went away. In looking into why mysql went away, it appears that the mysql monitor failed and mysql was restarted.
http://logs.openstack.org/65/544065/2/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/da0522d/logs/subnode-2/var/log/pacemaker/bundles/galera-bundle-0/pacemaker.log.txt.gz
Feb 20 18:37:24 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: log_finished: finished - rsc:galera action:promote call_id:68 pid:576 exit-code:0 exec-time:10084ms queue-time:1ms
Feb 20 18:50:18 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: warning: child_timeout_callback: galera_monitor_10000 process (PID 5447) timed out
Feb 20 18:50:18 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: warning: operation_finished: galera_monitor_10000:5447 - timed out after 30000ms
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: lrmd_remote_client_msg: Client disconnect detected in tls msg dispatcher.
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: cancel_recurring_action: Cancelling ocf operation galera_monitor_10000
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: error: crm_send_tls: Connection terminated rc = -53
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: error: crm_remote_send: Failed to send remote msg, rc = -53
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: error: lrmd_tls_send_msg: Failed to send remote lrmd tls msg, rc = -53
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: warning: send_client_notify: Notification of client remote-lrmd-centos-7-inap-mtl01-0002637151:3123/1cd9b0b5-3976-483d-ba4b-4b405ac6c63f failed
Feb 20 18:50:40 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: notice: lrmd_remote_client_destroy: LRMD client disconnecting remote client - name: remote-lrmd-centos-7-inap-mtl01-0002637151:3123 id: 1cd9b0b5-3976-483d-ba4b-4b405ac6c63f
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: crm_log_init: Changed active directory to /var/lib/pacemaker/cores
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: lrmd
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3123.
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: notice: bind_and_listen: Listening on address ::
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_ro
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_rw
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_shm
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: attrd
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: stonith-ng
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: qb_ipcs_us_publish: server name: crmd
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: main: Starting
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: crm_remote_accept: New remote connection from ::ffff:192.168.24.3
Feb 20 18:50:52 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: notice: lrmd_remote_listen: LRMD client connection established. 0x56272f160d70 id: 65e0cf11-e2ce-4e59-ad03-36accac427e7
Feb 20 18:50:53 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: process_lrmd_get_rsc_info: Resource 'galera' not found (0 active resources)
Feb 20 18:50:53 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: process_lrmd_get_rsc_info: Resource 'galera:0' not found (0 active resources)
Feb 20 18:50:53 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: process_lrmd_rsc_register: Added 'galera' to the rsc list (1 active resources)
Feb 20 18:50:54 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: notice: check_sbd_timeout: Watchdog may be enabled but stonith-watchdog-timeout is disabled: (null)
Feb 20 18:50:54 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: check_sbd_timeout: Watchdog functionality is consistent: (null) delay exceeds timeout of -1ms
Feb 20 18:50:55 [13] centos-7-inap-mtl01-0002637151 pacemaker_remoted: info: log_execute: executing - rsc:galera action:start call_id:13
http://logs.openstack.org/65/544065/2/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/da0522d/logs/subnode-2/var/log/containers/keystone/keystone.log.txt.gz#_2018-02-20_18_50_18_174
2018-02-20 18:50:18.174 23 ERROR keystone.common.wsgi [req-debd0ebd-196f-441a-822a-62d67fa67d6f - - - - -] (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query'): DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
http://logs.openstack.org/65/544065/2/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/da0522d/logs/postci.txt.gz
"Error: Execution of '/usr/bin/openstack endpoint create --format shell f428d84729514400844ee45ba9b2bca0 public http://192.168.24.7:8776/v2/%(tenant_id)s --region regionOne' returned 1: An unexpected error prevented the server from fulfilling your request. (HTTP 500) (Request-ID: req-debd0ebd-196f-441a-822a-62d67fa67d6f)",
A couple of drive-by thoughts on this one. 7-inap- mtl01-000263715 1 lrmd: warning: child_timeout_ callback: haproxy- bundle- docker- 0_monitor_ 60000 process (PID 86216) timed out 7-inap- mtl01-000263715 1 lrmd: warning: operation_finished: haproxy- bundle- docker- 0_monitor_ 60000:86216 - timed out after 20000ms 7-inap- mtl01-000263715 1 lrmd: warning: child_timeout_ callback: rabbitmq- bundle- docker- 0_monitor_ 60000 process (PID 86274) timed out 7-inap- mtl01-000263715 1 lrmd: warning: child_timeout_ callback: galera- bundle- docker- 0_monitor_ 60000 process (PID 86434) timed out 7-inap- mtl01-000263715 1 lrmd: warning: child_timeout_ callback: redis-bundle- docker- 0_monitor_ 60000 process (PID 86625) timed out 7-inap- mtl01-000263715 1 lrmd: warning: operation_finished: galera- bundle- docker- 0_monitor_ 60000:86434 - timed out after 20000ms 7-inap- mtl01-000263715 1 crmd: error: process_lrm_event: Result of monitor operation for galera on galera-bundle-0: Timed Out | call=139 key=galera_ monitor_ 10000 timeout=30000ms 7-inap- mtl01-000263715 1 lrmd: warning: operation_finished: redis-bundle- docker- 0_monitor_ 60000:86625 - timed out after 20000ms 7-inap- mtl01-000263715 1 lrmd: warning: operation_finished: rabbitmq- bundle- docker- 0_monitor_ 60000:86274 - timed out after 20000ms 7-inap- mtl01-000263715 1 crmd: error: process_lrm_event: Result of monitor operation for haproxy- bundle- docker- 0 on centos- 7-inap- mtl01-000263715 1: Timed Out | call=52 key=haproxy- bundle- docker- 0_monitor_ 60000 timeout=20000ms 7-inap- mtl01-000263715 1 crmd: error: process_lrm_event: Result of monitor operation for galera- bundle- docker- 0 on centos- 7-inap- mtl01-000263715 1: Timed Out | call=22 key=galera- bundle- docker- 0_monitor_ 60000 timeout=20000ms 7-inap- mtl01-000263715 1 crmd: error: process_lrm_event: Result of monitor operation for redis-bundle- docker- 0 on centos- 7-inap- mtl01-000263715 1: Timed Out | call=33 key=redis- bundle- docker- 0_monitor_ 60000 timeout=20000ms 7-inap- mtl01-000263715 1 crmd: error: process_lrm_event: Result of monitor operation for rabbitmq- bundle- docker- 0 on centos- 7-inap- mtl01-000263715 1: Timed Out | call=11 key=rabbitmq- bundle- docker- 0_monitor_ 60000 timeout=20000ms 7-inap- mtl01-000263715 1 crmd: info: throttle_ check_threshold s: Moderate CPU load detected: 12.020000 7-inap- mtl01-000263715 1 crmd: info: throttle_ send_command: New throttle mode: 0010 (was 0000)
It seems that the issue is more around slowness in general as pacemaker times out on the monitoring of all its services:
Feb 20 18:50:14 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15128] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15131] centos-
Feb 20 18:50:38 [15131] centos-
What is interesting though is that it times out monitoring on all docker services and not the (IPs)...