MySQL did not crash as described in the issue; it hit a split-brain:
./ocf-mysql-wss.log:994:2016-09-05T01:07:20.281618+00:00 err: ERROR: p_mysqld: check_if_galera_pc(): But I'm running a new cluster, PID:11395, this is a split-brain!
and pacemaker gracefully stopped mysqld on node-1:
2016-09-05T01:07:20.511963+00:00 err: 2016-09-05 01:07:20 11395 [Note] /usr/sbin/mysqld: Normal shutdown
about 17 seconds later it started again:
2016-09-05T01:07:37.569858+00:00 err: 2016-09-05 01:07:37 0 [Note] /usr/sbin/mysqld (mysqld 5.6.30-0~u14.04+mos1) starting as process 13109 ...
about 4 seconds after that, the Galera cluster was ready:
2016-09-05T01:07:41.697481+00:00 err: 2016-09-05 01:07:41 13262 [Note] WSREP: New cluster view: global state: 9f259a0e-72ff-11e6-aeb9-fab6b3c5476e:4265, view# 5: Primary, number of nodes: 3, my index: 0, protocol version 3
2016-09-05T01:07:41.697481+00:00 err: 2016-09-05 01:07:41 13262 [Note] WSREP: SST complete, seqno: 4265
so mysqld on node-1 was unreachable for about 21 seconds, from 01:07:20 to 01:07:41. The murano-client made its request right in this window and got an error:
2016-09-05T01:07:33.274404+00:00 info: HTTPInternalServerError: {"explanation": "The server has either erred or is incapable of performing the requested operation.", "code": 500, "error": {"message": "
(_mysql_exceptions.OperationalError) (2013, \"Lost connection to MySQL server at 'reading initial communication packet', system error: 0\") [SQL: u'SELECT 1']"
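To double-check the timing, here is a small sketch that takes the three timestamps from the log lines above (mysqld shutdown, cluster ready, failed request) and confirms that the request falls inside the outage. The helper name `parse` is just for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"


def parse(ts):
    # Drop the trailing "+00:00" offset; every timestamp here is UTC.
    return datetime.strptime(ts[:-6], FMT)


shutdown = parse("2016-09-05T01:07:20.511963+00:00")  # Normal shutdown
ready = parse("2016-09-05T01:07:41.697481+00:00")     # SST complete
request = parse("2016-09-05T01:07:33.274404+00:00")   # HTTP 500 seen by client

outage_seconds = (ready - shutdown).total_seconds()
request_in_window = shutdown < request < ready

print("outage: %.1f s, request inside window: %s"
      % (outage_seconds, request_in_window))
```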
the murano-client in the OSTF tests should retry failed requests.
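A minimal sketch of what such retries could look like, assuming the test wraps its client call in a helper. `call_with_retries` and its parameters are hypothetical names, not part of murano-client or OSTF; the defaults are only chosen so that the total wait (2 + 4 + 8 + 16 = 30 seconds) covers an outage like the 21-second one seen here.

```python
import time


def call_with_retries(func, retries=4, delay=2.0, backoff=2.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() and retry on failure with exponential backoff.

    Hypothetical helper: retries up to `retries` times, sleeping
    delay, delay*backoff, ... seconds between attempts, and re-raises
    the last exception once the retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            attempt += 1
            if attempt > retries:
                raise
            sleep(delay)
            delay *= backoff
```

In a test this would wrap the request that failed, e.g. `call_with_retries(lambda: do_request())`, where `do_request` stands for whatever murano-client call the test makes. Ideally `retry_on` would be narrowed to the client's HTTP 500 exception so that real failures are not masked.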