I was able to recreate with a set physical servers and here are the logs. For the mysqld logs, I found that they go to the syslog so I've extracted these.
From mysqld logs, there's this exception:
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: caught exception in PC, state dump to stderr follows:
Nov 18 06:36:12 prunes mysqld: pc::Proto{uuid=a8c811b1,start_prim=0,npvo=0,ignore_sb=0,ignore_quorum=0,state=1,last_sent_seq=0,checksum=0,instances=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: ,state_msgs=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: ,current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),pc_view=view((empty)),mtu=2147483647}
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: {v=0,t=1,ut=255,o=4,s=0,sr=0,as=-1,f=0,src=a8c811b1,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=8,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: } 64
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: exception caused by message: {v=0,t=3,ut=255,o=1,s=0,sr=-1,as=0,f=4,src=9ca2cdf9,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=42,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: }
Nov 18 06:36:12 prunes mysqld: state after handling message: evs::proto(evs::proto(a8c811b1, OPERATIONAL, view_id(REG,9ad0acfb,4)), OPERATIONAL) {
Nov 18 06:36:12 prunes mysqld: current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),
Nov 18 06:36:12 prunes mysqld: input_map=evs::input_map: {aru_seq=0,safe_seq=0,node_index=node: {idx=0,range=[1,0],safe_seq=0} node: {idx=1,range=[1,0],safe_seq=0} node: {idx=2,range=[1,0],safe_seq=0} },
Nov 18 06:36:12 prunes mysqld: fifo_seq=10,
Nov 18 06:36:12 prunes mysqld: last_sent=0,
Nov 18 06:36:12 prunes mysqld: known:
Nov 18 06:36:12 prunes mysqld: 9ad0acfb at tcp://10.245.0.209:4567
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=90,}
Nov 18 06:36:12 prunes mysqld: 9ca2cdf9 at tcp://10.245.0.207:4567
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=42,}
Nov 18 06:36:12 prunes mysqld: a8c811b1 at
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=-1,}
Nov 18 06:36:12 prunes mysqld: }2015-11-18 06:36:12 75623 [ERROR] WSREP: failed to open gcomm backend connection: 131: a8c811b1 last prims not consistent (FATAL)
Nov 18 06:36:12 prunes mysqld: #011 at gcomm/src/pc_proto.cpp:is_prim():815
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -131 (State not recoverable)
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'juju_cluster' at 'gcomm://10.245.0.208,10.245.0.207,10.245.0.209': -131 (State not recoverable)
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs connect failed: State not recoverable
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: wsrep::connect(gcomm://10.245.0.208,10.245.0.207,10.245.0.209) failed: 7
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] Aborting
Nov 18 06:36:12 prunes mysqld:
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: Service disconnected.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] WSREP: Some threads may fail to exit.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] Binlog end
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] /usr/sbin/mysqld: Shutdown complete
Nov 18 06:36:13 prunes mysqld:
Nov 18 06:36:13 prunes mysqld_safe: mysqld from pid file /var/lib/mysql/prunes.pid ended
One thing to note is that there are no 'failure to start' for the node with the failure.
I was able to recreate with a set physical servers and here are the logs. For the mysqld logs, I found that they go to the syslog so I've extracted these.
From mysqld logs, there's this exception:
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: caught exception in PC, state dump to stderr follows: uuid=a8c811b1, start_prim= 0,npvo= 0,ignore_ sb=0,ignore_ quorum= 0,state= 1,last_ sent_seq= 0,checksum= 0,instances= prim=1, un=0,last_ seq=3,last_ prim=view_ id(PRIM, 9ad0acfb, 3),to_seq= 18,weight= 1,segment= 0 prim=1, un=0,last_ seq=8,last_ prim=view_ id(PRIM, 9ca2cdf9, 1),to_seq= 7,weight= 1,segment= 0 prim=0, un=0,last_ seq=4294967295, last_prim= view_id( NON_PRIM, 00000000, 0),to_seq= -1,weight= 1,segment= 0 prim=1, un=0,last_ seq=3,last_ prim=view_ id(PRIM, 9ad0acfb, 3),to_seq= 18,weight= 1,segment= 0 prim=1, un=0,last_ seq=8,last_ prim=view_ id(PRIM, 9ca2cdf9, 1),to_seq= 7,weight= 1,segment= 0 prim=0, un=0,last_ seq=4294967295, last_prim= view_id( NON_PRIM, 00000000, 0),to_seq= -1,weight= 1,segment= 0 view=view( view_id( REG,9ad0acfb, 4) memb { view((empty) ),mtu=214748364 7} 1,ut=255, o=4,s=0, sr=0,as= -1,f=0, src=a8c811b1, srcvid= view_id( REG,9ad0acfb, 4),insvid= view_id( UNKNOWN, 00000000, 0),ru=00000000, r=[-1,- 1],fs=8, nl=( 3,ut=255, o=1,s=0, sr=-1,as= 0,f=4,src= 9ca2cdf9, srcvid= view_id( REG,9ad0acfb, 4),insvid= view_id( UNKNOWN, 00000000, 0),ru=00000000, r=[-1,- 1],fs=42, nl=( evs::proto( a8c811b1, OPERATIONAL, view_id( REG,9ad0acfb, 4)), OPERATIONAL) { view=view( view_id( REG,9ad0acfb, 4) memb { evs::input_ map: {aru_seq= 0,safe_ seq=0,node_ index=node: {idx=0, range=[ 1,0],safe_ seq=0} node: {idx=1, range=[ 1,0],safe_ seq=0} node: {idx=2, range=[ 1,0],safe_ seq=0} }, 245.0.209: 4567 0,i=1,fs= 90,} 245.0.207: 4567 0,i=1,fs= 42,} 0,i=1,fs= -1,} pc_proto. cpp:is_ prim(): 815 gcs_core. cpp:gcs_ core_open( ):208: Failed to open backend connection: -131 (State not recoverable) gcs.cpp: gcs_open( ):1379: Failed to open channel 'juju_cluster' at 'gcomm: //10.245. 0.208,10. 245.0.207, 10.245. 0.209': -131 (State not recoverable) connect( gcomm:/ /10.245. 0.208,10. 245.0.207, 10.245. 0.209) failed: 7 mysql/prunes. pid ended
Nov 18 06:36:12 prunes mysqld: pc::Proto{
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,
Nov 18 06:36:12 prunes mysqld: ,state_msgs=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ad0acfb,
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ca2cdf9,
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#011a8c811b1,
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: ,current_
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),pc_view=
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: {v=0,t=
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: } 64
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: exception caused by message: {v=0,t=
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: }
Nov 18 06:36:12 prunes mysqld: state after handling message: evs::proto(
Nov 18 06:36:12 prunes mysqld: current_
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),
Nov 18 06:36:12 prunes mysqld: input_map=
Nov 18 06:36:12 prunes mysqld: fifo_seq=10,
Nov 18 06:36:12 prunes mysqld: last_sent=0,
Nov 18 06:36:12 prunes mysqld: known:
Nov 18 06:36:12 prunes mysqld: 9ad0acfb at tcp://10.
Nov 18 06:36:12 prunes mysqld: {o=1,s=
Nov 18 06:36:12 prunes mysqld: 9ca2cdf9 at tcp://10.
Nov 18 06:36:12 prunes mysqld: {o=1,s=
Nov 18 06:36:12 prunes mysqld: a8c811b1 at
Nov 18 06:36:12 prunes mysqld: {o=1,s=
Nov 18 06:36:12 prunes mysqld: }2015-11-18 06:36:12 75623 [ERROR] WSREP: failed to open gcomm backend connection: 131: a8c811b1 last prims not consistent (FATAL)
Nov 18 06:36:12 prunes mysqld: #011 at gcomm/src/
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs connect failed: State not recoverable
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: wsrep::
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] Aborting
Nov 18 06:36:12 prunes mysqld:
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: Service disconnected.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] WSREP: Some threads may fail to exit.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] Binlog end
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] /usr/sbin/mysqld: Shutdown complete
Nov 18 06:36:13 prunes mysqld:
Nov 18 06:36:13 prunes mysqld_safe: mysqld from pid file /var/lib/
One thing to note is that there are no 'failure to start' for the node with the failure.