Comment 6 for bug 1501947

Revision history for this message
Larry Michel (lmic) wrote :

I was able to recreate with a set physical servers and here are the logs. For the mysqld logs, I found that they go to the syslog so I've extracted these.

From mysqld logs, there's this exception:

Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: caught exception in PC, state dump to stderr follows:
Nov 18 06:36:12 prunes mysqld: pc::Proto{uuid=a8c811b1,start_prim=0,npvo=0,ignore_sb=0,ignore_quorum=0,state=1,last_sent_seq=0,checksum=0,instances=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: ,state_msgs=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: ,current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),pc_view=view((empty)),mtu=2147483647}
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: {v=0,t=1,ut=255,o=4,s=0,sr=0,as=-1,f=0,src=a8c811b1,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=8,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: } 64
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: exception caused by message: {v=0,t=3,ut=255,o=1,s=0,sr=-1,as=0,f=4,src=9ca2cdf9,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=42,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: }
Nov 18 06:36:12 prunes mysqld: state after handling message: evs::proto(evs::proto(a8c811b1, OPERATIONAL, view_id(REG,9ad0acfb,4)), OPERATIONAL) {
Nov 18 06:36:12 prunes mysqld: current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),
Nov 18 06:36:12 prunes mysqld: input_map=evs::input_map: {aru_seq=0,safe_seq=0,node_index=node: {idx=0,range=[1,0],safe_seq=0} node: {idx=1,range=[1,0],safe_seq=0} node: {idx=2,range=[1,0],safe_seq=0} },
Nov 18 06:36:12 prunes mysqld: fifo_seq=10,
Nov 18 06:36:12 prunes mysqld: last_sent=0,
Nov 18 06:36:12 prunes mysqld: known:
Nov 18 06:36:12 prunes mysqld: 9ad0acfb at tcp://10.245.0.209:4567
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=90,}
Nov 18 06:36:12 prunes mysqld: 9ca2cdf9 at tcp://10.245.0.207:4567
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=42,}
Nov 18 06:36:12 prunes mysqld: a8c811b1 at
Nov 18 06:36:12 prunes mysqld: {o=1,s=0,i=1,fs=-1,}
Nov 18 06:36:12 prunes mysqld: }2015-11-18 06:36:12 75623 [ERROR] WSREP: failed to open gcomm backend connection: 131: a8c811b1 last prims not consistent (FATAL)
Nov 18 06:36:12 prunes mysqld: #011 at gcomm/src/pc_proto.cpp:is_prim():815
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -131 (State not recoverable)
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'juju_cluster' at 'gcomm://10.245.0.208,10.245.0.207,10.245.0.209': -131 (State not recoverable)
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: gcs connect failed: State not recoverable
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: wsrep::connect(gcomm://10.245.0.208,10.245.0.207,10.245.0.209) failed: 7
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] Aborting
Nov 18 06:36:12 prunes mysqld:
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: Service disconnected.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] WSREP: Some threads may fail to exit.
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] Binlog end
Nov 18 06:36:13 prunes mysqld: 2015-11-18 06:36:13 75623 [Note] /usr/sbin/mysqld: Shutdown complete
Nov 18 06:36:13 prunes mysqld:
Nov 18 06:36:13 prunes mysqld_safe: mysqld from pid file /var/lib/mysql/prunes.pid ended

One thing to note is that there are no 'failure to start' for the node with the failure.