3-node Debian 5.6 cluster crashes/freezes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | |
5.5 | Invalid | Undecided | Unassigned |
5.6 | Fix Released | Undecided | Unassigned |
Bug Description
Hiya,
3 nodes; new setup; latest packages
percona-toolkit 2.2.7
percona-xtrabackup 2.1.8-733-1.wheezy
percona-
percona-
percona-
percona-
percona-
percona-
node1 - started with /etc/init.d/mysql bootstrap-pxc
node2 - mysqld totally crashed and couldn't be restarted
node3 - mysqld totally crashed and couldn't be restarted
I've included the log from node3.
I read 1 million rows of data into each node (1..3) concurrently without a problem, so the cluster appeared to be working. We also restored an 80G mysql dump, which appeared to work, but now I wonder...
Once I noticed it hung (I'm testing it out), I actually resorted to kill -9 on the mysql process on node1 (started with bootstrap-pxc), as both node2/3 were down and couldn't be restarted, and I couldn't connect to node1 with the mysql cli. Once I restarted node1, then node2/3, everything seemed ok, but...
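After a restart like this, one way to check whether the nodes have really rejoined is to look at a few wsrep status variables on each node. This is a minimal sketch, not something taken from this setup: it assumes the mysql client can connect with default credentials, and the helper name `wsrep_summary` is hypothetical.

```shell
# Hypothetical helper: reads tab-separated "name<TAB>value" rows on stdin
# (the format produced by the mysql client in batch mode) and prints only
# the wsrep variables that describe cluster membership and node state.
wsrep_summary() {
    awk -F'\t' '$1 ~ /^wsrep_(cluster_size|cluster_status|local_state_comment|ready)$/ { print $1 "=" $2 }'
}

# Usage on each node (assumes the mysql client can log in without extra options):
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | wsrep_summary
```

On a healthy 3-node cluster I would expect every node to report a cluster size of 3, a Primary cluster status and a Synced local state.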
I hope you can fix this because it is a bad crash and doesn't instill confidence. Let me know if I can give you anything else to provide forensics.
Below are the cnf files from the bootstrap node (node1) and node3. Strangely, no log was emitted on the bootstrap node1 or node2, but node3 shows a bad exception being thrown (apparently).
-Chris
>>>> Node1.cnf <<<<<
[mysqld]
datadir=
user=mysql
# Path to Galera library
wsrep_provider=
# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_
#wsrep_
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_
# Node #1 address
wsrep_node_
# SST method
wsrep_sst_
# Cluster name
wsrep_cluster_
# Authentication for SST method
wsrep_sst_
>>>> Node3.cnf <<<<
[mysqld]
datadir=
user=mysql
# Path to Galera library
wsrep_provider=
# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_
# Node #3 address
wsrep_node_
# SST method
wsrep_sst_
# Cluster name
wsrep_cluster_
# Authentication for SST method
wsrep_sst_
I doubled the memory on the cluster to 2GB per node and it happened again. This time, it brought down all 3 nodes.