Silent abort (crash) at gcs/src/gcs_core.cpp:1152
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned |
Bug Description
Description:
After a period of stable operation, Percona XtraDB Cluster nodes start crashing and cannot rejoin the cluster (regardless of whether IST or SST is used).
The only way to restore the cluster is to stop all nodes and re-bootstrap.
Steps to reproduce (reproducibility - 100%):
1. Bootstrap cluster
2. Wait 15-20 days
3. One of the XtraDB nodes crashes
4. Try to rejoin the cluster (i.e. systemctl start mysql).
Actual results:
Crash and inability to join cluster.
2016-02-05 10:42:27 6383 [Warning] WSREP: 1.0 (mysql-rw0): State transfer to 0.0 (mysql-rw1) failed: -12 (Cannot allocate memory)
2016-02-05 10:42:27 6383 [ERROR] WSREP: gcs/src/
Expected results:
JOINER->SYNCED
Additional info:
I tried to enable core dumps, but it seems that Galera disables them in its sources. So I ran mysqld manually (i.e. without the mysqld_safe wrapper) with gdb attached, to catch the fault and get a backtrace.
# rpm -qa | grep Percona
Percona-
Percona-
Percona-
Percona-
Percona-
Percona-
Percona-
Percona-
Percona-
Percona-
# cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)
# hostnamectl
Static hostname: mysql-rw2
Chassis: container
Virtualization: lxc-libvirt
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:
Kernel: Linux 3.10.0-
Architecture: x86-64
Changed in percona-xtradb-cluster: | |
status: | New → Fix Committed |
Changed in percona-xtradb-cluster: | |
milestone: | none → 5.6.29-25.15 |
Changed in percona-xtradb-cluster: | |
status: | Fix Committed → Fix Released |
I see that the error occurs because memory allocation fails during SST, but at first glance I am not sure why.
While we look into this, please check whether you can find out the cause of the memory allocation failure.
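As a starting point for that check, the -12 in the "State transfer ... failed" log line looks like a negated errno value, and the text in parentheses matches the standard message for ENOMEM. The following is a minimal sketch, assuming a Linux host with Python available; the helper name describe_errno is hypothetical and not part of Galera or Percona tooling. It only decodes the number and does not identify which allocation failed.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value.

    Assumption: the value in the log is a negated errno,
    so the sign is dropped before the lookup.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    # -12 is the value reported in the WSREP warning.
    name, message = describe_errno(-12)
    print(name, "-", message)
```

If the name comes back as ENOMEM, the memory limits applied to the container (the node runs under lxc-libvirt) would be a reasonable next thing to inspect.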