Cluster failed because of "Writeset deserialization failed: Writeset checksum failed: 22 (Invalid argument)" error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Expired | Undecided | Unassigned |
Bug Description
A few days ago the entire cluster, three nodes on the same LAN, crashed, with all nodes going down at the same time. The error on node db1 was:
2015-09-16 14:08:25 11591 [ERROR] WSREP: RecordSet checksum does not match:
computed: 34072646 a1931ae0 ccb8a464 934ec6ae
found: f9f0f9c1 998b49f6 88a0fbf2 07691543: 22 (Invalid argument)
at galerautils/
2015-09-16 14:08:25 11591 [ERROR] WSREP: Writeset deserialization failed: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
WS flags: 1
Trx proto: 3
Trx source: 2bd1b8cd-
Trx conn_id: 2898024
Trx trx_id: 957996414
Trx last_seen: 460969023
2015-09-16 14:08:25 11591 [ERROR] WSREP: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
2015-09-16 14:08:25 11591 [ERROR] WSREP: unknown connection failure
2015-09-16 14:08:25 11591 [ERROR] WSREP: FSM: no such a transition REPLICATING -> ROLLED_BACK
14:08:25 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https:/
key_buffer_
read_buffer_
max_used_
max_threads=153
thread_count=12
connection_count=10
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x73df000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f8439cece50 thread_stack 0x30000
/usr/sbin/
/usr/sbin/
/lib/x86_
/lib/x86_
/lib/x86_
/usr/lib/
/usr/lib/
/usr/lib/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/usr/sbin/
/lib/x86_
/lib/x86_
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f84fd94c920): is an invalid pointer
Connection ID (thread ID): 2898024
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://
in the manual which will help you identify the cause of the crash.
150916 14:08:27 mysqld_safe Number of processes running now: 0
150916 14:08:27 mysqld_safe WSREP: not restarting wsrep node automatically
150916 14:08:27 mysqld_safe mysqld from pid file /var/run/
These are the logs from the db2 server:
2015-09-16 14:08:25 22475 [ERROR] WSREP: RecordSet checksum does not match:
computed: 34072646 a1931ae0 ccb8a464 934ec6ae
found: f9f0f9c1 998b49f6 88a0fbf2 07691543: 22 (Invalid argument)
at galerautils/
2015-09-16 14:08:25 22475 [ERROR] WSREP: Writeset deserialization failed: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
WS flags: 0
Trx proto: 3
Trx source: 00000000-
Trx conn_id: 184467440737095
Trx trx_id: 184467440737095
Trx last_seen: -1
2015-09-16 14:08:25 22475 [ERROR] WSREP: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
2015-09-16 14:08:25 22475 [Note] WSREP: applier thread exiting (code:7)
2015-09-16 14:08:25 22475 [ERROR] WSREP: node consistency compromised, aborting
2015-09-16 14:08:25 22475 [Note] WSREP: starting shutdown
2015-09-16 14:08:25 22475 [Note] /usr/sbin/mysqld: Normal shutdown
2015-09-16 14:08:25 22475 [Note] WSREP: Stop replication
2015-09-16 14:08:25 22475 [Note] WSREP: Closing send monitor...
2015-09-16 14:08:25 22475 [Note] WSREP: Closed send monitor.
2015-09-16 14:08:25 22475 [Note] WSREP: gcomm: terminating thread
2015-09-16 14:08:25 22475 [Note] WSREP: gcomm: joining thread
2015-09-16 14:08:25 22475 [Note] WSREP: gcomm: closing backend
2015-09-16 14:08:25 22475 [Note] WSREP: declaring 2bd1b8cd at tcp://10.
2015-09-16 14:08:25 22475 [Note] WSREP: forgetting 889a5742 (tcp://
And these are the db3 logs:
2015-09-16 14:08:25 18460 [ERROR] WSREP: RecordSet checksum does not match:
computed: 34072646 a1931ae0 ccb8a464 934ec6ae
found: f9f0f9c1 998b49f6 88a0fbf2 07691543: 22 (Invalid argument)
at galerautils/
2015-09-16 14:08:25 18460 [ERROR] WSREP: Writeset deserialization failed: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
WS flags: 0
Trx proto: 3
Trx source: 00000000-
Trx conn_id: 184467440737095
Trx trx_id: 184467440737095
Trx last_seen: -1
2015-09-16 14:08:25 18460 [ERROR] WSREP: Writeset checksum failed: 22 (Invalid argument)
at galera/
at galera/
2015-09-16 14:08:25 18460 [Note] WSREP: applier thread exiting (code:7)
2015-09-16 14:08:25 18460 [ERROR] WSREP: node consistency compromised, aborting
2015-09-16 14:08:25 18460 [Note] WSREP: starting shutdown
2015-09-16 14:08:25 18460 [Note] /usr/sbin/mysqld: Normal shutdown
2015-09-16 14:08:25 18460 [Note] WSREP: Stop replication
2015-09-16 14:08:25 18460 [Note] WSREP: Closing send monitor...
2015-09-16 14:08:25 18460 [Note] WSREP: Closed send monitor.
2015-09-16 14:08:25 18460 [Note] WSREP: gcomm: terminating thread
2015-09-16 14:08:25 18460 [Note] WSREP: gcomm: joining thread
2015-09-16 14:08:25 18460 [Note] WSREP: gcomm: closing backend
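All three excerpts above report the same "computed" and "found" checksum values. For longer logs, that comparison can be automated. The following is a minimal Python sketch, not part of any Percona or Galera tooling: the function names are made up for illustration, and the sample text is copied from the three "RecordSet checksum does not match" blocks above. Identical pairs on every node would suggest that all nodes saw the same writeset bytes, but this alone does not identify the cause.

```python
import re
from collections import defaultdict

# Matches the "computed:" and "found:" lines that follow the
# "RecordSet checksum does not match:" error in the mysqld error log.
CHECKSUM_RE = re.compile(
    r"RecordSet checksum does not match:\s*\n"
    r"\s*computed:\s*(?P<computed>[0-9a-f ]+?)\s*\n"
    r"\s*found:\s*(?P<found>[0-9a-f ]+?):"
)


def checksum_mismatches(log_text):
    """Return (computed, found) pairs for every mismatch in one node's log."""
    return [
        (m.group("computed"), m.group("found"))
        for m in CHECKSUM_RE.finditer(log_text)
    ]


def group_by_mismatch(logs_by_node):
    """Map each (computed, found) pair to the sorted nodes that reported it."""
    seen = defaultdict(set)
    for node, text in logs_by_node.items():
        for pair in checksum_mismatches(text):
            seen[pair].add(node)
    return {pair: sorted(nodes) for pair, nodes in seen.items()}


# Sample text copied from the logs above; only the process id differs per node.
SAMPLE = """\
2015-09-16 14:08:25 {pid} [ERROR] WSREP: RecordSet checksum does not match:
computed: 34072646 a1931ae0 ccb8a464 934ec6ae
found: f9f0f9c1 998b49f6 88a0fbf2 07691543: 22 (Invalid argument)
"""
logs = {
    "db1": SAMPLE.format(pid=11591),
    "db2": SAMPLE.format(pid=22475),
    "db3": SAMPLE.format(pid=18460),
}

for (computed, found), nodes in group_by_mismatch(logs).items():
    print(nodes, "computed:", computed, "found:", found)
```

In practice the `logs` dictionary would be filled by reading each node's error log file instead of the inline sample.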
This is the version on all three nodes:
# /usr/sbin/mysqld --version
/usr/sbin/mysqld Ver 5.6.22-72.0-56 for debian-linux-gnu on x86_64 (Percona XtraDB Cluster (GPL), Release rel72.0, Revision 978, WSREP version 25.8, wsrep_25.8.r4150)
What could cause this problem?
This is also seen on PXC 5.6.25:
2015-10-02 20:21:40 19051 [Note] WSREP: Receiving IST: 348517 writesets, seqnos 5313805227-5314153744
2015-10-02 20:21:40 19051 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.25-73.1-56' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Percona XtraDB Cluster (GPL), Release rel73.1, Revision 011f1e6, WSREP version 25.12, wsrep_25.12
2015-10-02 20:21:40 19051 [ERROR] WSREP: RecordSet checksum does not match:
computed: 9beed926 44854eec 57a6b8ea b0d7b3b9
found: 2ef0c835 d1d6c397 324964df d0879aa1: 22 (Invalid argument)
at galerautils/src/gu_rset.cpp:checksum():392
2015-10-02 20:21:40 19051 [ERROR] WSREP: Writeset deserialization failed: Writeset checksum failed: 22 (Invalid argument)
at galera/src/write_set_ng.hpp:checksum_fin():813
at galera/src/trx_handle.cpp:unserialize():268
WS flags: 0
Trx proto: 3
Trx source: 00000000-0000-0000-0000-000000000000
Trx conn_id: 18446744073709551615
Trx trx_id: 18446744073709551615
Trx last_seen: -1
2015-10-02 20:21:40 19051 [ERROR] WSREP: got exception while reading ist stream: Writeset checksum failed: 22 (Invalid argument)
at galera/src/write_set_ng.hpp:checksum_fin():813
at galera/src/trx_handle.cpp:unserialize():268
2015-10-02 20:21:40 19051 [ERROR] WSREP: IST didn't contain all write sets, expected last: 5314153744 last received: 5313805242
2015-10-02 20:21:40 19051 [ERROR] WSREP: receiving IST failed, node restart required: IST receiver reported error: 71 (Protocol error)
at galera/src/ist.cpp:recv():432
2015-10-02 20:21:40 19051 [Note] WSREP: Closing send monitor...
2015-10-02 20:21:40 19051 [Note] WSREP: Closed send monitor.
2015-10-02 20:21:40 19051 [Note] WSREP: gcomm: terminating thread
2015-10-02 20:21:40 19051 [Note] WSREP: gcomm: joining thread
2015-10-02 20:21:40 19051 [Note] WSREP: gcomm: closing backend
2015-10-02 20:21:40 19051 [Note] WSREP: view(view_id(NON_PRIM,b5e85121,34) memb {
d0f8b70f,0
} joined {
} left {
} partitioned {
b5e85121,0
fba34727,0
})
2015-10-02 20:21:40 19051 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2015-10-02 20:21:40 19051 [Note] WSREP: view((empty))
2015-10-02 20:21:40 19051 [Note] WSREP: gcomm: closed
2015-10-02 20:21:40 19051 [Note] WSREP: Flow-control interval: [1800, 1800]
2015-10-02 20:21:40 19051 [Note] WSREP: Received NON-PRIMARY.
2015-10-02 20:21:40 19051 [Note] WSREP: Shifting JOINER -> OPEN (TO: 5314176073)
2015-10-02 20:21:40 19051 [Note] WSREP: Received self-leave message.
2015-10-02 20:21:40 19051 [Note] WSREP: Flow-control interval: [1800, 1800]
2015-10-02 20:21:40 19051 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2015-10-02 20:21:40 19051 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 5314176073)
2015-10-02 20:21:40 19051 [Note] WSREP: RECV thread exiting 0: Success
2015-10-02 20:21:40 19051 [Note] WSREP: recv_thread() joined.
2015-10-02 20:21:40 19051 [Note] WSREP: Closing replication queue.
2015-10-02 20:21:40 19051 [Note] WSREP: Closing slave action queue.
2015-10-02 20:21:40 19051 [Note] WSREP: /usr/sbin/mysqld: Terminated.
Aborted (core dumped)
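To see how far the IST got before the checksum failure, the seqnos in the "Receiving IST" and "IST didn't contain all write sets" lines above can be compared. This is a rough Python sketch under stated assumptions: `ist_progress` and its result keys are names invented here, the two sample lines are copied from the log above, and whether the first seqno in the announced range is inclusive is not assumed, so only plain differences are reported.

```python
import re

# Sample lines copied from the 2015-10-02 log above.
LOG = """\
2015-10-02 20:21:40 19051 [Note] WSREP: Receiving IST: 348517 writesets, seqnos 5313805227-5314153744
2015-10-02 20:21:40 19051 [ERROR] WSREP: IST didn't contain all write sets, expected last: 5314153744 last received: 5313805242
"""

# \s* after the dash tolerates a stray space inside the seqno range.
RANGE_RE = re.compile(r"Receiving IST: (\d+) writesets, seqnos (\d+)-\s*(\d+)")
FAIL_RE = re.compile(r"expected last: (\d+) last received: (\d+)")


def ist_progress(log_text):
    """Summarise how far an IST got, or return None if either line is absent."""
    rng = RANGE_RE.search(log_text)
    fail = FAIL_RE.search(log_text)
    if not (rng and fail):
        return None
    announced, first, _last = (int(g) for g in rng.groups())
    expected_last, last_received = (int(g) for g in fail.groups())
    return {
        # Writeset count announced by the "Receiving IST" line.
        "announced": announced,
        # Distance of the last received seqno from the first announced seqno.
        "past_first": last_received - first,
        # Seqnos after the last received one, up to the expected last.
        "not_received": expected_last - last_received,
    }


print(ist_progress(LOG))
```

For the log above, the last received seqno is 15 past the first announced one and 348502 short of the expected last, so the transfer failed almost immediately after it started.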