Node crash in a two-node cluster + garbd

Bug #1206939 reported by Gabriel Féron
This bug affects 1 person
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

One of the nodes of my cluster crashed three times in 24 hours. I previously encountered https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1134892, which is now fixed, but I'm now facing this new issue. mysqld still crashes "frequently" on both nodes (at least one node once a week). If the following stacktrace is not helpful, how can I get more information to help identify the source of the problem?

Here is the stacktrace from the latest crash:

130731 12:50:09 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000000 of size 134217728 bytes
130731 12:50:23 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000000
130731 12:50:46 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000001 of size 134217728 bytes
130731 12:51:03 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000001
130731 15:37:20 [Warning] WSREP: could not find key '1 6c65706170655f636f6d5f70726f64 637573746f6d65725f616464726573735f656e74697479 0074d60100 ' from cert index
130731 15:37:20 [Warning] WSREP: could not find key '1 6c65706170655f636f6d5f70726f64 637573746f6d65725f616464726573735f656e74697479 ' from cert index
130731 15:37:20 [ERROR] WSREP: FSM: no such a transition EXECUTING -> COMMITTED
13:37:20 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=536870912
read_buffer_size=8388608
max_used_connections=40
max_threads=502
thread_count=30
connection_count=30
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 12867784 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x14c65710
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fc1281cae70 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7dbebe]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x6b1fc4]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fc15ac57cb0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fc15a286425]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7fc15a289b8b]
/usr/lib/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x18d)[0x7fc158b7fd0d]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11post_commitEPNS_9TrxHandleE+0xd3)[0x7fc158b75763]
/usr/lib/libgalera_smm.so(galera_post_commit+0x63)[0x7fc158b8e2f3]
/usr/sbin/mysqld(_Z25wsrep_cleanup_transactionP3THD+0xb7)[0x6673e7]
/usr/sbin/mysqld[0x6b39b6]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x4ee)[0x6b505e]
/usr/sbin/mysqld(_Z12trans_commitP3THD+0x45)[0x65d5f5]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1efa)[0x5ab3ba]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x314)[0x5b1144]
/usr/sbin/mysqld[0x5b1a68]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1af0)[0x5b3b70]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x167)[0x5b3fc7]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x14f)[0x64fdff]
/usr/sbin/mysqld(handle_one_connection+0x51)[0x64ffd1]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fc15ac4fe9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc15a343ccd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fbd9c012190): is an invalid pointer
Connection ID (thread ID): 71492
Status: NOT_KILLED
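
One way to get a little more out of the log above: the key fields in the two "could not find key" warnings look like hex-encoded bytes, and the first two decode to readable ASCII. The sketch below is only an illustration, not a Galera tool. The function name `decode_cert_key` is made up for this example, and treating the decoded fields as a schema name and a table name is an assumption based on how they read, not something the log states.

```python
import re

# Warning line copied from the log above.
LOG_LINE = (
    "130731 15:37:20 [Warning] WSREP: could not find key "
    "'1 6c65706170655f636f6d5f70726f64 "
    "637573746f6d65725f616464726573735f656e74697479 0074d60100 ' "
    "from cert index"
)


def decode_cert_key(line):
    """Decode the hex fields of a 'could not find key' warning.

    Hypothetical helper: fields that decode to printable ASCII are
    returned as text, anything else is returned unchanged as hex.
    """
    match = re.search(r"could not find key '([^']*)'", line)
    if match is None:
        return []
    fields = match.group(1).split()
    decoded = []
    # Assumption: the leading "1" is a prefix, not part of the key data.
    for field in fields[1:]:
        try:
            raw = bytes.fromhex(field)
        except ValueError:
            decoded.append(field)  # not valid hex, keep as-is
            continue
        if raw and all(32 <= byte < 127 for byte in raw):
            decoded.append(raw.decode("ascii"))
        else:
            decoded.append(field)  # binary data, keep the hex form
    return decoded


print(decode_cert_key(LOG_LINE))
```

For the first warning this prints `['lepape_com_prod', 'customer_address_entity', '0074d60100']`, which suggests the failing transaction touched a row in `customer_address_entity`. The last field is not printable ASCII, so it is left as hex.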

stefuNz (stefan-neuhaus-deactivatedaccount) wrote :

This bug affects me too; see this bug report: https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1217418

Raghavendra D Prabhu (raghavendra-prabhu) wrote :

This looks similar to the crash reported and discussed here: https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1123233/comments/18

If so, it has been fixed there.

Changed in percona-xtradb-cluster:
milestone: none → 5.5.33-23.7.6
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Stefan,

A few cases of this crash have been fixed in 5.5.33, which is already available in the experimental repo. Can you test it and let us know?
