percona-xtradb-cluster-server-5.5 crash

Bug #1159837 reported by Niall Hallett
This bug report is a duplicate of:  Bug #1188641: Nodes Crashed.
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

5.5.29-23.7.2-389.squeeze

14:59:42 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=4
max_threads=151
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338001 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xa246560
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = ffffffffec29437c thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x33)[0x84357c3]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x82f37c4]
[0xf774e400]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvjx+0xad)[0x81ca01d]
/usr/lib/libgalera_smm.so(+0x1b1ef7)[0xf6cb1ef7]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x260)[0xf6cbe370]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x4b)[0xf6cbf49b]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_action+0x3bf)[0xf6c8c9df]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPv+0xe0)[0xf6c8d470]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x8a)[0xf6cb3d8a]
/usr/lib/libgalera_smm.so(galera_recv+0x35)[0xf6cd4ae5]
/usr/sbin/mysqld(_Z25wsrep_replication_processP3THD+0x50)[0x81c9350]
/usr/sbin/mysqld(start_wsrep_THD+0x3c7)[0x8143d47]
/lib/i686/cmov/libpthread.so.0(+0x5955)[0xf7733955]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xf764b1de]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
130325 10:59:42 mysqld_safe Number of processes running now: 0
130325 10:59:42 mysqld_safe WSREP: not restarting wsrep node automatically
130325 10:59:42 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
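The "could use up to" figure in the trace comes from the formula the server prints. A rough sketch of that arithmetic, assuming the MySQL 5.5 default sort_buffer_size of 2 MB (the trace does not print it, so the result differs slightly from the reported 338001 K):

```shell
# Recompute mysqld's worst-case memory estimate from the crash trace.
# sort_buffer_size is NOT in the trace; 2097152 bytes (the 5.5 default)
# is assumed here, which is why this comes out a little below 338001 K.
key_buffer_size=8388608
read_buffer_size=131072
sort_buffer_size=2097152   # assumed default, not from the trace
max_threads=151
echo $(( (key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads) / 1024 ))
```

With these assumed values the estimate is 336768 K, so the server on this node was evidently running with a sort_buffer_size somewhat above the default.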

affects: percona-server → percona-xtradb-cluster
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

The crash trace looks similar to the one described here -- https://groups.google.com/forum/?fromgroups=#!topic/codership-team/dD9-D8BETTU

@Niall,

Can you upload the error log in entirety?

Revision history for this message
Niall Hallett (niall-hallett) wrote :
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Unfortunately the log tells nothing.

1) Could you think of anything unusual (queries) executed on the cluster at the moment of crash?
2) Any reason why innodb_buffer_pool_size is only 128M?
3) Could you post the output of 'SHOW GLOBAL VARIABLES\G' from the crashed node?

Revision history for this message
Niall Hallett (niall-hallett) wrote :

1) I don't know what queries were taking place at that moment. The crashing node (apollo - node 3) isn't being directly used for anything except replication. Nor is node 2. Only the original node 1 (mustang) is being actively used. It's currently processing an average of 110 queries per second.

2) There's no reason apart from being the default. It should be more than 1GB as the innodb data size is 1.1GB.

3) I did restart mysql on node 3, which duly transferred all the data again and ran for about 12 hours before crashing for the second time. I've started the node as standalone to get the global variables attachment.

Revision history for this message
Niall Hallett (niall-hallett) wrote :

I upgraded the software to 5.5.30-23.7.4-405.squeeze on 21st April, re-synced to the cluster and it's been running without incident until today:

09:28:18 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=3
max_threads=153
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 342362 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x9904418
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = ffffffffea44a37c thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x33)[0x843c643]
/usr/sbin/mysqld(handle_fatal_signal+0x4bc)[0x82fa59c]
[0xf76f2400]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvjx+0xad)[0x81cc82d]
/usr/lib/libgalera_smm.so(+0x1a7664)[0xf4ca5664]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x25d)[0xf4cadbed]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x4b)[0xf4cb177b]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_action+0x387)[0xf4c82da7]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPv+0xe0)[0xf4c83540]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x8a)[0xf4ca731a]
/usr/lib/libgalera_smm.so(galera_recv+0x35)[0xf4cc79b5]
/usr/sbin/mysqld(_Z25wsrep_replication_processP3THD+0x50)[0x81cbb60]
/usr/sbin/mysqld(start_wsrep_THD+0x3c7)[0x8146247]
/lib/i686/cmov/libpthread.so.0(+0x5955)[0xf76d7955]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xf744a1de]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
130503 05:28:18 mysqld_safe Number of processes running now: 0
130503 05:28:18 mysqld_safe WSREP: not restarting wsrep node automatically
130503 05:28:18 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Revision history for this message
Niall Hallett (niall-hallett) wrote :

Our standalone node has now crashed (5.5.30-23.7.4-405.squeeze):

10:17:40 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=35
max_threads=153
thread_count=20
connection_count=20
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 343043 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fcad93c6f80
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fce0ea18e78 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7ed245]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6ba864]
/lib/libpthread.so.0(+0xeff0)[0x7fceb0f2eff0]
/usr/sbin/mysqld(my_b_safe_tell+0x11)[0x7da711]
/usr/sbin/mysqld(_ZN9Log_event12write_headerEP11st_io_cachem+0x118)[0x763408]
/usr/sbin/mysqld(_ZN15Query_log_event5writeEP11st_io_cache+0x328)[0x7666a8]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG5writeEP9Log_event+0x5be)[0x75749e]
/usr/sbin/mysqld(_ZN3THD12binlog_queryENS_22enum_binlog_query_typeEPKcmbbbi+0xb7)[0x57bde7]
/usr/sbin/mysqld(_ZN13select_insert8send_eofEv+0x140)[0x58bd10]
/usr/sbin/mysqld(_ZN13select_create8send_eofEv+0x1f)[0x58f24f]
/usr/sbin/mysqld[0x5d0bfe]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0xc62)[0x5e5ea2]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x12c)[0x5e766c]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1cd)[0x5e812d]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4cd8)[0x5a9328]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x343)[0x5a9d33]
/usr/sbin/mysqld[0x5aadd2]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1a92)[0x5acf72]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x167)[0x5ad567]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x14f)[0x64b4cf]
/usr/sbin/mysqld(handle_one_connection+0x51)[0x64b6b1]
/lib/libpthread.so.0(+0x68ca)[0x7fceb0f268ca]
/lib/libc.so.6(clone+0x6d)[0x7fceafbcfb6d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fcb6bb017a0): is an invalid pointer
Connection ID (thread ID): 1075420
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
130528 11:17:41 mysqld_safe Number of processes running now: 0
130528 11:17:41 mysqld_safe WSREP: not restarting wsrep node automatically
130528 11:17:41 ...


Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Revision history for this message
Niall Hallett (niall-hallett) wrote :

Yep, this was the probable culprit:

CREATE TEMPORARY TABLE tempFlexiTime (employee_id int unsigned, date_from date, time_from time, time_to time) as SELECT employee_id, date_from, time_from, time_to FROM webdb2.flexitime ORDER by employee_id ASC, date_from DESC
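If that CREATE TEMPORARY TABLE ... AS SELECT is indeed the trigger, a commonly suggested way to sidestep CTAS issues in replicated setups is to split it into an explicit CREATE followed by INSERT ... SELECT. A sketch only (column types taken from the statement above; whether this avoids the crash here is an assumption, not confirmed in this thread):

```sql
-- Hypothetical workaround: avoid CREATE ... AS SELECT by splitting
-- the statement into separate DDL and DML.
CREATE TEMPORARY TABLE tempFlexiTime (
  employee_id INT UNSIGNED,
  date_from   DATE,
  time_from   TIME,
  time_to     TIME
);

INSERT INTO tempFlexiTime (employee_id, date_from, time_from, time_to)
  SELECT employee_id, date_from, time_from, time_to
  FROM webdb2.flexitime
  ORDER BY employee_id ASC, date_from DESC;
```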

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Niall, you can watch https://bugs.launchpad.net/codership-mysql/+bug/1160854 for updates on the issue you reported in #6

For the original issue, I am marking this a duplicate of lp:1188641

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

@Niall, your first variable output shows that you are using binlog_format=STATEMENT. Have you since changed to ROW format? Note that only ROW format is fully supported at the moment.
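A minimal my.cnf fragment for that change (section name standard; applying it to running sessions via SET GLOBAL, or restarting the node, is assumed):

```ini
# Galera/wsrep fully supports only row-based replication,
# so set the binlog format explicitly on every node.
[mysqld]
binlog_format = ROW
```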

Revision history for this message
Niall Hallett (niall-hallett) wrote :

I've set the binlog_format = ROW on all the servers.

I've just upgraded one of the unused nodes to 5.5.31-23.7.5-438.squeeze and it won't even start without crashing. I emptied the /var/lib/mysql directory to see if that made any difference and attached the err log.

