Random crash on the same node

Bug #1327763 reported by dannie
This bug affects 10 people
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: Fix Released
Importance: High
Assigned to: Krunal Bauskar
Milestone: 5.6.29-25.15

Bug Description

This is the third time the same node has crashed.

The first two times were consistently at 2am. I then upgraded from Ubuntu Precise to Trusty to see if that would change anything, and now I'm running a compiled version of XtraDB Cluster 5.6.17-65.0 on both servers (the third is a garbd instance).

Memory does not seem to be an issue (see http://puu.sh/9kaaQ/c0061882d1.png).
Total memory is 16GB.
innodb_buffer_pool_size = 8G

2014-06-08 13:03:19 27080 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 10486174)
2014-06-08 13:03:19 27080 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 10486174
2014-06-08 13:03:19 27080 [Warning] WSREP: Failed to apply app buffer: seqno: 10486174, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-06-08 13:03:19 27080 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 10486174)
2014-06-08 13:03:19 27080 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 10486174
2014-06-08 13:03:19 27080 [Warning] WSREP: Failed to apply app buffer: seqno: 10486174, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-06-08 13:03:19 27080 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 10486174)
2014-06-08 13:03:19 27080 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 10486174
2014-06-08 13:03:19 27080 [Warning] WSREP: Failed to apply app buffer: seqno: 10486174, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-06-08 13:03:19 27080 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 10486174)
2014-06-08 13:03:19 27080 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 10486174
2014-06-08 13:03:19 27080 [Warning] WSREP: failed to replay trx: source: 8739bc85-ec9d-11e3-86e7-6b013cd9f468 version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 746343 trx_id: 4385353381 seqnos (l: 2978809, g: 10486174, s: 10486166,
2014-06-08 13:03:19 27080 [Warning] WSREP: Failed to apply trx 10486174 4 times
2014-06-08 13:03:19 27080 [ERROR] WSREP: trx_replay failed for: 6, query: void
2014-06-08 13:03:19 27080 [ERROR] Aborting

2014-06-08 13:03:21 27080 [Note] WSREP: killing local connection: 746345
2014-06-08 13:03:21 27080 [Note] WSREP: killing local connection: 746348
2014-06-08 13:03:21 27080 [Note] WSREP: killing local connection: 746349
2014-06-08 13:03:21 27080 [Note] WSREP: Closing send monitor...
2014-06-08 13:03:21 27080 [Note] WSREP: Closed send monitor.
2014-06-08 13:03:21 27080 [Note] WSREP: gcomm: terminating thread
2014-06-08 13:03:21 27080 [Note] WSREP: gcomm: joining thread
2014-06-08 13:03:21 27080 [Note] WSREP: gcomm: closing backend
2014-06-08 13:03:22 27080 [Note] WSREP: view(view_id(NON_PRIM,04544223-ed52-11e3-b428-eebb72a5cca2,50) memb {
        8739bc85-ec9d-11e3-86e7-6b013cd9f468,0
} joined {
} left {
} partitioned {
        04544223-ed52-11e3-b428-eebb72a5cca2,0
        09f6ccfa-e9c8-11e3-a608-6eeac628a08a,0
})
2014-06-08 13:03:22 27080 [Note] WSREP: view((empty))
2014-06-08 13:03:22 27080 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2014-06-08 13:03:22 27080 [Note] WSREP: gcomm: closed
2014-06-08 13:03:22 27080 [Note] WSREP: Flow-control interval: [16, 16]
2014-06-08 13:03:22 27080 [Note] WSREP: Received NON-PRIMARY.
2014-06-08 13:03:22 27080 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 10486223)
2014-06-08 13:03:22 27080 [Note] WSREP: Received self-leave message.
2014-06-08 13:03:22 27080 [Note] WSREP: Flow-control interval: [0, 0]
2014-06-08 13:03:22 27080 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2014-06-08 13:03:22 27080 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 10486223)
2014-06-08 13:03:22 27080 [Note] WSREP: RECV thread exiting 0: Success
2014-06-08 13:03:22 27080 [Note] WSREP: recv_thread() joined.
2014-06-08 13:03:22 27080 [Note] WSREP: Closing replication queue.
2014-06-08 13:03:22 27080 [Note] WSREP: Closing slave action queue.
2014-06-08 13:03:22 27080 [Note] WSREP: Service disconnected.
2014-06-08 13:03:22 27080 [Note] WSREP: rollbacker thread exiting
2014-06-08 13:03:23 27080 [Note] WSREP: Some threads may fail to exit.
2014-06-08 13:03:23 27080 [Note] Binlog end
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'partition'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CHANGED_PAGES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_METRICS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMPMEM'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_CMP'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_LOCKS'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'INNODB_TRX'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'XTRADB_RSEG'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'XTRADB_INTERNAL_HASH_TABLES'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'XTRADB_READ_VIEW'
2014-06-08 13:03:23 27080 [Note] Shutting down plugin 'InnoDB'
2014-06-08 13:03:23 27080 [Note] InnoDB: FTS optimize thread exiting.
2014-06-08 13:03:23 27080 [Note] InnoDB: Starting shutdown...
11:03:23 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=12582912
read_buffer_size=131072
max_used_connections=68
max_threads=502
thread_count=12
connection_count=12
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 212606 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x10a04b10
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb778271e20 thread_stack 0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x2c)[0x93b21c]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x35f)[0x6967bf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fb9e97a9340]
/usr/local/mysql/bin/mysqld[0x6b716b]
/usr/local/mysql/bin/mysqld[0x6b8049]
/usr/local/mysql/bin/mysqld[0x6b831e]
/usr/local/mysql/bin/mysqld[0x6a9f59]
/usr/local/mysql/bin/mysqld[0x6a9474]
/usr/local/mysql/bin/mysqld(_Z16acl_authenticateP3THDj+0x1c3)[0x6bd773]
/usr/local/mysql/bin/mysqld[0x6eead1]
/usr/local/mysql/bin/mysqld(_Z16login_connectionP3THD+0x3d)[0x6f008d]
/usr/local/mysql/bin/mysqld(_Z22thd_prepare_connectionP3THD+0x24)[0x6f0864]
/usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x137)[0x6f0b97]
/usr/local/mysql/bin/mysqld(handle_one_connection+0x39)[0x6f0de9]
/usr/local/mysql/bin/mysqld(pfs_spawn_thread+0x140)[0xb9abd0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fb9e97a1182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb9e8cae30d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 746356
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
140608 13:03:23 mysqld_safe Number of processes running now: 0
140608 13:03:23 mysqld_safe WSREP: not restarting wsrep node automatically
140608 13:03:23 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Tags: i66365
dannie (daniz) wrote:

Just to add: when starting the crashed node, it did an SST.

140608 22:57:11 mysqld_safe WSREP: Recovered position fc34347d-e515-11e3-ab6a-5f5ace43d8d1:10486173
2014-06-08 22:57:11 0 [Note] WSREP: wsrep_start_position var submitted: 'fc34347d-e515-11e3-ab6a-5f5ace43d8d1:10486173'
2014-06-08 22:57:11 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2014-06-08 22:57:11 7727 [Warning] Using pre 5.5 semantics to load error messages from /usr/local/mysql/share/english/.
2014-06-08 22:57:11 7727 [Warning] If this is not intended, refer to the documentation for valid usage of --lc-messages-dir and --language parameters.
2014-06-08 22:57:11 7727 [Note] WSREP: Read nil XID from storage engines, skipping position init
2014-06-08 22:57:11 7727 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
2014-06-08 22:57:11 7727 [Note] WSREP: wsrep_load(): Galera 3.5(r178) by Codership Oy <email address hidden> loaded successfully.
2014-06-08 22:57:11 7727 [Note] WSREP: CRC-32C: using hardware acceleration.
2014-06-08 22:57:11 7727 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2014-06-08 22:57:11 7727 [Note] WSREP: Passing config to GCS: base_host = 10.41.172.66; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT
2014-06-08 22:57:11 7727 [Note] WSREP: Service thread queue flushed.

dannie (daniz) wrote:

Same thing happened today:

2014-06-09 10:03:41 7727 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 11354082)
2014-06-09 10:03:41 7727 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 11354082
2014-06-09 10:03:41 7727 [Warning] WSREP: Failed to apply app buffer: seqno: 11354082, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-06-09 10:03:41 7727 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 11354082)
2014-06-09 10:03:41 7727 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 11354082
2014-06-09 10:03:41 7727 [Warning] WSREP: Failed to apply app buffer: seqno: 11354082, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-06-09 10:03:41 7727 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 11354082)
2014-06-09 10:03:41 7727 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 11354082
2014-06-09 10:03:41 7727 [Warning] WSREP: Failed to apply app buffer: seqno: 11354082, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-06-09 10:03:41 7727 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 11354082)
2014-06-09 10:03:41 7727 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 11354082
2014-06-09 10:03:41 7727 [Warning] WSREP: failed to replay trx: source: 762da368-ef4f-11e3-935b-f3bb661a8671 version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 118117 trx_id: 4430490127 seqnos (l: 302245, g: 11354082, s: 11354076, d: 11354081, ts: 346572660163671)
2014-06-09 10:03:41 7727 [Warning] WSREP: Failed to apply trx 11354082 4 times
2014-06-09 10:03:41 7727 [ERROR] WSREP: trx_replay failed for: 6, query: void
2014-06-09 10:03:41 7727 [ERROR] Aborting

2014-06-09 10:03:43 7727 [Note] WSREP: killing local connection: 118122
2014-06-09 10:03:43 7727 [Note] WSREP: killing local connection: 118120
2014-06-09 10:03:43 7727 [Note] WSREP: Closing send monitor...
2014-06-09 10:03:43 7727 [Note] WSREP: Closed send monitor.
2014-06-09 10:03:43 7727 [Note] WSREP: gcomm: terminating thread
2014-06-09 10:03:43 7727 [Note] WSREP: gcomm: joining thread
2014-06-09 10:03:43 7727 [Note] WSREP: gcomm: closing backend
2014-06-09 10:03:44 7727 [Note] WSREP: view(view_id(NON_PRIM,04544223-ed52-11e3-b428-eebb72a5cca2,52) memb {
        762da368-ef4f-11e3-935b-f3bb661a8671,0
} joined {
} left {
} partitioned {
        04544223-ed52-11e3-b428-eebb72a5cca2,0
        09f6ccfa-e9c8-11e3-a608-6eeac628a08a,0
})
2014-06-09 10:03:44 7727 [Note] WSREP: view((empty))
2014-06-09 10:03:44 7727 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2014-06-09 10:03:44 7727 [Note] WSREP: gcomm: closed
2014-06-09 10:03:44 7727 [Note] WSREP: Flow-control interval: [16, 16]
2014-06-09 10:03:44 7727 [Note] WSREP: Received NON-PRIMARY.
2014-06-09 10:03:44 7727 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 11354114)
2014-06-09 10:03:44 7727 [...


Raghavendra D Prabhu (raghavendra-prabhu) wrote:

@dannie,

Can you resolve the stack trace with debuginfo installed? It would be good to
get a bt, bt full, and thread apply all bt full if you can obtain a core dump.
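
For reference, a minimal sketch of pulling those backtraces from a core dump with gdb; the binary and core paths here are illustrative:

    # assumes the matching debuginfo is installed and a core file was written
    gdb /usr/local/mysql/bin/mysqld /tmp/core.27080
    (gdb) bt
    (gdb) bt full
    (gdb) thread apply all bt full

    # or non-interactively, capturing everything to a file:
    gdb --batch -ex 'bt' -ex 'bt full' -ex 'thread apply all bt full' \
        /usr/local/mysql/bin/mysqld /tmp/core.27080 > gdb.txt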

dannie (daniz) wrote:

Hi,

I can try, if you give me some more details on how I would go about doing this.

Also, I want to mention that it has not happened in the last couple of days, for some reason. Could be because I killed percona-agent, but it's very hard to say. If you let me know how I can add more debug info, I'll set it up to try to capture it the next time it happens.

dannie (daniz) wrote:

Same again:

Thread pointer: 0xf8ba300
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f681f7e5e20 thread_stack 0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x2c)[0x93b21c]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x35f)[0x6967bf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f6a93340340]
/usr/local/mysql/bin/mysqld[0x6b716b]
/usr/local/mysql/bin/mysqld[0x6b8049]
/usr/local/mysql/bin/mysqld[0x6b831e]
/usr/local/mysql/bin/mysqld[0x6a9f59]
/usr/local/mysql/bin/mysqld[0x6a9474]
/usr/local/mysql/bin/mysqld(_Z16acl_authenticateP3THDj+0x1c3)[0x6bd773]
/usr/local/mysql/bin/mysqld[0x6eead1]
/usr/local/mysql/bin/mysqld(_Z16login_connectionP3THD+0x3d)[0x6f008d]
/usr/local/mysql/bin/mysqld(_Z22thd_prepare_connectionP3THD+0x24)[0x6f0864]
/usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x137)[0x6f0b97]
/usr/local/mysql/bin/mysqld(handle_one_connection+0x39)[0x6f0de9]
/usr/local/mysql/bin/mysqld(pfs_spawn_thread+0x140)[0xb9abd0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f6a93338182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f6a9284530d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 1760219
Status: NOT_KILLED

More info on the debuginfo would be great.

Raghavendra D Prabhu (raghavendra-prabhu) wrote:

@dannie,

Do you have an empty users table? Please check https://bugs.launchpad.net/percona-server/+bug/1259833 (last few comments). I will merge this bug into that one if so.
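
(A quick hypothetical check, not from the thread; an intact table should report a non-zero row count:)

    mysql -e "SELECT COUNT(*) FROM mysql.user;"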

dannie (daniz) wrote:

Hi,

Sadly, no:

ls -lh mysql/user*
-rw-rw---- 1 mysql mysql 11K Jun 19 17:33 mysql/user.frm
-rw-rw---- 1 mysql mysql 1.2K Jun 19 17:33 mysql/user.MYD
-rw-rw---- 1 mysql mysql 2.0K Jun 19 17:33 mysql/user.MYI

dannie (daniz) wrote:

Should I start mysql in gdb, or compile it with some debug mode first?

Bert Hutzler (bhutzler) wrote:

We have similar issues, except that it was different nodes that crashed randomly.
We updated just this morning to Percona-XtraDB-Cluster-full-56.x86_64 1:5.6.21-25.8.938.el6, hoping to get rid of these crashes, but about 2 hours after the update it happened again.
I attached the relevant log lines.
I'd happily provide you with more information as you may request.
This is a 3-node cluster on CentOS 6.5.

Raghavendra D Prabhu (raghavendra-prabhu) wrote:

@Bert, @dannie,

Can you enable core dumps for mysqld and provide a backtrace from that? You can use https://gist.github.com/ronin13/8d403c2f88826fcc7aba as the gdb script for it.

Also, the general log (decoded binlog if you have it), your my.cnf, and any steps to reproduce this would be great.

Also, do you run any prepared statements? http://bugs.mysql.com/bug.php?id=42041 has some details and workarounds for 1615 (ER_NEED_REPREPARE), such as increasing table_definition_cache.
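
For illustration, that workaround amounts to something like the following; the value 2048 is an arbitrary example, not a recommendation from this thread:

    # at runtime:
    mysql -e "SET GLOBAL table_definition_cache = 2048;"

    # or persistently, in my.cnf under [mysqld]:
    #   table_definition_cache = 2048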

Raghavendra D Prabhu (raghavendra-prabhu) wrote:

You can use http://www.percona.com/blog/2011/08/26/getting-mysql-core-file-on-linux/ to get a core dump.
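
In outline, that approach looks roughly like this (a sketch only; paths and values are illustrative, see the post for details):

    # my.cnf: add "core-file" under [mysqld] and
    # "core-file-size = unlimited" under [mysqld_safe], then:
    echo 1 > /proc/sys/fs/suid_dumpable          # allow cores from setuid processes
    echo '/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern
    ulimit -c unlimited                          # in the shell that starts mysqld_safe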

Make sure to install the debuginfo package for this.

Also, provide the complete error log if possible.

Bert Hutzler (bhutzler) wrote:

Uhh, this was not quite trivial for me to figure out. To help followers, I recommend the following site to learn how to get a core dump on RedHat aka CentOS: http://www.unixmen.com/how-to-enable-core-dumps-in-rhel6/
That worked for me.
I applied this on a test machine, and after also figuring out how to use that gdb script, I got the attachment.

Before sending you the whole thing from a crashing production machine, I'd like to ask if this is what you expect, or if I have to use another mysqld binary because of the messages about missing symbols. Meanwhile, here is only the gdb.txt.

The debuginfo package for XtraDB Cluster is installed. Do I have to do something to activate it?

Bert Hutzler (bhutzler) wrote:

It's mysterious. Now we run a 5-node cluster. The first one crashed 7 times yesterday, while all the others have run approximately 4 days without interruption...
I'd like to provide you guys with more debug info, so I'm waiting for feedback on my last post...

dannie (daniz) wrote:

I'm back with some new info. After the last update, to 5.6.22-72.0-56, every node crashes within 3-4 hours, at random times. After some help from the Percona crew to get the debug info, this is what I got; hope it helps.

Let me know if there is anything else I can post.

Bert Hutzler (bhutzler) wrote: Re: [Bug 1327763] Re: Random crash on the same node

Thank you for your answer, and sorry for being so late :( but we're very occupied... maybe you're also familiar with this situation: more work, fewer workers...

Well, I think we could already produce the right output, and being in contact with Percona, they provided a bugfix, which seems to work. All in all, I think it's a combination of several factors; we could reduce the crashes to a minimum of less than once a week on one of the nodes.

What we did is:
1. put the database on SSDs
2. use a dedicated switch for synchronization of the nodes
3. do writes mainly on only one node

I will put this info on Launchpad too, so maybe some other people can benefit from it...

Davestj (davestj) wrote:

I am also experiencing this issue; it would be interesting to find the cause, or to have this bug accepted and worked on.

Cluster details

Percona XtraDB (Galera) cluster, 3 nodes, 30GB RAM each.

Our rpm list
<email address hidden>:/rootdba -> rpm -qa | grep Percona
Percona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64
Percona-XtraDB-Cluster-client-56-5.6.24-25.11.1.el6.x86_64
Percona-XtraDB-Cluster-galera-3-3.11-1.rhel6.x86_64
Percona-XtraDB-Cluster-server-56-5.6.24-25.11.1.el6.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.24-25.11.1.el6.x86_64

OS Version: CentOS release 6.6 (Final)

The crash appears to be random but consistent, and needs a manual start to recover.
So far the whole cluster has not crashed entirely; the other 2 nodes have continued and allow rejoin after the crashed node is manually started. Sometimes the first start attempt fails and a 2nd "service mysql start" command is needed, which then detects the stale sst file and re-initializes things.

MySQL error log snapshot
----------------
2015-06-18 18:26:13 18275 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 1704468)
2015-06-18 18:26:13 18275 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 1704468
2015-06-18 18:26:13 18275 [Warning] WSREP: Failed to apply app buffer: seqno: 1704468, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2015-06-18 18:26:13 18275 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 1704468)
2015-06-18 18:26:13 18275 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 1704468
2015-06-18 18:26:13 18275 [Warning] WSREP: Failed to apply app buffer: seqno: 1704468, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2015-06-18 18:26:13 18275 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 1704468)
2015-06-18 18:26:13 18275 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 1704468
2015-06-18 18:26:13 18275 [Warning] WSREP: Failed to apply app buffer: seqno: 1704468, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2015-06-18 18:26:13 18275 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 1704468)
2015-06-18 18:26:13 18275 [Warning] WSREP: RBR event 3 Update_rows apply warning: 1615, 1704468
2015-06-18 18:26:13 18275 [Warning] WSREP: failed to replay trx: source: 401397a7-15b4-11e5-8b99-87d715e86d59 version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 11318 trx_id: 16121482 seqnos (l: 18306, g: 1704468, s: 1704466, d: 1704467, ts: 3186938643214857)
2015-06-18 18:26:13 18275 [Warning] WSREP: Failed to apply trx 1704468 4 times
2015-06-18 18:26:13 18275 [ERROR] WSREP: trx_replay failed for: 6, query: void
2015-06-18 18:26:13 18275 [ERROR] Aborting

2015-06-18 18:26:15 18275 [Note] WSREP: Closing send monitor...
2015-06-18 18:26:15 18275 [Note] WSREP: Closed send monitor.
2015-06-18 18:26:15 18275 [Note] WSREP: gcomm: terminating thread
2015-06-18 18:26:15 18275 [Note] WSREP: gcomm: joining th...

dannie (daniz) wrote:

I still have an occasional random shutdown. No one has commented on my debug crash log, so I'm not sure the bug is still seen as active or important... random crashes, right, who cares :D

In all seriousness, random crashes can be hard to debug, but a comment on what we can do to help more, or an acknowledgement of my log dump, would be great.

Bert Hutzler (bhutzler) wrote:

As I already stated, we run a 5-node cluster with the databases on SSDs, a dedicated switch for sync, and writes done mainly on only one node.
Currently we're running Percona XtraDB Cluster 5.6.22-25.8.978.el6, and some nodes still crash randomly, but rather rarely (about once or twice a week).
I handled this with a script running every minute that restarts mysql if it has crashed...
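
(Roughly along these lines; a hypothetical sketch, not the actual script:)

    #!/bin/sh
    # invoked every minute from cron; restart mysql if mysqld is gone
    pidof mysqld >/dev/null 2>&1 || service mysql start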

We're in production and this runs OK for us, although of course we'd be glad to get rid of the random crashes entirely...
We're planning to update this week to the current release, 5.6.24-25.11.1.el6.

I'll report changes, hopefully improvements after updating...

Krunal Bauskar (krunal-bauskar) wrote:

commit d4fa27b020df425a6421ae12a18de9306cf5c90d
Merge: 9d30f12 f06d17f
Author: Krunal Bauskar <email address hidden>
Date: Thu Feb 18 06:35:17 2016 +0530

    Merge pull request #90 from kbauskar/5.6-pxc-497

    - PXC#497: Warning: "BF applier failed to open_and_lock_tables"

commit f06d17fd75c2aee6dbdbecd81e0795be1017284a
Author: Krunal Bauskar <email address hidden>
Date: Mon Feb 15 22:15:47 2016 +0530

    - PXC#497: Warning: "BF applier failed to open_and_lock_tables"

      Issue:
      -----

      Let me explain it with an example; that would be simpler.

      Let's assume there are 2 PXC nodes and the following workload is
      being executed:

      node-1: executes an already-prepared stmt and is about to commit
      (update i > 2 and i < 5)

      node-2: executes some stmt that conflicts with node-1 (insert (3)),
      inserting into the gap.

      As per the protocol, the local trx is selected for rollback, so when
      node-1 receives node-2's trx (an applier trx), node-1's local trx
      (the prepared stmt execution) is rolled back and re-tried as a
      replay trx.

      Note: being retried as a replay trx effectively means only the
      write-set is applied, which makes it a bit faster.
      Application of the write-set uses the apply-log-event path, which
      opens and locks tables, so there is no need to keep track of
      table-versioning changes, which is normally done (in the normal
      path) by installing a reprepare observer.

      Solution:
      ---------

      * Avoid installing the observer when a prepared trx is being
        replayed, as the trx will open and lock the tables.
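
For illustration, a minimal SQL sketch of the conflicting workload described in the commit message; the table and column names are made up:

    -- assumed schema: CREATE TABLE t (i INT PRIMARY KEY, val INT);

    -- node-1: executes a prepared update and is about to commit
    PREPARE upd FROM 'UPDATE t SET val = val + 1 WHERE i > 2 AND i < 5';
    START TRANSACTION;
    EXECUTE upd;
    -- ... node-2's conflicting write-set arrives here, before COMMIT

    -- node-2, concurrently: insert into the gap locked by node-1
    INSERT INTO t (i, val) VALUES (3, 0);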

Changed in percona-xtradb-cluster:
assignee: nobody → Krunal Bauskar (krunal-bauskar)
importance: Undecided → High
status: New → Fix Committed
Przemek (pmalkowski)
tags: added: i66365
Changed in percona-xtradb-cluster:
milestone: none → 5.6.29-25.15
Changed in percona-xtradb-cluster:
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1007
