Server crash when "START TRANSACTION WITH CONSISTENT SNAPSHOT" / mysqldump --single-transaction is executed while binlog is disabled

Bug #1353644 reported by Daniel Guzmán Burgos
56
This bug affects 8 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Invalid
Undecided
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
Critical
Alexey Kopytov

Bug Description

This applies to PXC 5.6.19-67.0-56-log Percona XtraDB Cluster (GPL), Release 25.6, wsrep_25.6.r4111

When binlog is disabled, and START TRANSACTION WITH CONSISTENT SNAPSHOT" is executed, the server crash with the following stack:

18:47:24 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=0
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61058 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f56b8b65eb0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f568ed24e00 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7f56b61d404c]
/usr/sbin/mysqld(handle_fatal_signal+0x3cb)[0x7f56b5f1aa2b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f56b43c5340]
/usr/sbin/mysqld(+0x6e4cd7)[0x7f56b617ecd7]
/usr/sbin/mysqld(+0x3a8a5a)[0x7f56b5e42a5a]
/usr/sbin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x1dd)[0x7f56b5fbeb8d]
/usr/sbin/mysqld(_Z28ha_start_consistent_snapshotP3THD+0x281)[0x7f56b5e47a51]
/usr/sbin/mysqld(_Z11trans_beginP3THDj+0x100)[0x7f56b6045910]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2b9b)[0x7f56b5fae0fb]
/usr/sbin/mysqld(+0x519ee8)[0x7f56b5fb3ee8]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1280)[0x7f56b5fb5680]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x16c)[0x7f56b5fb78fc]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x31d)[0x7f56b5f78d3d]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x7f56b5f78de0]
/usr/sbin/mysqld(pfs_spawn_thread+0x140)[0x7f56b644cfb0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f56b43bd182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f56b38ca30d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f5670004f80): is an invalid pointer
Connection ID (thread ID): 3
Status: NOT_KILLED

When binlog is enabled, the server doesn't crash

This applies only on PXC. Percona Server 5.6 doesn't seem to be affected by this issue.
Could be related with the following enhancement? https://blueprints.launchpad.net/percona-server/+spec/enhancements-for-start-transaction-with-consistent

PXC 5.6.15 doesn't seem to be affected

affects: percona-server → percona-xtradb-cluster
Revision history for this message
Daniel Guzmán Burgos (nethalo) wrote :

More info:

When using mysqldump with --single-transaction, the START TRANSACTION WITH CONSISTENT SNAPHOST is executed, thus server crash:

root@daniel-vbox:/etc/mysql# mysqldump --version
mysqldump Ver 10.13 Distrib 5.6.19-67.0, for debian-linux-gnu (x86_64)

root@daniel-vbox:/etc/mysql# mysqldump --single-transaction dani -uroot -ptoor
Warning: Using a password on the command line interface can be insecure.
-- MySQL dump 10.13 Distrib 5.6.19-67.0, for debian-linux-gnu (x86_64)
--
-- Host: localhost Database: dani
-- ------------------------------------------------------
-- Server version 5.6.19-67.0-56

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
mysqldump: Couldn't execute 'START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */': Lost connection to MySQL server during query (2013)

 mysql> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin | OFF |
+---------------+-------+
1 row in set (0,00 sec)

Stack:
Thread pointer: 0x7f218681c2a0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f215c89ee00 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7f2183d7c04c]
/usr/sbin/mysqld(handle_fatal_signal+0x3cb)[0x7f2183ac2a2b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f2181f6d340]
/usr/sbin/mysqld(+0x6e4cd7)[0x7f2183d26cd7]
/usr/sbin/mysqld(+0x3a8a5a)[0x7f21839eaa5a]
/usr/sbin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x1dd)[0x7f2183b66b8d]
/usr/sbin/mysqld(_Z28ha_start_consistent_snapshotP3THD+0x281)[0x7f21839efa51]
/usr/sbin/mysqld(_Z11trans_beginP3THDj+0x100)[0x7f2183bed910]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2b9b)[0x7f2183b560fb]
/usr/sbin/mysqld(+0x519ee8)[0x7f2183b5bee8]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1280)[0x7f2183b5d680]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x16c)[0x7f2183b5f8fc]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x31d)[0x7f2183b20d3d]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x7f2183b20de0]
/usr/sbin/mysqld(pfs_spawn_thread+0x140)[0x7f2183ff4fb0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f2181f65182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f218147230d]

http://www.percona.com/doc/percona-server/5.6/management/start_transaction_with_consistent_snapshot.html#mysqldump Documentation specifies that STWCS is used when --single-transaction AND --master-data are used, but i was able to reproduce this only with --single-transaction

summary: - Server crash when "START TRANSACTION WITH CONSISTENT SNAPSHOT" is
- executed while binlog is disable
+ Server crash when "START TRANSACTION WITH CONSISTENT SNAPSHOT" /
+ mysqldump --single-transaction is executed while binlog is disabled
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

(gdb) bt
#0 0x000000000092a2a5 in MYSQL_BIN_LOG::raw_get_current_log (this=this@entry=0x1570dc0 <mysql_bin_log>, linfo=0x7f28ac00d8a0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/binlog.cc:3547
#1 0x000000000092f865 in binlog_start_consistent_snapshot (hton=0x1cd60e0, thd=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/binlog.cc:1391
#2 0x000000000065336d in start_snapshot_handlerton (thd=thd@entry=0x351d0c0, plugin=plugin@entry=0x7f28ffe39288, arg=arg@entry=0x7f28ffe3930f)
    at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/handler.cc:2424
#3 0x0000000000793cbf in plugin_foreach_with_mask (thd=thd@entry=0x351d0c0, func=func@entry=0x653340 <start_snapshot_handlerton(THD*, plugin_ref, void*)>, type=type@entry=1, state_mask=4294967287,
    state_mask@entry=8, arg=arg@entry=0x7f28ffe3930f) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_plugin.cc:2094
#4 0x00000000006607f4 in ha_start_consistent_snapshot (thd=thd@entry=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/handler.cc:2447
#5 0x0000000000811878 in trans_begin (thd=thd@entry=0x351d0c0, flags=<optimized out>) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/transaction.cc:207
#6 0x0000000000786a86 in mysql_execute_command (thd=thd@entry=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_parse.cc:5051
#7 0x000000000078b928 in mysql_parse (thd=thd@entry=0x351d0c0, rawbuf=rawbuf@entry=0x7f28ac005140 "START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */", length=length@entry=54,
    parser_state=parser_state@entry=0x7f28ffe3a7b0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_parse.cc:7292
#8 0x000000000078bfa6 in wsrep_mysql_parse (thd=thd@entry=0x351d0c0, rawbuf=0x7f28ac005140 "START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */", length=length@entry=54,
    parser_state=parser_state@entry=0x7f28ffe3a7b0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_parse.cc:7043
#9 0x000000000078d9f6 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x351d0c0, packet=packet@entry=0x3520ef1 "START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */",
    packet_length=packet_length@entry=54) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_parse.cc:1629
#10 0x000000000078e229 in do_command (thd=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_parse.cc:1133
#11 0x0000000000757d82 in do_handle_one_connection (thd_arg=thd_arg@entry=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_connect.cc:1557
#12 0x0000000000757ec0 in handle_one_connection (arg=arg@entry=0x351d0c0) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/sql/sql_connect.cc:1461
#13 0x0000000000c11b83 in pfs_spawn_thread (arg=0x343b390) at /media/Oort/ncode/percona-xtradb-cluster/pxc56/storage/perfschema/pfs.cc:1860
#14 0x00007f2966fae124 in start_thread () from /usr/lib/libpthread.so.0
#15 0x00007f29653d84bd in clone () from /usr/lib/libc.so.6

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

=== modified file 'sql/binlog.cc'
--- sql/binlog.cc 2014-06-12 18:12:00 +0000
+++ sql/binlog.cc 2014-08-10 19:46:59 +0000
@@ -1387,10 +1387,12 @@

   binlog_cache_mngr * const cache_mngr= thd_get_cache_mngr(thd);

+ if (!cache_mngr->is_binlog_empty()) {
   /* Server layer calls us with LOCK_log locked, so this is safe. */
- mysql_bin_log.raw_get_current_log(&cache_mngr->binlog_info);
+ mysql_bin_log.raw_get_current_log(&cache_mngr->binlog_info);

- trans_register_ha(thd, TRUE, hton);
+ trans_register_ha(thd, TRUE, hton);
+ }

   DBUG_RETURN(err);
 }

fixes it. Need more review from Alexey on this.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

This code from binlog_init() is the root cause:

#ifdef WITH_WSREP
  if (WSREP_ON)
    binlog_hton->state= SHOW_OPTION_YES;
  else
  {
#endif /* WITH_WSREP */
  binlog_hton->state=opt_bin_log ? SHOW_OPTION_YES : SHOW_OPTION_NO;
#ifdef WITH_WSREP
  }
#endif /* WITH_WSREP */

That is, unlike PS, PXC has binlog emulation, which makes start_snapshot_handlerton() think it can request a consistent snapshot from the binlog 'handlerton', but the binlog is not actually initialized.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

I don't think this bug affects codership-mysql, because the crash occurs in Percona Server specific code (enhancements to binlog API).

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :

Ok, thanks for clarification.

Changed in codership-mysql:
status: New → Invalid
Revision history for this message
Jay Janssen (jay-janssen) wrote :

What PXC version is this released as?

Revision history for this message
Alexey Kopytov (akopytov) wrote :

As the milestone says, it _will_ be released in 5.6.20-25.7.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

The fix is available in experimental repo for testing: http://jenkins.percona.com/experimental/CENTOS/percona-experimental.repo

Revision history for this message
Jai Gupta (jai-g) wrote :

Tried again and mysqldump with --single-transaction worked perfectly. Thank you.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Fix for this is also available in percona-testing repo now for both debian and centos. Refer to documentation for details.

Revision history for this message
Walter Heck (walterheck) wrote :

When can this be pushed to stable? We're hurt by this and are effectively dead in the water: policy restrictions don't allow for upgrading the whole galera cluster to a testing repo version, disabling mysqldump for the time being ruins backups and enabling binlog is a no-no performance wise for this particular node.

Please release asap :)

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

The release is slated for Monday, 2nd September, 2014.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1711

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.