percona-server slave crashes when applying row-based binlog entries in cascading replication for Merge storage engine

Bug #1191263 reported by Mario Splivalo
This bug affects 2 people

Affects           Status        Importance  Assigned to  Milestone
MySQL Server      Unknown       Unknown
Percona Server    (moved to https://jira.percona.com/projects/PS; status tracked in 5.7)
  5.1             Invalid       Undecided   Unassigned
  5.5             Fix Released  Medium      Unassigned
  5.6             Fix Released  Medium      Unassigned
  5.7             Fix Released  Medium      Unassigned

Bug Description

In a cascading replication setup (master -> master/slave -> slave), the slave server crashes when applying row-based binlog entries that change data in MERGE engine tables. Note that the master/slave server (in the setup described) does not crash; only the slave (the last one in the chain) crashes.

Steps to reproduce:

- establish cascading replication between three servers: "master" -> "master/slave" -> "slave" (here, "master/slave" has both binlogs and relay logs enabled); a minimal configuration sketch is shown after these steps.
- create database 'mergetest' on "master" (it will propagate to all the slaves)
- create two MyISAM tables, t_1 and t_2, and a MERGE table t:

create table t_1 (id int primary key auto_increment, data varchar(200)) ENGINE=MyISAM;
create table t_2 (id int primary key auto_increment, data varchar(200)) ENGINE=MyISAM;
create table t (id int primary key auto_increment, data varchar(200)) ENGINE=MERGE UNION=(t_1,t_2) INSERT_METHOD=FIRST;

- fill the table with some data:
insert into t (data) values ('robor'), ('tobor'), ('dobor'), ('borbor');

- force a row-based binlog entry (an UPDATE with LIMIT is logged in row format under binlog_format=MIXED; alternatively, make sure binlog_format is set to 'row'):
update t set data = 'changed' limit 1;
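
For the first step (establishing the cascading replication), here is a minimal sketch of the replication-related settings for this topology. The hostnames, credentials and binlog coordinates below are placeholders, not values from the attached configs; the important part is log_slave_updates on the "master/slave" server, which makes it write the applied events into its own binlog so they reach the second slave.

# "master" my.cnf (replication-relevant parts only)
server_id     = 1
log_bin       = mysql-bin
binlog_format = ROW          # or MIXED together with the UPDATE ... LIMIT above

# "master/slave" my.cnf
server_id         = 2
log_bin           = mysql-bin
relay_log         = relay-bin
log_slave_updates = 1        # forwards applied events to the downstream slave
binlog_format     = ROW

# "slave" my.cnf (no binlogs, as in the attached config)
server_id = 3
relay_log = relay-bin

-- run on "master/slave" (pointing at "master"), then on "slave" (pointing at "master/slave"):
CHANGE MASTER TO
  MASTER_HOST='master.example.com',             -- placeholder host
  MASTER_USER='repl', MASTER_PASSWORD='repl',   -- placeholder replication account
  MASTER_LOG_FILE='mysql-bin.000001',           -- from SHOW MASTER STATUS on the upstream server
  MASTER_LOG_POS=4;
START SLAVE;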

The UPDATE above is propagated to the "master/slave" server with no issues; however, the "slave" server crashes when trying to apply it. Here is an excerpt from the slave's error log file:

------------
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fe541da99d0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x8bf70e]
/usr/sbin/mysqld(handle_fatal_signal+0x491)[0x699061]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf030)[0x7fe555d93030]
/usr/sbin/mysqld(_ZNK9table_def15compatible_withEP3THDP14Relay_log_infoP5TABLEPS5_+0xc7)[0x882d97]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x748)[0x86a618]
/usr/sbin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x67)[0x8633f7]
/usr/sbin/mysqld(_Z26apply_event_and_update_posPP9Log_eventP3THDP14Relay_log_info+0x1a6)[0x88e716]
/usr/sbin/mysqld(handle_slave_sql+0x9f5)[0x8910f5]
/usr/sbin/mysqld(pfs_spawn_thread+0x16f)[0x8f679f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7fe555d8ab50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe5544ada7d]
-------------
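
The backtrace shows the crash inside table_def::compatible_with() while Rows_log_event::do_apply_event() is applying the row event. To confirm which relay-log event the SQL thread was applying, the slave's relay log can be decoded with mysqlbinlog; this is only a sketch, with the relay-log file name and position as placeholders (take them from the error log or from SHOW SLAVE STATUS: Relay_Log_File / Relay_Log_Pos):

# decode the row event the SQL thread was about to apply (angle brackets are placeholders)
mysqlbinlog --verbose --base64-output=decode-rows \
    --start-position=<Relay_Log_Pos> <relay-log-file> | head -n 60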

This is verified and reproducible on:
Debian Squeeze with percona-server-server-5.5.31 ( 5.5.31-rel30.3-520.squeeze)
Debian Wheezy with percona-server-server-5.5.31 (5.5.31-rel30.3-520.wheezy)
Debian Wheezy with percona-server-server-5.6.11 (5.6.11-rc60.3-375.wheezy)

It is not reproducible with percona-server-5.1 (5.1.69-rel14.7-572.wheezy or 5.1.68-rel14.6-551.squeeze)

This bug seems(!) to be a duplicate of http://bugs.mysql.com/bug.php?id=47103 and/or of these bugs: https://bugs.launchpad.net/percona-server/+bug/1019125 and https://bugs.launchpad.net/percona-server/+bug/873044. The only difference is that all of those bugs mention only a master/slave setup; here the issue appears on the 'second' slave in the cascading replication setup.

Attached are the my.cnf files from all three servers (server_id and report_host are the only differences between the three files, and the slave-only server does not have binlogs enabled) as well as the error.log file from the slave (the last in the chain).

A cascading setup is used quite often in master-master scenarios where each master has several read-only slaves attached to it.

Tags: upstream
Mario Splivalo (mariosplivalo): description updated
Valerii Kravchuk (valerii-kravchuk) wrote:

I was able to reproduce this as described with PS 5.6.11 on a sandbox circular replication setup created like this:

make_replication_sandbox --circular=4 /home/openxs/builds/Percona-Server-5.6.11-rc60.3-387-valgrind.Linux.i686.tar.gz
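
The workload is the one from the bug description, run against a single node of the circle (presumably node1, two hops upstream of node3); condensed, it is:

create database mergetest;
use mergetest;
create table t_1 (id int primary key auto_increment, data varchar(200)) ENGINE=MyISAM;
create table t_2 (id int primary key auto_increment, data varchar(200)) ENGINE=MyISAM;
create table t (id int primary key auto_increment, data varchar(200)) ENGINE=MERGE UNION=(t_1,t_2) INSERT_METHOD=FIRST;
insert into t (data) values ('robor'), ('tobor'), ('dobor'), ('borbor');
update t set data = 'changed' limit 1;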

This is what I see in the error log on node3:

...
2013-06-25 16:10:39 5836 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000001' at position 10825, relay log './mysql_sandbox15103-relay-bin.000002' position: 8443
13:10:39 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68426 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xab74cd0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = ffffffffa5951ef4 thread_stack 0x30000
/home/openxs/builds/5.6.11/bin/mysqld(my_print_stacktrace+0x33)[0x855e3b3]
/home/openxs/builds/5.6.11/bin/mysqld(handle_fatal_signal+0x4aa)[0x830040a]
[0xb77ac400]
/home/openxs/builds/5.6.11/bin/mysqld[0x851ef6d]
/home/openxs/builds/5.6.11/bin/mysqld(_ZNK9table_def15compatible_withEP3THDP14Relay_log_infoP5TABLEPS5_+0xc2)[0x8520432]
/home/openxs/builds/5.6.11/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x765)[0x8506de5]
/home/openxs/builds/5.6.11/bin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x57)[0x85018e7]
/home/openxs/builds/5.6.11/bin/mysqld(_Z26apply_event_and_update_posPP9Log_eventP3THDP14Relay_log_info+0x1b7)[0x852b907]
/home/openxs/builds/5.6.11/bin/mysqld[0x8531740]
/home/openxs/builds/5.6.11/bin/mysqld(handle_slave_sql+0x726)[0x8535346]
/home/openxs/builds/5.6.11/bin/mysqld(pfs_spawn_thread+0x209)[0x85ebb19]
/lib/i386-linux-gnu/i686/cmov/libpthread.so.0(+0x5c39)[0xb778dc39]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(clone+0x5e)[0xb737078e]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 4
Status: NOT_KILLED
...

Valerii Kravchuk (valerii-kravchuk) wrote:

Same with PS 5.5.31:

...
130625 18:07:17 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000001' at position 6746, relay log './mysql_sandbox17103-relay-bin.000002' position: 4843
15:07:17 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 342315 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x95c4978
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = ffffffffa5963384 thread_stack 0x30000
/home/openxs/5.5.31/bin/mysqld(my_print_stacktrace+0x33)[0x843b473]
/home/openxs/5.5.31/bin/mysqld(handle_fatal_signal+0x43e)[0x82e51fe]
[0xb7717400]
/home/openxs/5.5.31/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x1098)[0x83b0428]
/home/openxs/5.5.31/bin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x193)[0x813f063]
/home/openxs/5.5.31/bin/mysqld[0x8147bd7]
/home/openxs/5.5.31/bin/mysqld(handle_slave_sql+0xea3)[0x8149293]
/lib/i386-linux-gnu/i686/cmov/libpthread.so.0(+0x5c39)[0xb76f8c39]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(clone+0x5e)[0xb73e478e]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 10
Status: NOT_KILLED

Valerii Kravchuk (valerii-kravchuk) wrote:

It seems the problem comes from the upstream MySQL server; see http://bugs.mysql.com/bug.php?id=69574.

tags: added: upstream
Shahriyar Rzayev (rzayev-sehriyar) wrote:

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-1385
