Percona Server with XtraDB

Slave server crash after an update statement

Reported by Miguel Angel Nieto on 2012-09-20
This bug affects 7 people
Affects: MySQL Server — Status: Unknown, Importance: Unknown
Affects: Percona Server — Importance: High, Assigned to: George Ormond Lorch III
Affects: Percona Server 5.5 series — Importance: High, Assigned to: George Ormond Lorch III

Bug Description

The latest 5.5.27-28.1 Percona Server crashes when the following combination occurs:

- an UPDATE statement that uses a user variable is executed on the master
- the slave server has replicate-wild-ignore-table or replicate-ignore-table set in my.cnf

MASTER: 5.5.24-rel26.0-log

(default sandbox my.cnf)

[mysql]
prompt='mysql [\h] {\u} (\d) > '
#

[client]
user = msandbox
password = msandbox
port = 5524
socket = /tmp/mysql_sandbox5524.sock

[mysqld]
user = root
port = 5524
socket = /tmp/mysql_sandbox5524.sock
basedir = /root/5.5.24
datadir = /root/sandboxes/msb_5_5_24/data
tmpdir = /root/sandboxes/msb_5_5_24/tmp
pid-file = /root/sandboxes/msb_5_5_24/data/mysql_sandbox5524.pid
#log-slow-queries = /root/sandboxes/msb_5_5_24/data/msandbox-slow.log
#log = /root/sandboxes/msb_5_5_24/data/msandbox.log
#
# additional options passed through 'my_clause'
#
log-error=msandbox.err
server_id=1
log-bin=a

SLAVE: 5.5.27-28.1 Percona Server

root@debian:/var/lib/mysql# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
innodb_file_per_table=1
innodb_file_format=barracuda
server_id=2
replicate-wild-ignore-table=test.%
replicate-ignore-table=mysql.user

master> create database mytest;
master> use mytest;
master> create table t (i int, a char(10), b char(11));
master> insert into t values (1,'xxx','yyy');
master> update mytest.t set a='xxx',b='yyy',i=@a:=@a+1 where i=1;

then, restart the MASTER:

master> use mytest;
master> update mytest.t set i=1;
master> update mytest.t set a='xxx',b='yyy',i=@a:=@a+1 where i=1;

then, on the SLAVE:

slave> stop slave;
slave> start slave;
slave> show slave status\G
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
ERROR:
Can't connect to the server

1:04:07 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338618 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2b04580
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f9e92a59ea8 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7c6395]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x6967e4]
/lib/libpthread.so.0(+0xeff0)[0x7f9ea2a33ff0]
/lib/libc.so.6(+0x10dd3a)[0x7f9ea1d0ad3a]
/usr/sbin/mysqld(_ZN5TABLE4initEP3THDP10TABLE_LIST+0x4c)[0x6111cc]
/usr/sbin/mysqld(_Z10open_tableP3THDP10TABLE_LISTP11st_mem_rootP18Open_table_context+0x148)[0x55eba8]
/usr/sbin/mysqld(_Z11open_tablesP3THDPP10TABLE_LISTPjjP19Prelocking_strategy+0x4f6)[0x560436]
/usr/sbin/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesbPySB_+0xa9)[0x6087a9]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x13e4)[0x5970d4]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x333)[0x59a853]
/usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0xd04)[0x745144]
/usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x125)[0x533815]
/usr/sbin/mysqld[0x538359]
/usr/sbin/mysqld(handle_slave_sql+0x949)[0x539799]
/lib/libpthread.so.0(+0x68ca)[0x7f9ea2a2b8ca]
/lib/libc.so.6(clone+0x6d)[0x7f9ea1ccc92d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (2b436a0): ?6?
Connection ID (thread ID): 1
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.

tags: added: i25112
tags: added: replication
summary: - Server crash after an update statement
+ Slave server crash after an update statement

I've reinstalled everything and now, following the same procedure, it doesn't crash; it seems something intermittent is going on.

description: updated

Finally, I've found it. Some restarts are required to trigger the bug. Description updated.

description: updated

The workaround is to disable the following parameters:

replicate-wild-ignore-table=test.%
replicate-ignore-table=mysql.user

and then mysqld starts working again.

Submitted a bug report to Oracle:

http://bugs.mysql.com/bug.php?id=66890

The slave binary is from the 64-bit Squeeze deb.

The top of the stacktrace refers to

  if (strcmp(alias, tl->alias))

at line 3337 in TABLE::init(THD *, TABLE_LIST *)

MTR test case:

[openxs@chief mysql-test]$ cat t/bug65831.test
source include/master-slave.inc;
--echo **** Master ****
connection master;
create database mytest;
use mytest;
CREATE TABLE t (a int, b int, c int);
INSERT INTO t VALUES(1, 2, 3);
UPDATE mytest.t set b=1, c=1, a=@var:=@var+1 where a=1;
SELECT * FROM t;
UPDATE t set a=1;
UPDATE mytest.t set b=1, c=1, a=@var:=@var+1 where a=1;
--echo **** Slave ****
# sync_slave_with_master;
connection slave;
stop slave;
start slave;
show slave status;

# Cleanup
connection master;
DROP TABLE t;

sync_slave_with_master;

Tom McCann (09-tom-an) wrote :

Not sure if the crash I'm getting is the same. We do have a couple of replicate-ignore-table entries in our my.cnf. This happens on 5.5.27-rel29.0-315.lucid and the latest 5.1. We downgraded to 5.5_5.5.24-rel26.0-256.lucid as a workaround. Here's what was in our error log:

22:21:16 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=536870912
read_buffer_size=131072
max_used_connections=2
max_threads=500
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1618416 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xb0cfa90
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f6a102d8838 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7d3545]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x69f394]
/lib/libpthread.so.0(+0xf8f0)[0x7f6b4740b8f0]
/usr/sbin/mysqld(_ZN11Query_arena10free_itemsEv+0x2d)[0x56ae7d]
/usr/sbin/mysqld(_ZN3THD19cleanup_after_queryEv+0x97)[0x56dc57]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x2c8)[0x59d5d8]
/usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0xd34)[0x7504b4]
/usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x12d)[0x534abd]
/usr/sbin/mysqld[0x539152]
/usr/sbin/mysqld(handle_slave_sql+0x9a5)[0x53a605]
/lib/libpthread.so.0(+0x69ca)[0x7f6b474029ca]
/lib/libc.so.6(clone+0x6d)[0x7f6b4669516d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (b0c7df0): `??

Connection ID (thread ID): 27
Status: NOT_KILLED

Roel Van de Paar (roel11) wrote :

For clarity, the original stacktrace cleaned up by running it through c++filt:

/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7c6395]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x6967e4]
/lib/libpthread.so.0(+0xeff0)[0x7f9ea2a33ff0]
/lib/libc.so.6(+0x10dd3a)[0x7f9ea1d0ad3a]
/usr/sbin/mysqld(TABLE::init(THD*, TABLE_LIST*)+0x4c)[0x6111cc]
/usr/sbin/mysqld(open_table(THD*, TABLE_LIST*, st_mem_root*, Open_table_context*)+0x148)[0x55eba8]
/usr/sbin/mysqld(open_tables(THD*, TABLE_LIST**, unsigned int*, unsigned int, Prelocking_strategy*)+0x4f6)[0x560436]
/usr/sbin/mysqld(mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigned long long*, unsigned long long*)+0xa9)[0x6087a9]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x13e4)[0x5970d4]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*)+0x333)[0x59a853]
/usr/sbin/mysqld(Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int)+0xd04)[0x745144]
/usr/sbin/mysqld(apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*)+0x125)[0x533815]
/usr/sbin/mysqld[0x538359]
/usr/sbin/mysqld(handle_slave_sql+0x949)[0x539799]
/lib/libpthread.so.0(+0x68ca)[0x7f9ea2a2b8ca]
/lib/libc.so.6(clone+0x6d)[0x7f9ea1ccc92d]

And, the stacktrace from #7 cleaned up:

/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7d3545]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x69f394]
/lib/libpthread.so.0(+0xf8f0)[0x7f6b4740b8f0]
/usr/sbin/mysqld(Query_arena::free_items()+0x2d)[0x56ae7d]
/usr/sbin/mysqld(THD::cleanup_after_query()+0x97)[0x56dc57]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*)+0x2c8)[0x59d5d8]
/usr/sbin/mysqld(Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int)+0xd34)[0x7504b4]
/usr/sbin/mysqld(apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*)+0x12d)[0x534abd]
/usr/sbin/mysqld[0x539152]
/usr/sbin/mysqld(handle_slave_sql+0x9a5)[0x53a605]
/lib/libpthread.so.0(+0x69ca)[0x7f6b474029ca]
/lib/libc.so.6(clone+0x6d)[0x7f6b4669516d]

The second stack trace is a different bug, which may or may not be related to bug #1042946 (it's a different stack, but it's reported as a Valgrind issue for the moment, so the crashing stack may be different).

Tom, if you are able to create a testcase, or provide us with a core dump + mysqld binary, it would help tremendously. In any case, a separate bug report should be logged for this.

  • Attachment: mysqlcrash.txt (28.2 KiB, text/plain)

Not sure if this helps, but attached is a copy of the mysqld crash log from when I was setting up a new slave.


Roel Van de Paar (roel11) wrote :

Tom; very interesting. Is there any possibility for us to get a copy of MASTER_LOG_FILE='mysql-bin.000078' and maybe 1 or 2 subsequent files? Was there any data already on the slave (copied over)? Lastly, was a core file generated? How large is it when compressed, with mysqld included? Could we obtain the same?

Tom McCann (09-tom-an) wrote :

Attached is a zip with 3 binlogs starting from 78. If there was a core dump, where would it be (on an Ubuntu 10.04 EC2 system)? The process I'm using to bring up new slaves may be flawed and may be what's causing this. I start a new EC2 instance from an AMI I made a few months ago (it has existing MySQL data on it). I bring over an xtrabackup from the master and restore it. When I start the slave after the restore (CHANGE MASTER / START SLAVE), it crashes almost immediately.

I've since downgraded this system to 5.5.24-55 but I can bring up another instance and give you access to it if that helps.

