Percona XtraDB Cluster - HA scalable solution for MySQL

Server deadlock while DDL

Reported by Raghavendra D Prabhu on 2013-08-16
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
Undecided
Seppo Jaakola
5.6
Undecided
Teemu Ollakka
Percona XtraDB Cluster
Status tracked in Trunk
5.6
Undecided
Unassigned
Trunk
Undecided
Unassigned

Bug Description

 TOI-ased DDL causes server hang. I will attach the backtrace.

Related branches

Backtrace/gdb bt of codership 5.5.32

Backtrace for PXC 5.5.33

Produced from

perl runall-new.pl --basedir=/chip --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=conf/galera/galera_stress.yy --gendata=conf/galera/galera_stress.zz --threads=16 --queries=10000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --seed=time --mysqld=--wsrep_provider_options="gcache.size=512M" --mysqld=--innodb_flush_method=O_DIRECT

The yy and zz files attached.

Thread apply all bt full of codership 5.5.32

Log of the command line.

GDB backtrace of node1

GDB backtrace for node2.

I have added backtrace for nodes 1 and 2 - this hang happens on node1 when DDL is added to the grammar. Without it, it works fine.

runall log for previous gdb backtraces.

From the trace it looks like, few threads are in galera_pre_commit, 2-3 in MDL_context::acquire_lock and and 3 in galera::ReplicatorSMM::to_isolation_begin. Since this lasts for more than 600s, it looks like a deadlock among the threads.

summary: - Server hung due to DDL
+ Server deadlock while DDL

Grammar files.

Changed in percona-xtradb-cluster:
milestone: none → 5.5.33-23.7.6
Download full text (17.6 KiB)

Wrt. table1000_innodb_key_pk_parts_2_int_autoinc:

===================================================================

Thread 14 (Thread 0x7f0570404700 (LWP 29937)):
#0 0x00007f05832c003f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1 0x00007f0580dbeb25 in wait (this=0x7f05703ffc70, cond=...) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x201da08, obj=...) at galera/src/monitor.hpp:121
#3 0x00007f0580db978a in galera::ReplicatorSMM::to_isolation_begin (this=0x201cf90, trx=0x7f052c022ea0) at galera/src/replicator_smm.cpp:965
#4 0x00007f0580dcd074 in galera_to_execute_start (gh=<optimized out>, conn_id=14, key=<optimized out>, key_len=1, query=0x7f052c024980, query_len=133, global_seqno=0x28424e8) at galera/src/wsrep_provider.cpp:583
#5 0x000000000068370b in wsrep_TOI_begin (table_list=0x0, table_=0x7f052c004d10 "f", db_=0x7f052c005270 "test", thd=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/wsrep_mysqld.cc:1115
#6 wsrep_to_isolation_begin (thd=thd@entry=0x2840900, db_=0x7f052c005270 "test", table_=0x7f052c004d10 "f", table_list=table_list@entry=0x0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/wsrep_mysqld.cc:1250
#7 0x000000000060fe24 in mysql_create_like_table (thd=thd@entry=0x2840900, table=table@entry=0x7f052c004d48, src_table=src_table@entry=0x7f052c005300, create_info=create_info@entry=0x7f0570402960) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_table.cc:4631
#8 0x000000000059f999 in mysql_execute_command (thd=thd@entry=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:3068
#9 0x00000000005a6e41 in mysql_parse (thd=thd@entry=0x2840900, rawbuf=rawbuf@entry=0x7f052c004c10 "CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc`", length=length@entry=65, parser_state=parser_state@entry=0x7f05704035e0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:6465
#10 0x00000000005a7ab3 in wsrep_mysql_parse (thd=thd@entry=0x2840900, rawbuf=0x7f052c004c10 "CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc`", length=65, parser_state=parser_state@entry=0x7f05704035e0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:6272
#11 0x00000000005a8a89 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x2840900, packet=packet@entry=0x27f13f1 " CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc` ", packet_length=packet_length@entry=67) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:1254
#12 0x00000000005aa994 in do_command (thd=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:885
#13 0x000000000065834a in do_handle_one_connection (thd_arg=thd_arg@entry=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_connect.cc:1512
#14 0x000000000065857e in handle_one_connection (arg=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/s...

Backtrace attached.

node0 vardir with logs.

GDB backtrace.

Download full text (3.5 KiB)

With wsrep_debug it crashed as:
.........

130827 17:26:43 [Note] WSREP: BF Aborted, thd: 25 is_AC: 0, retry: 0 - 1 SQL: UPDATE IGNORE `table100_innodb_int_autoinc` AS X SET `col_char_12` = 'v' WHERE X . `col_char_12_key` < 195 ORDER BY `col_char_12`,`col
 _char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 5
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_key_pk_parts_2_int#P#p0.
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_key_pk_parts_2_int#P#p1.
 130827 17:26:43 [ERROR] WSREP: galera/src/fsm.hpp:shift_to():77: FSM: no such a transition EXECUTING -> COMMITTED
 11:56:43 UTC - mysqld got signal 6 ;
 This could be because you hit a bug. It is also possible that this binary
 or one of the libraries it was linked against is corrupt, improperly built,
 or misconfigured. This error can also be caused by malfunctioning hardware.
 We will try our best to scrape up some info that will hopefully help
 diagnose the problem, but since we have already crashed,
 something is definitely wrong and this may fail.
 Please help us make Percona Server better by reporting any
 bugs at http://bugs.percona.com/

 key_buffer_size=8388608
 read_buffer_size=131072
 max_used_connections=20
 max_threads=153
 thread_count=19
 connection_count=19
 It is possible that mysqld could use up to
 key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 343058 K bytes of memory
 Hope that's ok; if not, decrease some variables in the equation.

 Thread pointer: 0x2b9d730
 Attempting backtrace. You can use the following information to find out
 where mysqld died. If you see no messages after this, something went
 terribly wrong...
 stack_bottom = 7ff73c0e1e68 thread_stack 0x40000
 /pxc/bin/mysqld(my_print_stacktrace+0x2c)[0x811e1a]
 /pxc/bin/mysqld(handle_fatal_signal+0x30f)[0x6d1c6b]
 /usr/lib/libpthread.so.0(+0xf870)[0x7ff74ea46870]
 /usr/lib/libc.so.6(gsignal+0x39)[0x7ff74d1f33d9]
 /usr/lib/libc.so.6(abort+0x148)[0x7ff74d1f47d8]
 /pxc/lib/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x17c)[0x7ff74c54475c]
 /pxc/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11post_commitEPNS_9TrxHandleE+0xcd)[0x7ff74c53af1d]
 /pxc/lib/libgalera_smm.so(galera_post_commit+0x46)[0x7ff74c54e2f6]
 /pxc/bin/mysqld(_Z25wsrep_cleanup_transactionP3THD+0x6a)[0x67dc5a]
 /pxc/bin/mysqld[0x6d3bbc]
 /pxc/bin/mysqld(_Z15ha_commit_transP3THDb+0x6a4)[0x6d59f4]
 /pxc/bin/mysqld(_Z21trans_commit_implicitP3THD+0x99)[0x6671cd]
 /pxc/bin/mysqld(_Z21mysql_execute_commandP3THD+0x4f9)[0x59e741]
 /pxc/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20a)[0x5a6e41]
 /pxc/bin/mysqld[0x5a7ab3]
 /pxc/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xa62)[0x5a8a89]
 /pxc/bin/mysqld(_Z10do_commandP3THD+0x6c0)[0x5aa994]
 /pxc/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x143)[0x65834a]
 /pxc/bin/mysqld(handle_one_connection+0x49)[0x65857e]
 /usr/lib/libpthread.so.0(+0x80a2)[0x7ff74ea3f0a2]
 /usr/lib/libc.so.6(clone+0x6d)[0x7ff74d2a2a2d]

 Trying to get some variables.
 Some pointers may be invalid and cause the dump to abort.
 Query (7ff6d0004c10): DROP TABLE IF EXISTS b
 Connection ID (thread ID...

Read more...

GDB trace of when it is hung, mysql.err to follow.

mysql error log.

Download full text (11.8 KiB)

Attached in #17 and #18, the logs for

 perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=conf/galera/galera_stress.yy --gendata=conf/galera/galera_stress.zz --threads=16 --queries=1000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-debug=1 --mysqld=--wsrep_provider_options='debug=1'
# 2013-08-27T17:41:22 [575] Using Log::Log4perl
# 2013-08-27T17:41:22 [575] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-27T17:41:22 [575] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-27T17:41:22 [575] Starting
# 2013-08-27T17:41:22 [575] # runall-new.pl \
# 2013-08-27T17:41:22 [575] # --basedir=/pxc \
# 2013-08-27T17:41:22 [575] # --galera=ms \
# 2013-08-27T17:41:22 [575] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-27T17:41:22 [575] # --grammar=conf/galera/galera_stress.yy \
# 2013-08-27T17:41:22 [575] # --gendata=conf/galera/galera_stress.zz \
# 2013-08-27T17:41:22 [575] # --threads=16 \
# 2013-08-27T17:41:22 [575] # --queries=1000 \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-27T17:41:22 [575] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep-debug=1 \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep_provider_options=debug=1
# 2013-08-27T17:41:22 [575] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-27T17:41:23 [575] /media/Tintin/Work/code/randgen/pxc Revno: 897
# 2013-08-27T17:41:23 [575] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-27T17:41:23 [575] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-27T17:41:23 [575] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-27T17:41:31 [575] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-27T17:41:38 [575] MySQL Version:5.5.33
# 2013-08-27T17:41:38 [575] Binary: /pxc/bin/mysqld
# 2013-08-27T17:41:38 [575] Type: Debug
# 2013-08-27T17:41:38 [575] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-27T17:41:38 [575] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-27T17:41:38 [575] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-27T17:41:38 [575] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_address=127.0.0.1 --w...

vardir containing mysql error and general log of node0 and node1

Noticed

"
130827 20:31:21 [Note] [Debug] WSREP: galera/src/wsrep_provider.cpp:galera_post_rollback():323: trx 18446744073709551615 not found
"

in the logs.

Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
status: New → In Progress
Seppo Jaakola (seppo-jaakola) wrote :

 The problem seems to happen after a COMMIT processing was interrupted (e.g. certification failure). When such commit processing returns, it does not always release transactional MDL locks. So some MDL locks will remain even for the next queries in the connection.

And later some DDL processor get blocked because of the MDL lock.

Seppo Jaakola (seppo-jaakola) wrote :

A potential fix for releasing the transactional MDL locks was pushed in revision: http://bazaar.launchpad.net/~codership/codership-mysql/5.5-23/revision/3907

I am still able to reproduce with ROW_TYPE_COMPRESSED or ROW_TYPE_DYNAMIC (in table formats). Attached the trace.

Download full text (16.1 KiB)

Log of

perl runall-new.pl --basedir=/pxc --galera=ms --threads=16 --queries=50000 9.26s user 8.95s system 5% cpu 5:10.36 total
Archie% perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=$HOME/galera_stress.yy --gendata=/tmp/galera_stress.zz --threads=16 --queries=50000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-debug=1 --mysqld=--wsrep_provider_options='debug=1'
# 2013-08-30T14:15:57 [1169] Using Log::Log4perl
# 2013-08-30T14:15:57 [1169] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-30T14:15:57 [1169] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-30T14:15:57 [1169] Starting
# 2013-08-30T14:15:57 [1169] # runall-new.pl \
# 2013-08-30T14:15:57 [1169] # --basedir=/pxc \
# 2013-08-30T14:15:57 [1169] # --galera=ms \
# 2013-08-30T14:15:57 [1169] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-30T14:15:57 [1169] # --grammar=/home/raghavendra/galera_stress.yy \
# 2013-08-30T14:15:57 [1169] # --gendata=/tmp/galera_stress.zz \
# 2013-08-30T14:15:57 [1169] # --threads=16 \
# 2013-08-30T14:15:57 [1169] # --queries=50000 \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-30T14:15:57 [1169] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep-debug=1 \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep_provider_options=debug=1
# 2013-08-30T14:15:57 [1169] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-30T14:15:57 [1169] /media/Tintin/Work/code/randgen/pxc Revno: 903
# 2013-08-30T14:15:57 [1169] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-30T14:15:57 [1169] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-30T14:15:57 [1169] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-30T14:16:04 [1169] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-30T14:16:10 [1169] MySQL Version:5.5.33
# 2013-08-30T14:16:10 [1169] Binary: /pxc/bin/mysqld
# 2013-08-30T14:16:10 [1169] Type: Debug
# 2013-08-30T14:16:10 [1169] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-30T14:16:10 [1169] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-30T14:16:10 [1169] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-30T14:16:10 [1169] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_l...

Download full text (26.9 KiB)

Transition crash while
==============================

perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=$HOME/galera_stress.yy --gendata=$HOME/galera_stress.zz --threads=16 --queries=50000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--lock_wait_timeout=50 --mem
# 2013-08-30T17:51:21 [48533] Using Log::Log4perl
# 2013-08-30T17:51:21 [48533] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-30T17:51:21 [48533] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-30T17:51:21 [48533] Starting
# 2013-08-30T17:51:21 [48533] # runall-new.pl \
# 2013-08-30T17:51:21 [48533] # --basedir=/pxc \
# 2013-08-30T17:51:21 [48533] # --galera=ms \
# 2013-08-30T17:51:21 [48533] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-30T17:51:21 [48533] # --grammar=/home/raghavendra/galera_stress.yy \
# 2013-08-30T17:51:21 [48533] # --gendata=/home/raghavendra/galera_stress.zz \
# 2013-08-30T17:51:21 [48533] # --threads=16 \
# 2013-08-30T17:51:21 [48533] # --queries=50000 \
# 2013-08-30T17:51:21 [48533] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-30T17:51:21 [48533] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-30T17:51:21 [48533] # --mysqld=--lock_wait_timeout=50 \
# 2013-08-30T17:51:21 [48533] # --mem
# 2013-08-30T17:51:21 [48533] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-30T17:51:21 [48533] /media/Tintin/Work/code/randgen/pxc Revno: 906
# 2013-08-30T17:51:21 [48533] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-30T17:51:21 [48533] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-30T17:51:21 [48533] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-30T17:51:28 [48533] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-30T17:51:45 [48533] MySQL Version:5.5.33
# 2013-08-30T17:51:45 [48533] Binary: /pxc/bin/mysqld
# 2013-08-30T17:51:45 [48533] Type: Debug
# 2013-08-30T17:51:45 [48533] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-30T17:51:45 [48533] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-30T17:51:45 [48533] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-30T17:51:45 [48533] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_addr...

Download full text (7.1 KiB)

Based on the logs, it looks like a DDL MDL conflict may be
causing this.

 130902 22:13:29 [Note] WSREP: BF kill (1, seqno: 5216), victim: (25) trx: 7836
 130902 22:13:29 [Note] WSREP: Aborting query: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: victim 7836 in state 7
 130902 22:13:29 [Note] WSREP: MDL conflict db=test table=r ticket=4 solved by abort
 130902 22:13:29 [Note] WSREP: MDL conflict
 request: (14 seqno 5216 wsrep (2, 1, 0) cmd 3 9 DROP TABLE IF EXISTS r)
 granted: (25 seqno 5220 wsrep (0, 1, 7) cmd 3 31 SET AUTOCOMMIT=ON)
 130902 22:13:29 [Note] WSREP: MDL ticket: type: shared write space: TABLE db: test name: r (Waiting for table metadata lock)
 130902 22:13:29 [Note] WSREP: DROP caused BF abort
 130902 22:13:29 [Note] WSREP: MDL ticket: type: shared write space: TABLE db: test name: r (Waiting for table metadata lock)
 130902 22:13:29 [Note] WSREP: wsrep_abort_thd, by: 140481475598080, victim: 140481472669440
 130902 22:13:29 [Note] WSREP: abort transaction: BF: DROP TABLE IF EXISTS r victim: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: cluster conflict due to high priority abort for threads:
 130902 22:13:29 [Note] WSREP: Winning thread:
    THD: 14, mode: total order, state: executing, conflict: no conflict, seqno: 5216
    SQL: DROP TABLE IF EXISTS r
 130902 22:13:29 [Note] WSREP: Victim thread:
    THD: 25, mode: local, state: executing, conflict: cert failure, seqno: 5220
    SQL: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: BF kill (1, seqno: 5216), victim: (25) trx: 7836
 130902 22:13:29 [Note] WSREP: Aborting query: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: victim 7836 in state 7
 130902 22:13:29 [Note] WSREP: MDL victim killed at add_ticket
 130902 22:13:29 [Note] WSREP: cert failure, thd: 25 is_AC: 0, retry: 0 - 1 SQL: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5210, 0 : ALTER TABLE n RENAME `table1_innodb_compressed_int_autoinc`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5211, 0 : DROP TABLE `table1_innodb_default_key_pk_parts_2_int_autoinc`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5213, 0 : CREATE TABLE n LIKE `table1_innodb_compressed_key_pk_parts_2_int`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5158, 0 : DROP TABLE IF EXISTS t
 130902 22:13:29 [Note] WSREP: empty rbr buffer, query: DROP TABLE IF EXISTS c
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5214, 0 : DROP TABLE IF EXISTS c
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_default_char.
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_default_char.
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5154, 0 : DROP TABLE IF EXISTS l
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Sort aborted: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Sort aborted: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [Note] WSREP: TO BEGIN: 5212, 0 : DROP TABLE IF EXISTS y
 130902 22:14:20 [Note] WSREP: TO BEGIN: 5215, 0 : DROP TABLE IF EXISTS h

 I...

Read more...

Following grammars:

a)
$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['COMPRESSED','DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)'],
        indexes => [undef, 'key' ]
};

$data = {
        numbers => [ 'digit', 'null', undef ],
        strings => [ 'letter', 'english', 'char(12)' ]
}
b)

$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)', 'varchar', 'date', 'bigint', 'enum' ],
        indexes => [undef, 'key' ],
        charsets => ['utf8','latin1']
};

$data = {
        numbers => [ 'digit', 'null', undef ],
 temporals => ['date', 'year', 'null', undef ],
        enum => [ 'letter' ],
        strings => [ 'letter', 'english', 'char(12)', 'varchar(246)', 'varchar(100)' ]
}

still hang the server (after latest fix).

If timeout is used, everything works fine (till now).

After latest commit, the combined grammar of

$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['COMPRESSED','DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)', 'varchar', 'date', 'bigint', 'enum' ],
        indexes => [undef, 'key' ],
        charsets => ['utf8','latin1']
};

$data = {
        numbers => [ 'digit', 'null', undef ],
 temporals => ['date', 'year', 'null', undef ],
        enum => [ 'letter' ],
        strings => [ 'letter', 'english', 'char(12)', 'varchar(246)', 'varchar(100)' ]
}

hangs.

Combination (b) doesn't

Combination (a) causes crash - reported here - lp:1123233

Seppo Jaakola (seppo-jaakola) wrote :

Pushed fixes for releasing MDL after failed BEGIN/START TRANSACTION and "SET AUTOCOMMIT". Fix is in revision: http://bazaar.launchpad.net/~codership/codership-mysql/5.5-23/revision/3912

No hangs after last fix. Related details also posted here https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1123233/comments/40

Changed in percona-xtradb-cluster:
status: New → In Progress
Download full text (35.9 KiB)

I re-added temporary table since https://bugs.launchpad.net/codership-mysql/+bug/1212247 has been fixed (since that was for temporary tables). Having temporary tables, produces the same deadlock

============================================================================================

Running with galera grammar (yy and zz attached).

# 2013-09-16T09:50:01 [20538] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-09-16T09:50:01 [20538] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-09-16T09:50:01 [20538] Starting
# 2013-09-16T09:50:01 [20538] # runall-new.pl \
# 2013-09-16T09:50:01 [20538] # --basedir=/mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64 \
# 2013-09-16T09:50:01 [20538] # --galera=ms \
# 2013-09-16T09:50:01 [20538] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-09-16T09:50:01 [20538] # --grammar=conf/galera/galera_stress.yy \
# 2013-09-16T09:50:01 [20538] # --gendata=conf/galera/galera_stress.zz \
# 2013-09-16T09:50:01 [20538] # --threads=16 \
# 2013-09-16T09:50:01 [20538] # --queries=50000 \
# 2013-09-16T09:50:01 [20538] # --mysqld=--wsrep-provider=/mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/lib/libgalera_smm.so \
# 2013-09-16T09:50:01 [20538] # --seed=time \
# 2013-09-16T09:50:01 [20538] # --mysqld=--innodb_flush_method=O_DIRECT
# 2013-09-16T09:50:01 [20538] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/pxcgen Revno: 915
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/pxcgen Revision-Id: <email address hidden>
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64 does not look like a bzr branch, cannot get revision info.
# 2013-09-16T09:50:02 [20538] Creating MySQL 5.5.33 database at /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node0/data
# 2013-09-16T09:50:08 [20538] Creating MySQL 5.5.33 database at /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node1/data
# 2013-09-16T09:50:20 [20538] MySQL Version:5.5.33
# 2013-09-16T09:50:20 [20538] Binary: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/bin/mysqld
# 2013-09-16T09:50:20 [20538] Type: Release
# 2013-09-16T09:50:20 [20538] Datadir: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node0/data
# 2013-09-16T09:50:20 [20538] Tmpdir: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-te...

Changed in percona-xtradb-cluster:
milestone: 5.5.33-23.7.6 → future-5.5

I see deadlock with 5.6 tree as well, so the fix will need to ported there.

Teemu Ollakka (teemu-ollakka) wrote :

Merged lp:~codership/codership-mysql/5.5-23 revs 3097, 3012 to lp:codership-mysql/5.6 (rev 3969).

See also https://bugs.launchpad.net/codership-mysql/+bug/1239571 which fixes deadlock caused by TOI DDL in 5.6 branch.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers