Server deadlock while DDL

Bug #1212955 reported by Raghavendra D Prabhu
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
Fix Released
Undecided
Seppo Jaakola
5.6
Fix Released
Undecided
Teemu Ollakka
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Fix Released
Undecided
Unassigned
5.6
Fix Released
Undecided
Unassigned

Bug Description

 TOI-ased DDL causes server hang. I will attach the backtrace.

Related branches

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Backtrace/gdb bt of codership 5.5.32

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Backtrace for PXC 5.5.33

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Produced from

perl runall-new.pl --basedir=/chip --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=conf/galera/galera_stress.yy --gendata=conf/galera/galera_stress.zz --threads=16 --queries=10000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --seed=time --mysqld=--wsrep_provider_options="gcache.size=512M" --mysqld=--innodb_flush_method=O_DIRECT

The yy and zz files attached.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Thread apply all bt full of codership 5.5.32

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Log of the command line.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

GDB backtrace of node1

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

GDB backtrace for node2.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I have added backtrace for nodes 1 and 2 - this hang happens on node1 when DDL is added to the grammar. Without it, it works fine.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

runall log for previous gdb backtraces.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

From the trace it looks like, few threads are in galera_pre_commit, 2-3 in MDL_context::acquire_lock and and 3 in galera::ReplicatorSMM::to_isolation_begin. Since this lasts for more than 600s, it looks like a deadlock among the threads.

summary: - Server hung due to DDL
+ Server deadlock while DDL
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Grammar files.

Changed in percona-xtradb-cluster:
milestone: none → 5.5.33-23.7.6
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (17.6 KiB)

Wrt. table1000_innodb_key_pk_parts_2_int_autoinc:

===================================================================

Thread 14 (Thread 0x7f0570404700 (LWP 29937)):
#0 0x00007f05832c003f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1 0x00007f0580dbeb25 in wait (this=0x7f05703ffc70, cond=...) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x201da08, obj=...) at galera/src/monitor.hpp:121
#3 0x00007f0580db978a in galera::ReplicatorSMM::to_isolation_begin (this=0x201cf90, trx=0x7f052c022ea0) at galera/src/replicator_smm.cpp:965
#4 0x00007f0580dcd074 in galera_to_execute_start (gh=<optimized out>, conn_id=14, key=<optimized out>, key_len=1, query=0x7f052c024980, query_len=133, global_seqno=0x28424e8) at galera/src/wsrep_provider.cpp:583
#5 0x000000000068370b in wsrep_TOI_begin (table_list=0x0, table_=0x7f052c004d10 "f", db_=0x7f052c005270 "test", thd=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/wsrep_mysqld.cc:1115
#6 wsrep_to_isolation_begin (thd=thd@entry=0x2840900, db_=0x7f052c005270 "test", table_=0x7f052c004d10 "f", table_list=table_list@entry=0x0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/wsrep_mysqld.cc:1250
#7 0x000000000060fe24 in mysql_create_like_table (thd=thd@entry=0x2840900, table=table@entry=0x7f052c004d48, src_table=src_table@entry=0x7f052c005300, create_info=create_info@entry=0x7f0570402960) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_table.cc:4631
#8 0x000000000059f999 in mysql_execute_command (thd=thd@entry=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:3068
#9 0x00000000005a6e41 in mysql_parse (thd=thd@entry=0x2840900, rawbuf=rawbuf@entry=0x7f052c004c10 "CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc`", length=length@entry=65, parser_state=parser_state@entry=0x7f05704035e0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:6465
#10 0x00000000005a7ab3 in wsrep_mysql_parse (thd=thd@entry=0x2840900, rawbuf=0x7f052c004c10 "CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc`", length=65, parser_state=parser_state@entry=0x7f05704035e0) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:6272
#11 0x00000000005a8a89 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x2840900, packet=packet@entry=0x27f13f1 " CREATE TABLE f LIKE `table1000_innodb_key_pk_parts_2_int_autoinc` ", packet_length=packet_length@entry=67) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:1254
#12 0x00000000005aa994 in do_command (thd=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_parse.cc:885
#13 0x000000000065834a in do_handle_one_connection (thd_arg=thd_arg@entry=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/sql/sql_connect.cc:1512
#14 0x000000000065857e in handle_one_connection (arg=0x2840900) at /media/Tintin/Work/code/percona-xtradb-cluster/trunk/Percona-Server/s...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Backtrace attached.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

node0 vardir with logs.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

GDB backtrace.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (3.5 KiB)

With wsrep_debug it crashed as:
.........

130827 17:26:43 [Note] WSREP: BF Aborted, thd: 25 is_AC: 0, retry: 0 - 1 SQL: UPDATE IGNORE `table100_innodb_int_autoinc` AS X SET `col_char_12` = 'v' WHERE X . `col_char_12_key` < 195 ORDER BY `col_char_12`,`col
 _char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 5
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_key_pk_parts_2_int#P#p0.
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_key_pk_parts_2_int#P#p1.
 130827 17:26:43 [ERROR] WSREP: galera/src/fsm.hpp:shift_to():77: FSM: no such a transition EXECUTING -> COMMITTED
 11:56:43 UTC - mysqld got signal 6 ;
 This could be because you hit a bug. It is also possible that this binary
 or one of the libraries it was linked against is corrupt, improperly built,
 or misconfigured. This error can also be caused by malfunctioning hardware.
 We will try our best to scrape up some info that will hopefully help
 diagnose the problem, but since we have already crashed,
 something is definitely wrong and this may fail.
 Please help us make Percona Server better by reporting any
 bugs at http://bugs.percona.com/

 key_buffer_size=8388608
 read_buffer_size=131072
 max_used_connections=20
 max_threads=153
 thread_count=19
 connection_count=19
 It is possible that mysqld could use up to
 key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 343058 K bytes of memory
 Hope that's ok; if not, decrease some variables in the equation.

 Thread pointer: 0x2b9d730
 Attempting backtrace. You can use the following information to find out
 where mysqld died. If you see no messages after this, something went
 terribly wrong...
 stack_bottom = 7ff73c0e1e68 thread_stack 0x40000
 /pxc/bin/mysqld(my_print_stacktrace+0x2c)[0x811e1a]
 /pxc/bin/mysqld(handle_fatal_signal+0x30f)[0x6d1c6b]
 /usr/lib/libpthread.so.0(+0xf870)[0x7ff74ea46870]
 /usr/lib/libc.so.6(gsignal+0x39)[0x7ff74d1f33d9]
 /usr/lib/libc.so.6(abort+0x148)[0x7ff74d1f47d8]
 /pxc/lib/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x17c)[0x7ff74c54475c]
 /pxc/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM11post_commitEPNS_9TrxHandleE+0xcd)[0x7ff74c53af1d]
 /pxc/lib/libgalera_smm.so(galera_post_commit+0x46)[0x7ff74c54e2f6]
 /pxc/bin/mysqld(_Z25wsrep_cleanup_transactionP3THD+0x6a)[0x67dc5a]
 /pxc/bin/mysqld[0x6d3bbc]
 /pxc/bin/mysqld(_Z15ha_commit_transP3THDb+0x6a4)[0x6d59f4]
 /pxc/bin/mysqld(_Z21trans_commit_implicitP3THD+0x99)[0x6671cd]
 /pxc/bin/mysqld(_Z21mysql_execute_commandP3THD+0x4f9)[0x59e741]
 /pxc/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20a)[0x5a6e41]
 /pxc/bin/mysqld[0x5a7ab3]
 /pxc/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xa62)[0x5a8a89]
 /pxc/bin/mysqld(_Z10do_commandP3THD+0x6c0)[0x5aa994]
 /pxc/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x143)[0x65834a]
 /pxc/bin/mysqld(handle_one_connection+0x49)[0x65857e]
 /usr/lib/libpthread.so.0(+0x80a2)[0x7ff74ea3f0a2]
 /usr/lib/libc.so.6(clone+0x6d)[0x7ff74d2a2a2d]

 Trying to get some variables.
 Some pointers may be invalid and cause the dump to abort.
 Query (7ff6d0004c10): DROP TABLE IF EXISTS b
 Connection ID (thread ID...

Read more...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

GDB trace of when it is hung, mysql.err to follow.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

mysql error log.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (11.8 KiB)

Attached in #17 and #18, the logs for

 perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=conf/galera/galera_stress.yy --gendata=conf/galera/galera_stress.zz --threads=16 --queries=1000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-debug=1 --mysqld=--wsrep_provider_options='debug=1'
# 2013-08-27T17:41:22 [575] Using Log::Log4perl
# 2013-08-27T17:41:22 [575] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-27T17:41:22 [575] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-27T17:41:22 [575] Starting
# 2013-08-27T17:41:22 [575] # runall-new.pl \
# 2013-08-27T17:41:22 [575] # --basedir=/pxc \
# 2013-08-27T17:41:22 [575] # --galera=ms \
# 2013-08-27T17:41:22 [575] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-27T17:41:22 [575] # --grammar=conf/galera/galera_stress.yy \
# 2013-08-27T17:41:22 [575] # --gendata=conf/galera/galera_stress.zz \
# 2013-08-27T17:41:22 [575] # --threads=16 \
# 2013-08-27T17:41:22 [575] # --queries=1000 \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-27T17:41:22 [575] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep-debug=1 \
# 2013-08-27T17:41:22 [575] # --mysqld=--wsrep_provider_options=debug=1
# 2013-08-27T17:41:22 [575] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-27T17:41:23 [575] /media/Tintin/Work/code/randgen/pxc Revno: 897
# 2013-08-27T17:41:23 [575] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-27T17:41:23 [575] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-27T17:41:23 [575] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-27T17:41:31 [575] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-27T17:41:38 [575] MySQL Version:5.5.33
# 2013-08-27T17:41:38 [575] Binary: /pxc/bin/mysqld
# 2013-08-27T17:41:38 [575] Type: Debug
# 2013-08-27T17:41:38 [575] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-27T17:41:38 [575] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-27T17:41:38 [575] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-27T17:41:38 [575] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_address=127.0.0.1 --w...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

vardir containing mysql error and general log of node0 and node1

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Noticed

"
130827 20:31:21 [Note] [Debug] WSREP: galera/src/wsrep_provider.cpp:galera_post_rollback():323: trx 18446744073709551615 not found
"

in the logs.

Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
status: New → In Progress
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

 The problem seems to happen after a COMMIT processing was interrupted (e.g. certification failure). When such commit processing returns, it does not always release transactional MDL locks. So some MDL locks will remain even for the next queries in the connection.

And later some DDL processor get blocked because of the MDL lock.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

A potential fix for releasing the transactional MDL locks was pushed in revision: http://bazaar.launchpad.net/~codership/codership-mysql/5.5-23/revision/3907

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I am still able to reproduce with ROW_TYPE_COMPRESSED or ROW_TYPE_DYNAMIC (in table formats). Attached the trace.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (16.1 KiB)

Log of

perl runall-new.pl --basedir=/pxc --galera=ms --threads=16 --queries=50000 9.26s user 8.95s system 5% cpu 5:10.36 total
Archie% perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=$HOME/galera_stress.yy --gendata=/tmp/galera_stress.zz --threads=16 --queries=50000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-debug=1 --mysqld=--wsrep_provider_options='debug=1'
# 2013-08-30T14:15:57 [1169] Using Log::Log4perl
# 2013-08-30T14:15:57 [1169] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-30T14:15:57 [1169] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-30T14:15:57 [1169] Starting
# 2013-08-30T14:15:57 [1169] # runall-new.pl \
# 2013-08-30T14:15:57 [1169] # --basedir=/pxc \
# 2013-08-30T14:15:57 [1169] # --galera=ms \
# 2013-08-30T14:15:57 [1169] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-30T14:15:57 [1169] # --grammar=/home/raghavendra/galera_stress.yy \
# 2013-08-30T14:15:57 [1169] # --gendata=/tmp/galera_stress.zz \
# 2013-08-30T14:15:57 [1169] # --threads=16 \
# 2013-08-30T14:15:57 [1169] # --queries=50000 \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-30T14:15:57 [1169] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep-debug=1 \
# 2013-08-30T14:15:57 [1169] # --mysqld=--wsrep_provider_options=debug=1
# 2013-08-30T14:15:57 [1169] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-30T14:15:57 [1169] /media/Tintin/Work/code/randgen/pxc Revno: 903
# 2013-08-30T14:15:57 [1169] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-30T14:15:57 [1169] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-30T14:15:57 [1169] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-30T14:16:04 [1169] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-30T14:16:10 [1169] MySQL Version:5.5.33
# 2013-08-30T14:16:10 [1169] Binary: /pxc/bin/mysqld
# 2013-08-30T14:16:10 [1169] Type: Debug
# 2013-08-30T14:16:10 [1169] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-30T14:16:10 [1169] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-30T14:16:10 [1169] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-30T14:16:10 [1169] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_l...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (26.9 KiB)

Transition crash while
==============================

perl runall-new.pl --basedir=/pxc --galera=ms --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --grammar=$HOME/galera_stress.yy --gendata=$HOME/galera_stress.zz --threads=16 --queries=50000 --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--lock_wait_timeout=50 --mem
# 2013-08-30T17:51:21 [48533] Using Log::Log4perl
# 2013-08-30T17:51:21 [48533] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-08-30T17:51:21 [48533] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-08-30T17:51:21 [48533] Starting
# 2013-08-30T17:51:21 [48533] # runall-new.pl \
# 2013-08-30T17:51:21 [48533] # --basedir=/pxc \
# 2013-08-30T17:51:21 [48533] # --galera=ms \
# 2013-08-30T17:51:21 [48533] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-08-30T17:51:21 [48533] # --grammar=/home/raghavendra/galera_stress.yy \
# 2013-08-30T17:51:21 [48533] # --gendata=/home/raghavendra/galera_stress.zz \
# 2013-08-30T17:51:21 [48533] # --threads=16 \
# 2013-08-30T17:51:21 [48533] # --queries=50000 \
# 2013-08-30T17:51:21 [48533] # --mysqld=--wsrep-provider=/pxc/lib/libgalera_smm.so \
# 2013-08-30T17:51:21 [48533] # --mysqld=--innodb_flush_method=O_DIRECT \
# 2013-08-30T17:51:21 [48533] # --mysqld=--lock_wait_timeout=50 \
# 2013-08-30T17:51:21 [48533] # --mem
# 2013-08-30T17:51:21 [48533] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-08-30T17:51:21 [48533] /media/Tintin/Work/code/randgen/pxc Revno: 906
# 2013-08-30T17:51:21 [48533] /media/Tintin/Work/code/randgen/pxc Revision-Id: <email address hidden>
# 2013-08-30T17:51:21 [48533] /pxc does not look like a bzr branch, cannot get revision info.
# 2013-08-30T17:51:21 [48533] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node0/data
# 2013-08-30T17:51:28 [48533] Creating MySQL 5.5.33 database at /pxc/mysql-test/var/node1/data
# 2013-08-30T17:51:45 [48533] MySQL Version:5.5.33
# 2013-08-30T17:51:45 [48533] Binary: /pxc/bin/mysqld
# 2013-08-30T17:51:45 [48533] Type: Debug
# 2013-08-30T17:51:45 [48533] Datadir: /pxc/mysql-test/var/node0/data
# 2013-08-30T17:51:45 [48533] Tmpdir: /pxc/mysql-test/var/node0/tmp
# 2013-08-30T17:51:45 [48533] Corefile: /pxc/mysql-test/var/node0/data/core.
# 2013-08-30T17:51:45 [48533] Starting MySQL 5.5.33: "/pxc/bin/mysqld" "--no-defaults" "--basedir=/pxc" "--datadir=/pxc/mysql-test/var/node0/data" "--lc-messages-dir=/pxc/share/mysql" "--character-sets-dir=/pxc/share/mysql/charsets" "--tmpdir=/pxc/mysql-test/var/node0/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19300" "--socket=/pxc/mysql-test/var/node0/mysql.sock" "--pid-file=/pxc/mysql-test/var/node0/mysql.pid" "--general-log" "--general-log-file=/pxc/mysql-test/var/node0/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_addr...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (7.1 KiB)

Based on the logs, it looks like a DDL MDL conflict may be
causing this.

 130902 22:13:29 [Note] WSREP: BF kill (1, seqno: 5216), victim: (25) trx: 7836
 130902 22:13:29 [Note] WSREP: Aborting query: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: victim 7836 in state 7
 130902 22:13:29 [Note] WSREP: MDL conflict db=test table=r ticket=4 solved by abort
 130902 22:13:29 [Note] WSREP: MDL conflict
 request: (14 seqno 5216 wsrep (2, 1, 0) cmd 3 9 DROP TABLE IF EXISTS r)
 granted: (25 seqno 5220 wsrep (0, 1, 7) cmd 3 31 SET AUTOCOMMIT=ON)
 130902 22:13:29 [Note] WSREP: MDL ticket: type: shared write space: TABLE db: test name: r (Waiting for table metadata lock)
 130902 22:13:29 [Note] WSREP: DROP caused BF abort
 130902 22:13:29 [Note] WSREP: MDL ticket: type: shared write space: TABLE db: test name: r (Waiting for table metadata lock)
 130902 22:13:29 [Note] WSREP: wsrep_abort_thd, by: 140481475598080, victim: 140481472669440
 130902 22:13:29 [Note] WSREP: abort transaction: BF: DROP TABLE IF EXISTS r victim: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: cluster conflict due to high priority abort for threads:
 130902 22:13:29 [Note] WSREP: Winning thread:
    THD: 14, mode: total order, state: executing, conflict: no conflict, seqno: 5216
    SQL: DROP TABLE IF EXISTS r
 130902 22:13:29 [Note] WSREP: Victim thread:
    THD: 25, mode: local, state: executing, conflict: cert failure, seqno: 5220
    SQL: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: BF kill (1, seqno: 5216), victim: (25) trx: 7836
 130902 22:13:29 [Note] WSREP: Aborting query: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: victim 7836 in state 7
 130902 22:13:29 [Note] WSREP: MDL victim killed at add_ticket
 130902 22:13:29 [Note] WSREP: cert failure, thd: 25 is_AC: 0, retry: 0 - 1 SQL: SET AUTOCOMMIT=ON
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5210, 0 : ALTER TABLE n RENAME `table1_innodb_compressed_int_autoinc`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5211, 0 : DROP TABLE `table1_innodb_default_key_pk_parts_2_int_autoinc`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5213, 0 : CREATE TABLE n LIKE `table1_innodb_compressed_key_pk_parts_2_int`
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5158, 0 : DROP TABLE IF EXISTS t
 130902 22:13:29 [Note] WSREP: empty rbr buffer, query: DROP TABLE IF EXISTS c
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5214, 0 : DROP TABLE IF EXISTS c
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_default_char.
 InnoDB: DEBUG: update_statistics for test/table1000_innodb_default_char.
 130902 22:13:29 [Note] WSREP: TO BEGIN: 5154, 0 : DROP TABLE IF EXISTS l
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Sort aborted: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [ERROR] /pxc/bin/mysqld: Sort aborted: Lock wait timeout exceeded; try restarting transaction
 130902 22:14:20 [Note] WSREP: TO BEGIN: 5212, 0 : DROP TABLE IF EXISTS y
 130902 22:14:20 [Note] WSREP: TO BEGIN: 5215, 0 : DROP TABLE IF EXISTS h

 I...

Read more...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Following grammars:

a)
$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['COMPRESSED','DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)'],
        indexes => [undef, 'key' ]
};

$data = {
        numbers => [ 'digit', 'null', undef ],
        strings => [ 'letter', 'english', 'char(12)' ]
}
b)

$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)', 'varchar', 'date', 'bigint', 'enum' ],
        indexes => [undef, 'key' ],
        charsets => ['utf8','latin1']
};

$data = {
        numbers => [ 'digit', 'null', undef ],
 temporals => ['date', 'year', 'null', undef ],
        enum => [ 'letter' ],
        strings => [ 'letter', 'english', 'char(12)', 'varchar(246)', 'varchar(100)' ]
}

still hang the server (after latest fix).

If timeout is used, everything works fine (till now).

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

After latest commit, the combined grammar of

$tables = {
        partitions => [ undef , 'KEY (pk) PARTITIONS 2' ],
        engines => [ 'InnoDB' ],
        pk => [ 'int', 'int auto_increment' ],
        rows => [ 1, 100, 1000 ],
        row_formats => ['COMPRESSED','DEFAULT']
};

$fields = {
        types => [ 'int', 'char(12)', 'varchar', 'date', 'bigint', 'enum' ],
        indexes => [undef, 'key' ],
        charsets => ['utf8','latin1']
};

$data = {
        numbers => [ 'digit', 'null', undef ],
 temporals => ['date', 'year', 'null', undef ],
        enum => [ 'letter' ],
        strings => [ 'letter', 'english', 'char(12)', 'varchar(246)', 'varchar(100)' ]
}

hangs.

Combination (b) doesn't

Combination (a) causes crash - reported here - lp:1123233

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Pushed fixes for releasing MDL after failed BEGIN/START TRANSACTION and "SET AUTOCOMMIT". Fix is in revision: http://bazaar.launchpad.net/~codership/codership-mysql/5.5-23/revision/3912

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

No hangs after last fix. Related details also posted here https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1123233/comments/40

Changed in percona-xtradb-cluster:
status: New → In Progress
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (35.9 KiB)

I re-added temporary table since https://bugs.launchpad.net/codership-mysql/+bug/1212247 has been fixed (since that was for temporary tables). Having temporary tables, produces the same deadlock

============================================================================================

Running with galera grammar (yy and zz attached).

# 2013-09-16T09:50:01 [20538] Copyright (c) 2010,2011 Oracle and/or its affiliates. All rights reserved. Use is subject to license terms.
# 2013-09-16T09:50:01 [20538] Please see http://forge.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework.
# 2013-09-16T09:50:01 [20538] Starting
# 2013-09-16T09:50:01 [20538] # runall-new.pl \
# 2013-09-16T09:50:01 [20538] # --basedir=/mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64 \
# 2013-09-16T09:50:01 [20538] # --galera=ms \
# 2013-09-16T09:50:01 [20538] # --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock \
# 2013-09-16T09:50:01 [20538] # --grammar=conf/galera/galera_stress.yy \
# 2013-09-16T09:50:01 [20538] # --gendata=conf/galera/galera_stress.zz \
# 2013-09-16T09:50:01 [20538] # --threads=16 \
# 2013-09-16T09:50:01 [20538] # --queries=50000 \
# 2013-09-16T09:50:01 [20538] # --mysqld=--wsrep-provider=/mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/lib/libgalera_smm.so \
# 2013-09-16T09:50:01 [20538] # --seed=time \
# 2013-09-16T09:50:01 [20538] # --mysqld=--innodb_flush_method=O_DIRECT
# 2013-09-16T09:50:01 [20538] master_port : 19300 slave_port : 19302 ports : 19300 19302 MTR_BUILD_THREAD : 930
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/pxcgen Revno: 915
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/pxcgen Revision-Id: <email address hidden>
# 2013-09-16T09:50:02 [20538] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64 does not look like a bzr branch, cannot get revision info.
# 2013-09-16T09:50:02 [20538] Creating MySQL 5.5.33 database at /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node0/data
# 2013-09-16T09:50:08 [20538] Creating MySQL 5.5.33 database at /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node1/data
# 2013-09-16T09:50:20 [20538] MySQL Version:5.5.33
# 2013-09-16T09:50:20 [20538] Binary: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/bin/mysqld
# 2013-09-16T09:50:20 [20538] Type: Release
# 2013-09-16T09:50:20 [20538] Datadir: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-test/var/node0/data
# 2013-09-16T09:50:20 [20538] Tmpdir: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/release/Host/centos6-64/Percona-XtraDB-Cluster-5.5.33-23.7.6.489.Linux.x86_64/mysql-te...

Changed in percona-xtradb-cluster:
milestone: 5.5.33-23.7.6 → future-5.5
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I see deadlock with 5.6 tree as well, so the fix will need to ported there.

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :

Merged lp:~codership/codership-mysql/5.5-23 revs 3097, 3012 to lp:codership-mysql/5.6 (rev 3969).

See also https://bugs.launchpad.net/codership-mysql/+bug/1239571 which fixes deadlock caused by TOI DDL in 5.6 branch.

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1421

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.