Percona XtraDB Cluster - HA scalable solution for MySQL

Protocol::end_statement(): Assertion `0' failed

Reported by Raghavendra D Prabhu on 2013-08-09
Affects / Importance / Assigned to:
- MySQL patches by Codership (status tracked in Trunk): Undecided, Teemu Ollakka
- Percona XtraDB Cluster (status tracked in Trunk):
  - 5.6: Undecided, Unassigned
  - Trunk: Undecided, Unassigned

Bug Description

The following crash occurs on a particular workload:

===============================================================================

InnoDB: DEBUG: update_statistics for test/o#P#p1.
InnoDB: DEBUG: update_statistics for test/#sql-54bf_8#P#p0.
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Deadlock found when trying to get lock; try restarting transaction
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Sort aborted: Deadlock found when trying to get lock; try restarting transaction
InnoDB: DEBUG: update_statistics for test/#sql-54bf_8#P#p1.
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Deadlock found when trying to get lock; try restarting transaction
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Sort aborted: Deadlock found when trying to get lock; try restarting transaction
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Deadlock found when trying to get lock; try restarting transaction
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Sort aborted: Deadlock found when trying to get lock; try restarting transaction
InnoDB: DEBUG: update_statistics for test/#sql-54bf_8#P#p0.
InnoDB: DEBUG: update_statistics for test/#sql-54bf_8#P#p1.
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Deadlock found when trying to get lock; try restarting transaction
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Sort aborted: Deadlock found when trying to get lock; try restarting transaction
InnoDB: DEBUG: update_statistics for test/table100_innodb_int.
InnoDB: There was a problem in converting'table0_innodb_key_pk_parts_2_int#P#p0' in charset filename to charset utf8_general_ciInnoDB: DEBUG: update_statistics for test/h#P#p1.
InnoDB: There was a problem in converting'table0_innodb_key_pk_parts_2_int#P#p1' in charset filename to charset utf8_general_ciInnoDB: DEBUG: update_statistics for test/table0_innodb_key_pk_parts_2_int#P#p0.
InnoDB: DEBUG: update_statistics for test/table0_innodb_key_pk_parts_2_int#P#p1.
InnoDB: DEBUG: update_statistics for test/table1_innodb_key_pk_parts_2_int_autoinc#P#p0.
InnoDB: DEBUG: update_statistics for test/table1_innodb_key_pk_parts_2_int_autoinc#P#p1.
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Got error -1 from storage engine
130809 7:45:19 [ERROR] /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld: Sort aborted: Query execution was interrupted
mysqld: /home/jenkins/workspace/build-xtradb-cluster-binaries/BUILD_TYPE/debug/label_exp/centos6-64/Percona-XtraDB-Cluster-5.5.32/sql/protocol.cc:526: void Protocol::end_statement(): Assertion `0' failed.
07:45:19 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=19
max_threads=153
thread_count=19
connection_count=19
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 343052 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x363d460
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fd8501a6d98 thread_stack 0x40000
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(my_print_stacktrace+0x35)[0x80dcf5]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(handle_fatal_signal+0x314)[0x6d2ef8]
/lib64/libpthread.so.0[0x395d40f500]
/lib64/libc.so.6(gsignal+0x35)[0x395d0328a5]
/lib64/libc.so.6(abort+0x175)[0x395d034085]
/lib64/libc.so.6[0x395d02ba1e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x395d02bae0]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(_ZN8Protocol13end_statementEv+0x25b)[0x530bb1]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x189a)[0x5b53c8]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(_Z10do_commandP3THD+0x6e0)[0x5b670c]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x137)[0x65d586]
/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld(handle_one_connection+0x52)[0x65d786]
/lib64/libpthread.so.0[0x395d407851]
/lib64/libc.so.6(clone+0x6d)[0x395d0e890d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fd7dc004c10): DELETE IGNORE FROM `table10000_innodb_int_autoinc` WHERE `col_int_key` IN ( 'a' , 4 , 'm' , 'c' , 1 ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 4
Connection ID (thread ID): 14
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file
# 2013-08-09T07:45:22 [21379] datadir is /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/mysql-test/var/node0/data/
# 2013-08-09T07:45:22 [21379] binary is /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/../Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin/mysqld
# 2013-08-09T07:45:22 [21379] bindir is /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/../Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/bin
# 2013-08-09T07:45:22 [21379] core is /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/mysql-test/var/node0/data/core.21695
Missing separate debuginfo for /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/lib/libgalera_smm.so
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/4f/2008dcf03d5b6eea1276ab5975898133bb0895
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/8e/312e8752e924c26341440ec3a032bc0e20cba3
16 /home/jenkins/workspace/build-xtradb-cluster-binaries/BUILD_TYPE/debug/label_exp/centos6-64/Percona-XtraDB-Cluster-5.5.32/sql/main.cc: No such file or directory.

========================================================================================

Related branches

Crash log with full backtrace

The patch here - https://bazaar.launchpad.net/~codership/codership-mysql/wsrep-5.5/revision/3855 - can be used to prevent debug builds from crashing for now (to get to other bugs).


Got a crash in a similar context:

130810 21:29:15 [Warning] WSREP: wsrep_sst_receive_address is set to '127.0.0.1:4952' which makes it impossible for another host to reach this one. Please set it to the address which this node can be connected at by other cluster members.
130810 21:29:15 [Note] WSREP: Read nil XID from storage engines, skipping position init
130810 21:29:15 [Note] WSREP: wsrep_load(): loading provider library '/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/Percona-XtraDB-Cluster-5.5.32-23.7.5.455.Linux.x86_64/lib/libgalera_smm.so'
130810 21:29:15 [Note] WSREP: wsrep_load(): Galera 2.6(r152) by Codership Oy <email address hidden> loaded succesfully.
130810 21:29:15 [Warning] WSREP: Could not open saved state file for reading: /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/19/tmp.8Y31eZntFq/current1_4/node0/data//grastate.dat
130810 21:29:15 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130810 21:29:15 [Note] WSREP: Preallocating 134219048/134219048 bytes in '/mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/19/tmp.8Y31eZntFq/current1_4/node0/data//galera.cache'...
130810 21:29:17 [Note] WSREP: Passing config to GCS: base_host = 127.0.0.1; base_port = 4567; cert.log_conflicts = no; gcache.dir = /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/19/tmp.8Y31eZntFq/current1_4/node0/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /mnt/workspace/PXC-5.5-mrandgen/BTYPE/debug/Host/centos6-64/19/tmp.8Y31eZntFq/current1_4/node0/data//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130810 21:29:17 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130810 21:29:17 [Note] WSREP: wsrep_sst_grab()
130810 21:29:17 [Note] WSREP: Start replication
130810 21:29:17 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130810 21:29:17 [Note] WSREP: protonet asio version 0
130810 21:29:17 [Note] WSREP: backend: asio
130810 21:29:17 [Note] WSREP: GMCast version 0
130810 21:29:17 [Note] WSREP: (e965cfcd-0203-11e3-a79f-0f51a1ebdb52, 'tcp://127.0.0.1:4949') listening at tcp://127.0.0.1:4949
130810 21:29:17 [Note] WSREP: (e965cfcd-0203-11e3-a79f-0f51a1ebdb52, 'tcp://127.0.0.1:4949') multicast: , ttl: 1
130810 21:29:17 [Note] WSREP: EVS version 0
130810 21:29:17 [Note] WSREP: PC version 0
130810 21:29:17 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer ''
130810 21:29:17 [Note] WSREP: Node e965cfcd-0203-11e3-a79f-0f51a1ebdb52 state prim
130810 21:29:17 [Note] WSREP: view(view_id(PRIM,e965cfcd-0203-11e3-a79f-0f51a1ebdb52,1) memb {
 e965cfcd-0203-11e3-a79f-0f51a1ebdb52,
} joined {
} left {
} partitioned {
})
130810 21:29:17 [Note] WSREP: gcomm: connected
130810 21:29:17 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
130810 21:29:17 [Note] WSREP: Shifting CLO...

Attached vardir for the previous crash.

Creating debug builds with WSREP_WARN enabled so that more info can be obtained.

Got this:

01:29:01 130811 1:28:54 [Warning] WSREP: Should not be here:
01:29:01 DA status 0,
01:29:01 wsrep_applier 0,
01:29:01 wsrep_exec_mode 0,
01:29:01 wsrep_query_state 1,
01:29:01 wsrep_conflict_state 0

Full log here http://jenkins.percona.com/job/PXC-5.5-mrandgen/BTYPE=debug,Host=centos6-64/20/consoleText

Teemu Ollakka (teemu-ollakka) wrote :

Tested with branch lp:codership-mysql/5.5:
The warning "should not be here" seems to appear each time transaction replay ends. This hints that a my_ok() call is missing from wsrep_replay_transaction().

Could not reproduce with 5.5-23 branch.
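The diagnosis above can be illustrated with a small self-contained sketch. This is not actual MySQL source: DiagnosticsArea, end_statement(), and replay_transaction() below are simplified stand-ins modelling why a replay path that never records a statement outcome would trip the `Assertion `0' failed` in Protocol::end_statement() on a debug build.

```cpp
#include <stdexcept>

// Simplified model of the diagnostics-area state machine:
// end_statement() expects the statement to have recorded OK or an
// error before the reply is sent to the client.
enum class DaStatus { EMPTY, OK, ERROR_SET };

struct DiagnosticsArea {
    DaStatus status = DaStatus::EMPTY;
    void set_ok()    { status = DaStatus::OK; }        // stand-in for my_ok()
    void set_error() { status = DaStatus::ERROR_SET; } // stand-in for my_error()
};

// Returns true when the statement outcome was recorded; the throw here
// stands in for the assert(0) abort seen in the report.
bool end_statement(const DiagnosticsArea& da) {
    switch (da.status) {
    case DaStatus::OK:
    case DaStatus::ERROR_SET:
        return true;  // outcome known, reply can be sent
    case DaStatus::EMPTY:
        throw std::logic_error("DA status 0: should not be here");
    }
    return false;
}

// Hypothetical replay wrapper: when call_my_ok is false it mimics the
// suspected bug, i.e. replay finishing without recording an outcome.
void replay_transaction(DiagnosticsArea& da, bool call_my_ok) {
    // ... replay work would happen here ...
    if (call_my_ok)
        da.set_ok();  // the call the comment above suggests is missing
}
```

Under this model, a replay that ends without the set_ok() call leaves the diagnostics area EMPTY, and the subsequent end_statement() aborts exactly once per replay, matching the observed "each time transaction replaying ends" pattern.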

Changed in codership-mysql:
status: New → Confirmed
assignee: nobody → Teemu Ollakka (teemu-ollakka)

@Vadim

Thanks, but in that bug the crash was due to 'UPDATE IGNORE', whereas here it is due to 'DELETE IGNORE'. Still, removing IGNORE seems to have fixed it there. So I will let this remain a duplicate, test again after merging the fix from that issue, and reopen if it recurs.

@Teemu,

Thanks, I see your fix in the 5.6 tree; I will test there as well.

Unduplicating this for 5.6 as per #8.

no longer affects: percona-xtradb-cluster/5.6

Actually, #8 is only about codership/5.5, so 5.6 is not affected. Keeping this issue open until the WSREP_WARN is removed and/or it is retested after merging with PS 5.5.


Hitting this with PS 5.6 - due to https://bugs.launchpad.net/percona-server/+bug/1168265

"/media/Tintin/Work/code/randgen/pxc" is not a core dump: Is a directory
backtrace.gdb:19: Error in sourced command file:
No stack.
# 2013-10-03T01:59:25 [6270] Whether to confirm potentially dangerous operations is on.
"/media/Tintin/Work/code/randgen/pxc" is not a core dump: Is a directory
# 2013-10-03T01:59:26 [6270] Whether to confirm potentially dangerous operations is on.
# 2013-10-03T01:59:26 [6270] The last 100 lines from /pxc56/mysql-test/var/node0/data//../mysql.err :
2013-10-03 01:59:04 6691 [Note] WSREP: Aborting query: START TRANSACTION
2013-10-03 01:59:04 6691 [Note] WSREP: withdraw for BF trx: 4038, state: 4
2013-10-03 01:59:04 6691 [Note] WSREP: victim 4038 in state 4
WSREP: conc slot cancel not supported
2013-10-03 01:59:04 6691 [Note] WSREP: MDL conflict
request: (13 seqno 271 wsrep (2, 1, 0) cmd 3 9 DROP TABLE `table1000_innodb_int`)
granted: (16 seqno 267 wsrep (0, 1, 1) cmd 3 6 INSERT INTO p SELECT * FROM `table1000_innodb_int`)
2013-10-03 01:59:04 6691 [Note] WSREP: MDL ticket: type: shared read space: TABLE db: test name: table1000_innodb_int
2013-10-03 01:59:04 6691 [Note] WSREP: DROP caused BF abort
2013-10-03 01:59:04 6691 [Note] WSREP: MDL ticket: type: shared read space: TABLE db: test name: table1000_innodb_int
2013-10-03 01:59:04 6691 [Note] WSREP: wsrep_abort_thd, by: 139989696304896, victim: 139989694711552
2013-10-03 01:59:04 6691 [Note] WSREP: abort transaction: BF: DROP TABLE `table1000_innodb_int` victim: INSERT INTO p SELECT * FROM `table1000_innodb_int`
2013-10-03 01:59:04 6691 [Note] WSREP: cluster conflict due to high priority abort for threads:
2013-10-03 01:59:04 6691 [Note] WSREP: Winning thread:
   THD: 13, mode: total order, state: executing, conflict: no conflict, seqno: 271
   SQL: DROP TABLE `table1000_innodb_int`
2013-10-03 01:59:04 6691 [Note] WSREP: Victim thread:
   THD: 16, mode: local, state: executing, conflict: must abort, seqno: 267
   SQL: INSERT INTO p SELECT * FROM `table1000_innodb_int`
2013-10-03 01:59:04 6691 [Note] WSREP: BF kill (1, seqno: 271), victim: (16) trx: 4210
2013-10-03 01:59:04 6691 [Note] WSREP: Aborting query: INSERT INTO p SELECT * FROM `table1000_innodb_int`
2013-10-03 01:59:04 6691 [Note] WSREP: victim 4210 in MUST ABORT state
WSREP: conc slot cancel not supported
2013-10-03 01:59:04 6691 [Note] WSREP: client rollback due to BF abort for (14), query: DELETE IGNORE FROM `table1000_innodb_int` WHERE `col_int_key` IN ( 620298240 , 3424452608 , 214 , 144 , 7 ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 3
2013-10-03 01:59:04 6691 [Note] WSREP: abort in exec query state, avoiding autocommit
2013-10-03 01:59:04 6691 [Note] WSREP: BF Aborted, thd: 14 is_AC: 1, retry: 1 - 1 SQL: DELETE IGNORE FROM `table1000_innodb_int` WHERE `col_int_key` IN ( 620298240 , 3424452608 , 214 , 144 , 7 ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 3
2013-10-03 01:59:04 6691 [Note] WSREP: releasing retry_query: conf 0 sent 0 kill 0 errno 0 SQL DELETE IGNORE FROM `table1000_innodb_int` WHERE `col_int_key` IN...


Repeatable even with the 5.6.15 merge.
