Deadlock with pt-table-sync --execute on PXC

Bug #1250450 reported by Daniël van Eeden on 2013-11-12
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit
Undecided
Unassigned

Bug Description

Setup:
PXC Cluster1: galera1, galera2, galera3
PXC Cluster2: galera4, galera5, galera6

galera4 is a slave of galera1.

PXC Version: 5.5.34-23.7.6-565.precise
Percona Toolkit Version: 2.2.5.

A checksum between the master (galera1) and the slave (galera4) works, but "pt-table-sync --execute" results in a deadlock. Executing the statments gathered with "pt-table-sync --print" works fine.

"show engine innodb status" does not show deadlocks on galera4, so it are deadlocks from galera, not from InnoDB. (Innodb_deadlocks status is also 0)

root@galera4:~# pt-table-checksum --recursion-method=cluster -d test1 h=galera1,u=percona,p=percona
Not checking replica lag on galera2 because it is a cluster node.
Not checking replica lag on galera3 because it is a cluster node.
            TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
11-12T12:56:33 0 0 2 1 0 0.063 test1.t1

root@galera4:~# pt-table-sync --print --replicate percona.checksums -d test1 h=galera1,u=percona,p=percona
REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('1', 'test-galera2-2') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15654 user:root host:galera4*/;
REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('4', 'test-galera3') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15654 user:root host:galera4*/;
root@galera4:~# pt-table-sync --execute --replicate percona.checksums -d test1 h=galera1,u=percona,p=percona
Deadlock found when trying to get lock; try restarting transaction at line 6115 while doing test1.t1 on galera4
root@galera4:~#
root@galera4:~# pt-table-sync --print --replicate percona.checksums --sync-to-master -d test1 h=galera4,u=percona,p=percona
REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('1', 'test-galera2-2') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:P=3306,h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15664 user:root host:galera4*/;
REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('4', 'test-galera3') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:P=3306,h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15664 user:root host:galera4*/;
root@galera4:~# pt-table-sync --execute --replicate percona.checksums --sync-to-master -d test1 h=galera4,u=percona,p=percona
Deadlock found when trying to get lock; try restarting transaction at line 6115 while doing test1.t1 on galera4
root@galera4:~#
root@galera4:~# tail -f /var/lib/mysql/galera4.log &
[1] 15675
root@galera4:~# 56 Query SET @crc := '', @cnt := 0
     56 Query commit
     56 Query SELECT MASTER_POS_WAIT('galera1-bin.000002', 62308, 60)
     56 Query SELECT /*test1.t1:1/1*/ 0 AS chunk_num, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, `name`, CONCAT(ISNULL(`name`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `test1`.`t1` FORCE INDEX (`PRIMARY`) WHERE (1=1) LOCK IN SHARE MODE
     56 Query SET @crc := '', @cnt := 0
     56 Query SELECT /*rows in chunk*/ `id`, `name`, CRC32(CONCAT_WS('#', `id`, `name`, CONCAT(ISNULL(`name`)))) AS __crc FROM `test1`.`t1` FORCE INDEX (`PRIMARY`) WHERE (1=1) ORDER BY `id` LOCK IN SHARE MODE
     56 Query commit
     56 Quit
     57 Query commit
     57 Quit

root@galera4:~#
root@galera4:~#
root@galera4:~# pt-table-sync --execute --replicate percona.checksums -d test1 h=galera1,u=percona,p=percona
131112 13:17:13 58 Connect percona@galera4 on
     58 Query set autocommit=1
     58 Query SELECT @@SQL_MODE
     58 Query SHOW VARIABLES LIKE 'wait\_timeout'
     58 Query SET SESSION wait_timeout=10000
     58 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO'*/
     58 Query SELECT @@SERVER_ID
     58 Query SELECT db, tbl, CONCAT(db, '.', tbl) AS `table`, chunk, chunk_index, lower_boundary, upper_boundary, COALESCE(this_cnt-master_cnt, 0) AS cnt_diff, COALESCE(this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc), 0) AS crc_diff, this_cnt, master_cnt, this_crc, master_crc FROM percona.checksums WHERE master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)
     59 Connect percona@galera4 on
     59 Query set autocommit=0
     59 Query SELECT @@SQL_MODE
     59 Query SHOW VARIABLES LIKE 'wait\_timeout'
     59 Query SET SESSION wait_timeout=10000
     59 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO'*/
     59 Query /*!40101 SET @@SQL_MODE := CONCAT(@@SQL_MODE, ',NO_AUTO_VALUE_ON_ZERO')*/
     59 Query SHOW VARIABLES LIKE 'version%'
     59 Query SHOW ENGINES
     59 Query SHOW VARIABLES LIKE 'innodb_version'
     59 Query SELECT @@binlog_format
     59 Query SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
     58 Query SHOW TABLES FROM `test1` LIKE 't1'
     58 Query SHOW VARIABLES LIKE 'version%'
     58 Query SHOW ENGINES
     58 Query SHOW VARIABLES LIKE 'innodb_version'
     58 Query /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := REPLACE(REPLACE(@@SQL_MODE, 'ANSI_QUOTES', ''), ',,', ','), @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
     58 Query SHOW TRIGGERS FROM `test1`
     58 Query /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
     58 Query set autocommit=0
     58 Query SELECT CRC32('test-string')
     58 Query USE `test1`
     58 Query SET @crc := '', @cnt := 0
     58 Query commit
     58 Query SELECT MASTER_POS_WAIT('galera1-bin.000002', 62308, 60)
     58 Query SELECT /*test1.t1:1/1*/ 0 AS chunk_num, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, `name`, CONCAT(ISNULL(`name`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `test1`.`t1` FORCE INDEX (`PRIMARY`) WHERE (1=1) LOCK IN SHARE MODE
     58 Query SET @crc := '', @cnt := 0
     58 Query SELECT /*rows in chunk*/ `id`, `name`, CRC32(CONCAT_WS('#', `id`, `name`, CONCAT(ISNULL(`name`)))) AS __crc FROM `test1`.`t1` FORCE INDEX (`PRIMARY`) WHERE (1=1) ORDER BY `id` LOCK IN SHARE MODE
Deadlock found when trying to get lock; try restarting transaction at line 6115 while doing test1.t1 on galera4
     58 Query commit
     58 Quit
     59 Query commit
     59 Quit
root@galera4:~# fg
tail -f /var/lib/mysql/galera4.log
^C
root@galera4:~# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 61
Server version: 5.5.34-34.1-log Percona XtraDB Cluster (GPL), Release 34.1, wsrep_23.7.6.r3936

Copyright (c) 2009-2013 Percona LLC and/or its affiliates
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@galera4 [(none)] > REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('1', 'test-galera2-2') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15654 user:root host:galera4*/;
Query OK, 2 rows affected (0.01 sec)

root@galera4 [(none)] > REPLACE INTO `test1`.`t1`(`id`, `name`) VALUES ('4', 'test-galera3') /*percona-toolkit src_db:test1 src_tbl:t1 src_dsn:h=galera1,p=...,u=percona dst_db:test1 dst_tbl:t1 dst_dsn:h=galera4,p=...,u=percona lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15654 user:root host:galera4*/;
Query OK, 1 row affected (0.00 sec)

root@galera4 [(none)] > \q
Bye
root@galera4:~# pt-table-sync --print --replicate percona.checksums -d test1 h=galera1,u=percona,p=percona
root@galera4:~#

Jaime Sicam (jssicam) wrote :

Tested on PXC 5.6.20 and pt-table-sync 2.2.11:

Created two PXC clusters where one node of a cluster is a asynchronous slave of one node on the other cluster

[root@localhost ~]# pt-table-sync --execute --sync-to-master h=192.168.1.97
Deadlock found when trying to get lock; try restarting transaction at line 6143 while doing employees.employees on 192.168.1.97

Error log on Master:
2014-11-14 04:40:29 3962 [Warning] WSREP: SQL statement was ineffective, THD: 11, buf: 667
QUERY: commit
 => Skipping replication

Changed in percona-toolkit:
status: New → Confirmed
tags: added: i61811
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers