pt-table-checksum issued a full table checksum ,and caused slave delay
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Toolkit moved to https://jira.percona.com/projects/PT |
Expired
|
Undecided
|
Unassigned |
Bug Description
mysql master/slave version:
mysql> select version();
+-----------------+
| version() |
+-----------------+
| 5.5.52-38.3-log |
+-----------------+
1 row in set (0.00 sec)
pt-table-checksum version:
pt-table-checksum --version
pt-table-checksum 2.2.17
executed command:
pt-table-checksum --nocheck-
description:
I issued a command ,listed above ,to do master/slave checksum.It worked well at first and no slave delay.But after a while ,I found a big replication delay in slave.By executing show full processlist in slave ,I saw the sql_thread was running:
REPLACE INTO `test_checksum`
It was a full table checksum ,which lasted for about 1200 seconds, and the table mmm_tab sizes 100G,and the lag continued to increase.
I also found this statement in master's error log:
170612 17:16:33 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by th
e SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO `test_checksum`
x, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'xxx', 'mmm_tab', '234', 'PRIMARY', '48046052', '48269634', COUNT(*) AS cnt, COALESCE(
id`, `uid`, `type`, `data`, `flag`, `status`, `create_time`, `update_time`)) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `xxx`.`mmm_tab` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= '48046052')) AND ((`id
` <= '48269634')) /*checksum chunk*/
170612 17:16:34 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by th
e SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO `test_checksum`
x, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'xxx', 'mmm_tab', '235', 'PRIMARY', '48269635', '48505239', COUNT(*) AS cnt, COALESCE(
id`, `uid`, `type`, `data`, `flag`, `status`, `create_time`, `update_time`)) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `xxx`.`mmm_tab` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= '48269635')) AND ((`id
` <= '48505239')) /*checksum chunk*/
170612 17:16:39 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by th
e SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO `test_checksum`
x, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'xxx', 'mmm_tab', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(
`data`, `flag`, `status`, `create_time`, `update_time`)) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `xxx`.`mmm_tab` /*checksum table*/
170612 17:16:40 [Warning] Aborted connection 2832926 to db: 'xxx' user: 'dba_checksum' host: 'xx.xx.xx.xx' (Got an error reading communication packets)
170612 17:16:41 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by th
e SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO `test_checksum`
x, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'xxx', 'mmm_tab', '236', 'PRIMARY', '48505240', '48746455', COUNT(*) AS cnt, COALESCE(
id`, `uid`, `type`, `data`, `flag`, `status`, `create_time`, `update_time`)) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `xxx`.`mmm_tab` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= '48505240')) AND ((`id
` <= '48746455')) /*checksum chunk*/
It seemed that,normally pt-table-checksum would issue sql statement by force index(primary) where id xxx and xxx, but in this case, it issued a full table checksum.
in the error log,we can see : pt-table-checksum issued a full table checksum without a where condition between mmm_tab chunk 235 and 236.