pt-archiver causing statements to appear non-deterministic

Bug #1724702 reported by Brad
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Won't Fix
Medium
Unassigned

Bug Description

When using binlog_format=MIXED the pt-archiver's delete statements are being saved as ROW format. This can be problematic so I attempted to force statement format using --set-vars binlog_format='STATEMENT', however the commands logged do not appear to be deterministic.

I am using pt-archiver 3.0.4:

I created and filled a table with:
CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO joinit VALUES (NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )));
INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit;

The pt-archiver I test with was:

pt-archiver --source h=localhost,D=percona,t=joinit,P=3306 --where "i<500" --bulk-delete --limit=100 --progress=1000 --no-check-charset --why-quit --sleep-coef=0.2 --sentinel=/tmp/guest_line_item.archiver.stop --set-vars binlog_format='STATEMENT' --purge

The binlog entries were similar to:
# at 499
#171018 12:23:11 server id 1 end_log_pos 679 CRC32 0x2f3266cd Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1508343791/*!*/;
DELETE FROM `percona`.`joinit` WHERE (((`i` >= '158'))) AND (((`i` <= '320'))) AND (i<500) LIMIT 100

Assuming the limit 100 is necessary would mean this could cause different results on the slave than a master, especially since it does not include an order by clause. If the limit 100 is redundant, which is my assumption, then the limit should not be added.

Tags: i209125
Revision history for this message
Brad (bradley.mickel) wrote :

Pt-archiver attempts to make the statements deterministic, however there are situations where this is not feasible.

suppose you have this table:
CREATE SCHEMA IF NOT EXISTS `test`;
USE `test`;
DROP TABLE IF EXISTS `t1`;

CREATE TABLE `test`.`t1` (
    t1id INT(11) NOT NULL,
    t2id INT(11) NOT NULL,
    t3id INT(11) NOT NULL,
    description varchar(30),
    KEY idx1 (t1id, t2id , t3id)
) ENGINE=InnoDB;

INSERT INTO `test`.`t1` VALUES (1,1,1,"a"), (1,1,2,"b"), (1,1,3,"c");

pt_archiver:6338 15477 Index for DELETE: idx1
# TableNibbler:3098 15477 Will ascend index idx1
# TableNibbler:3110 15477 Will ascend columns t1id, t2id, t3id
# TableNibbler:3121 15477 Will ascend, in ordinal position: 0, 1, 2

pt-archiver will use the composite index for the deletes, however it is not possible to calculate the exact range.

In the example provided the primary key is a single int field, and so pt-archiver is able to identify the appropriate range, and the limit clause is redundant. There is an option to disable the limit clause, however it should be used when you are absolutely certain that the statements will be deterministic.

from the docs:
This is an advanced option and you should not disable it unless you know what you are doing and why! By default, --bulk-delete appends a --limit clause to the bulk delete SQL statement. In certain cases, this clause can be omitted by specifying --no-bulk-delete-limit. --limit must still be specified.

tags: added: i209125
Changed in percona-toolkit:
importance: Undecided → Medium
status: New → Triaged
status: Triaged → Won't Fix
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-739

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.