Add feature forcing DELETE by primary key to pt-archiver

Bug #1452902 reported by Sveta Smirnova
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Currently pt-archiver resolves the WHERE clause twice: first when it runs the SELECT to retrieve rows for bulk insert, and again when it runs the DELETE.

For example, for my test run general query log contains:

SELECT /*!40001 SQL_NO_CACHE */ `id`,`f1`,`f2`,`f3`,`f4`,`f5`,`f6`,`f7`,`f8`,`f9`,`f10`,`f11`,`f12`,`f13`,`f14`,`f15`,`f16`,`f17`,`f18` FROM `test`.`attachments` FORCE INDEX(`PRIMARY`) WHERE (f13='8d5e957f297893487bd98fa830fa6413') ORDER BY `id` LIMIT 1000;

then

DELETE QUICK FROM `test`.`attachments` WHERE (((`id` >= '31002'))) AND (((`id` <= '144987'))) AND (f13='8d5e957f297893487bd98fa830fa6413') LIMIT 1000;

But this is inefficient for certain tables and conditions, and it can be avoided if the mysql_use_result option is used and the DELETE is done via PRIMARY KEY access only.

Suggested fix: implement an option that asks pt-archiver to force the DELETE by PRIMARY KEY.
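As a rough sketch of the suggested behavior (hypothetical: pt-archiver has no such mode in the version discussed here, and the function name is illustrative), the tool could collect the primary-key values from the SELECT phase and issue the DELETE against them directly, instead of re-evaluating the original WHERE clause:

```python
# Hypothetical sketch in Python (the tool itself is Perl): build a DELETE
# that identifies rows strictly by PRIMARY KEY value, using rows already
# fetched by the archiving SELECT. Names are illustrative, not pt-archiver API.

def build_pk_delete(table, pk_col, rows):
    """Return a DELETE statement targeting the given rows by primary key only."""
    ids = ", ".join(str(r[pk_col]) for r in rows)
    return f"DELETE QUICK FROM {table} WHERE {pk_col} IN ({ids})"

# Rows as the SELECT phase might return them (only the PK value matters here).
batch = [{"id": 31002}, {"id": 31003}, {"id": 31004}]
print(build_pk_delete("`test`.`attachments`", "id", batch))
# -> DELETE QUICK FROM `test`.`attachments` WHERE id IN (31002, 31003, 31004)
```

This avoids repeating the (potentially expensive) secondary-index condition in the DELETE; a real implementation would also need to quote values and handle composite primary keys.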

Revision history for this message
Sveta Smirnova (svetasmirnova) wrote :

Also, using mysql_use_result makes sorting with ORDER BY redundant and the archiving process more efficient.

Sveta Smirnova (svetasmirnova) wrote :

Version 2.2.14

Changed in percona-toolkit:
status: New → Confirmed
tags: added: i59430
Regev Brody (regevbr) wrote :

We have the same issue: we want the ascending optimization to use a non-primary-key index, but use the primary key to delete (which is the obvious choice when a primary key exists).
The result of such an operation is that the command ignores the --columns flag, retrieves all the fields, and uses them all in the DELETE operation. MySQL will still use the primary key to delete, but the overhead of retrieving all the fields and sending them back in the DELETE operation is redundant.

Example of issue:

CREATE TABLE job_entity
    (
    pk BINARY(16) NOT NULL,
    pool_master SMALLINT UNSIGNED NOT NULL,
    full_id_iteration VARCHAR(50) NOT NULL,
    content MEDIUMBLOB NOT NULL,
    run_time MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
    start_time BIGINT UNSIGNED NOT NULL,
    finish_time BIGINT UNSIGNED NOT NULL DEFAULT 0,
    status TINYINT UNSIGNED NOT NULL,
    modified BIGINT UNSIGNED NOT NULL,
    exit_status SMALLINT,
    PRIMARY KEY (pk, pool_master))
ENGINE=InnoDB;

CREATE INDEX finish_time_index ON job_entity (finish_time) USING BTREE;
CREATE INDEX modified_index ON job_entity (modified) USING BTREE;
CREATE INDEX queries_index ON job_entity (pool_master,status,start_time) USING BTREE;
CREATE UNIQUE INDEX job_index ON job_entity (pool_master,full_id_iteration) USING BTREE;

##pt-archiver --host <hostname> --port <port> --user <user> --password <pass> --dry-run --progress 1 --statistics --noversion-check --purge --where 'pool_master=1 and finish_time > 0 and finish_time < 1000' --source D=db,t=job_entity,i=finish_time_index --nocheck-charset --ascend-first --columns 'pool_master,finish_time,full_id_iteration'

SELECT /*!40001 SQL_NO_CACHE */ `pool_master`,`finish_time`,`full_id_iteration`,`pk`,`content`,`run_time`,`start_time`,`status`,`modified`,`exit_status` FROM `nbodataservice`.`job_entity` FORCE INDEX(`finish_time_index`) WHERE (pool_master=1 and finish_time > 0 and finish_time < 1000) ORDER BY `finish_time` LIMIT 1
SELECT /*!40001 SQL_NO_CACHE */ `pool_master`,`finish_time`,`full_id_iteration`,`pk`,`content`,`run_time`,`start_time`,`status`,`modified`,`exit_status` FROM `nbodataservice`.`job_entity` FORCE INDEX(`finish_time_index`) WHERE (pool_master=1 and finish_time > 0 and finish_time < 1000) AND ((`finish_time` >= ?)) ORDER BY `finish_time` LIMIT 1
DELETE FROM `nbodataservice`.`job_entity` WHERE (`pk` = ? AND `pool_master` = ? AND ((? IS NULL AND `full_id_iteration` IS NULL) OR (`full_id_iteration` = ?)) AND `content` = ? AND `run_time` = ? AND `start_time` = ? AND `finish_time` = ? AND `status` = ? AND `modified` = ? AND ((? IS NULL AND `exit_status` IS NULL) OR (`exit_status` = ?))) LIMIT 1

Regev Brody (regevbr) wrote :

Looking at line 3221:

   my $index = $tp->find_best_index($tbl, $args{index});
   die "Cannot find an ascendable index in table" unless $index;

   if ( $index && $tbl->{keys}->{$index}->{is_unique}) {
      @del_cols = @{$tbl->{keys}->{$index}->{cols}};
   }
   else {
      @del_cols = @{$tbl->{cols}};
   }
   PTDEBUG && _d('Columns needed for DELETE:', join(', ', @del_cols));

I think the solution is: if the index is not unique, use the primary key (if one exists) as the deletion columns.
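That proposed fallback can be illustrated as follows (a sketch in Python rather than the tool's Perl, with a simplified table structure; the key/column layout mimics the hashref shapes in the snippet above but is not pt-archiver's actual API):

```python
def delete_columns(tbl, index):
    """Pick the columns used to identify rows for DELETE.

    Simplified model of the logic around pt-archiver line 3221:
    a unique ascend index can identify rows by itself; otherwise,
    per the proposal, prefer the PRIMARY KEY over every column.
    """
    keys = tbl["keys"]
    if index and keys[index]["is_unique"]:
        return keys[index]["cols"]
    # Proposed fallback: use the primary key when the ascend index is not unique.
    if "PRIMARY" in keys:
        return keys["PRIMARY"]["cols"]
    return tbl["cols"]  # last resort: all columns (the current behavior)

# Modeled on the job_entity example above.
tbl = {
    "cols": ["pk", "pool_master", "finish_time", "content"],
    "keys": {
        "PRIMARY": {"is_unique": True, "cols": ["pk", "pool_master"]},
        "finish_time_index": {"is_unique": False, "cols": ["finish_time"]},
    },
}
print(delete_columns(tbl, "finish_time_index"))  # -> ['pk', 'pool_master']
```

With this selection, the DELETE for the job_entity example would match on (pk, pool_master) only, instead of comparing every column including the MEDIUMBLOB content.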

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1281
