pt-archiver --bulk-delete does not imply --commit-each as intended

Bug #1610259 reported by Steven
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
New
Undecided
Unassigned

Bug Description

pt-archiver version 2.2.15
MySQL version: Percona Server 5.6.28-76.1
Storage Engine: TokuDB

When using --bulk-delete and --limit options together, one would assume that each chunk of rows is committed, based on the documentation.

   From pt-archiver 2.2 documentation:

   --bulk-delete
   Delete each chunk with a single statement (implies --commit-each).
   ...

   --commit-each
   ...
   Commits transactions and flushes --file after each set of rows has been archived, before fetching
   the next set of rows, and before sleeping if --sleep is specified. Disables --txn-size;
   use --limit to control the transaction size with --commit-each.
   ...

However, unless you explicitly pass the --commit-each option on the command line in this scenario, pt-archiver is performing a commit on every single row in each chunk. Most use cases probably won't even notice the additional overhead caused by this, but when trying to purge hundreds of millions of rows each day, the added time for this commit overhead becomes significant.

To reproduce:
mysql> use debug;
mysql> create table a (id int unsigned not null primary key auto_increment) ENGINE=TokuDB;
mysql> insert into a values(); -- repeat 100 times
[user@device /]$ pt-archiver --purge --primary-key-only --bulk-delete --bulk-delete-limit --limit 100000 --progress 1000000 --no-check-charset --safe-auto-increment --set-vars wait_timeout=86400 --statistics --why-quit --source h=localhost,S=/path/to/percona.sock,D=debug,t=a,u=user,p=pass --where "id <= 100"

Started at 2016-08-05T08:36:02, ended at 2016-08-05T08:36:02
Source: D=debug,S=/path/to/percona.sock,h=localhost,p=...,t=a,u=user
SELECT 99
INSERT 0
DELETE 99
Action Count Time Pct
commit 100 0.0189 65.08
bulk_deleting 1 0.0030 10.20
select 2 0.0015 5.04
other 0 0.0057 19.68
Exiting because there are no more rows.

Now repopulate the table with 100 rows, and run pt-archiver again, but this time explicitly pass the --commit-each option on the command line:

[user@device /]$ pt-archiver --commit-each --purge --primary-key-only --bulk-delete --bulk-delete-limit --limit 100000 --progress 1000000 --no-check-charset --safe-auto-increment --set-vars wait_timeout=86400 --statistics --why-quit --source h=localhost,S=/path/to/percona.sock,D=debug,t=a,u=user,p=pass --where "id <= 100"

Started at 2016-08-05T08:37:31, ended at 2016-08-05T08:37:31
Source: D=debug,S=/path/to/percona.sock,h=localhost,p=...,t=a,u=user
SELECT 99
INSERT 0
DELETE 99
Action Count Time Pct
commit 2 0.0010 37.77
select 2 0.0006 22.92
bulk_deleting 1 0.0005 18.09
other 0 0.0005 21.22
Exiting because there are no more rows.

The reason seems to be related to the following code in sub main {}:
   if ( $bulk_del || $o->get('bulk-insert') ) {
      $o->set('commit-each', 1);
   }
There are multiple lines further down in sub main {} which then use the variable $commit_each to determine whether or not to commit:
      ...
      my $ac = !$txnsize && !$commit_each;
      ...
      commit($o) unless $commit_each;
      ...
   commit($o, $txnsize || $commit_each);

The problem is that $commit_each is still set to 0 because $o->set('commit-each', 1); is not setting the $commit_each variable to 1, it's setting the $o->get('commit-each') option to 1.

To fix this, I simply changed:
   if ( $bulk_del || $o->get('bulk-insert') ) {
      $o->set('commit-each', 1);
   }
..to now read:
   if ( $bulk_del || $o->get('bulk-insert') ) {
      $commit_each = 1;
   }

Now repopulate the table with 100 rows, and run pt-archiver again, but this time do NOT explicitly pass the --commit-each option on the command line:

[user@device /]$ pt-archiver --purge --primary-key-only --bulk-delete --bulk-delete-limit --limit 100000 --progress 1000000 --no-check-charset --safe-auto-increment --set-vars wait_timeout=86400 --statistics --why-quit --source h=localhost,S=/path/to/percona.sock,D=debug,t=a,u=user,p=pass --where "id <= 100"

Started at 2016-08-05T08:39:26, ended at 2016-08-05T08:39:26
Source: D=debug,S=/path/to/percona.sock,h=localhost,p=...,t=a,u=user
SELECT 99
INSERT 0
DELETE 99
Action Count Time Pct
commit 2 0.0007 30.59
select 2 0.0005 20.01
bulk_deleting 1 0.0004 17.05
other 0 0.0007 32.35
Exiting because there are no more rows.

SUCCESS!

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1377

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.