pt-archiver deletes data despite --dry-run

Bug #1199589 reported by Jervin R on 2013-07-10
This bug affects 4 people
Affects: Percona Toolkit (status tracked in 2.2)
2.1: importance Critical, assigned to Daniel Nichter
2.2: importance Critical, assigned to Daniel Nichter

Bug Description

In the last of the three pt-archiver invocations below, --dry-run --purge sits in the middle of the other options (immediately after --optimize), and --dry-run was not honored: the rows were actually deleted.

[revin@forge ~]$ pt-archiver --version
pt-archiver 2.2.3

[revin@forge ~]$ pt-archiver --purge --dry-run --set-vars=innodb_lock_wait_timeout=5 --txn-size=1000 --statistics --bulk-delete --limit=1000 --optimize --primary-key-only --source=h=localhost,S=/tmp/mysql_sandbox55320.sock,u=msandbox,p=msandbox,D=test,t=ft_history --where='h_id > 900000'
SELECT /*!40001 SQL_NO_CACHE */ `h_id`,`u_id`,`cn_id`,`f_id`,`h_date`,`h_ip`,`h_agent` FROM `test`.`ft_history` FORCE INDEX(`PRIMARY`) WHERE (h_id > 900000) AND (`h_id` < '1991667') LIMIT 1000
SELECT /*!40001 SQL_NO_CACHE */ `h_id`,`u_id`,`cn_id`,`f_id`,`h_date`,`h_ip`,`h_agent` FROM `test`.`ft_history` FORCE INDEX(`PRIMARY`) WHERE (h_id > 900000) AND (`h_id` < '1991667') AND ((`h_id` >= ?)) LIMIT 1000
DELETE FROM `test`.`ft_history` WHERE (((`h_id` >= ?))) AND (((`h_id` <= ?))) AND (h_id > 900000) LIMIT 1000

[revin@forge ~]$ pt-archiver --set-vars=innodb_lock_wait_timeout=5 --txn-size=1000 --statistics --bulk-delete --limit=1000 --optimize --primary-key-only --purge --dry-run --source=h=localhost,S=/tmp/mysql_sandbox55320.sock,u=msandbox,p=msandbox,D=test,t=ft_history --where='h_id > 950000'
SELECT /*!40001 SQL_NO_CACHE */ `h_id`,`u_id`,`cn_id`,`f_id`,`h_date`,`h_ip`,`h_agent` FROM `test`.`ft_history` FORCE INDEX(`PRIMARY`) WHERE (h_id > 950000) AND (`h_id` < '1991667') LIMIT 1000
SELECT /*!40001 SQL_NO_CACHE */ `h_id`,`u_id`,`cn_id`,`f_id`,`h_date`,`h_ip`,`h_agent` FROM `test`.`ft_history` FORCE INDEX(`PRIMARY`) WHERE (h_id > 950000) AND (`h_id` < '1991667') AND ((`h_id` >= ?)) LIMIT 1000
DELETE FROM `test`.`ft_history` WHERE (((`h_id` >= ?))) AND (((`h_id` <= ?))) AND (h_id > 950000) LIMIT 1000

[revin@forge ~]$ pt-archiver --set-vars=innodb_lock_wait_timeout=5 --txn-size=1000 --statistics --bulk-delete --limit=1000 --optimize --dry-run --purge --primary-key-only --source=h=localhost,S=/tmp/mysql_sandbox55320.sock,u=msandbox,p=msandbox,D=test,t=ft_history --where='h_id > 900000'
Started at 2013-07-09T22:12:30, ended at 2013-07-09T22:12:31
Source: D=test,S=/tmp/mysql_sandbox55320.sock,h=localhost,p=...,t=ft_history,u=msandbox
SELECT 19385
INSERT 0
DELETE 19385
Action Count Time Pct
bulk_deleting 20 1.0916 70.25
commit 20 0.1457 9.38
select 21 0.0165 1.06
other 0 0.3000 19.31

Related branches

lp:~percona-toolkit-dev/percona-toolkit/fix-pt-archiver-dry-run-bug-1199589
Daniel Nichter: Approve on 2013-08-13
lp:~percona-toolkit-dev/percona-toolkit/fix-option-parser-bug-1199589-2.1
Daniel Nichter: Approve on 2013-08-14
tags: added: risk
Changed in percona-toolkit:
status: New → Confirmed
Changed in percona-toolkit:
milestone: none → 2.2.5
importance: Undecided → Critical
summary: - pt-archiver Does not honor --dry-run if its not on the first 2 or last 2
- options before --source
+ pt-archiver deletes data despite --dry-run
Changed in percona-toolkit:
status: Confirmed → In Progress
assignee: nobody → Daniel Nichter (daniel-nichter)
Daniel Nichter (daniel-nichter) wrote :

One problem is that --optimize is not being used correctly: it takes an argument, d, s, or ds (see --analyze). The real problem is that --optimize is consuming the next option, which is --dry-run in this case. This shouldn't happen: the option parser is failing to notice that --dry-run is an option rather than the string value for --optimize. It should catch this, and the tool should fail to start with an error like "--optimize requires a value".

Daniel Nichter (daniel-nichter) wrote :

Every string type opt in every tool is broken all the way back to Maatkit. For example, if --foo takes a string, and the command line is --foo --bar, then "--bar" becomes the value of --foo, so it's like --bar was never specified.
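The swallowing behavior described above can be illustrated with a toy parser. This is only a sketch in Python, not Percona Toolkit's actual option parser (which is written in Perl); the function name parse, the strict switch, and the two option sets are hypothetical and exist only for this illustration.

```python
# Toy model of the parsing flaw; NOT Percona Toolkit's real option parser.
# STRING_OPTS / FLAG_OPTS are illustrative assumptions, not the tool's full option list.
STRING_OPTS = {"--optimize"}            # options that take a string value
FLAG_OPTS = {"--dry-run", "--purge"}    # boolean options


def parse(argv, strict):
    """Return {option: value}.

    With strict=False, whatever token follows a string option becomes its
    value, even if that token is itself an option (the buggy behavior).
    With strict=True, an option-looking value is rejected.
    """
    opts = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in STRING_OPTS:
            value = argv[i + 1] if i + 1 < len(argv) else None
            if value is None or (strict and value.startswith("--")):
                raise ValueError(f"{arg} requires a value")
            opts[arg] = value
            i += 2          # the next token was consumed as the value
        elif arg in FLAG_OPTS:
            opts[arg] = True
            i += 1
        else:
            raise ValueError(f"unknown option {arg}")
    return opts
```

With strict=False, parse(["--optimize", "--dry-run", "--purge"], strict=False) returns a result in which "--dry-run" is the value of "--optimize" and never appears as an option of its own, so only --purge takes effect. With strict=True the same input raises "--optimize requires a value" instead of running.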

This probably explains many heisenbugs over the years. It has not made every tool fall apart, most likely because options used as values to other options are usually invalid input. For example, something like "--input-file --foo" would probably fail with an error like "File --foo does not exist". But as this bug shows, not all options have good input validation, and the consequence can be bad if the wrong option (like --dry-run) is silently consumed.

This will be fixed in 2.2.5 and backported to 2.1.11 (so yes, there will be another 2.1 release).

Knowing this may help you detect such heisenbugs in the future, i.e. if the tool seems to be running oddly, not doing the right thing (when you're sure it should), then double-check that all string opts were actually given values.
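That double-check can be done mechanically before running a command. The following is a rough sketch, assuming you supply the set of string-type options yourself (the helper name swallowed_options is hypothetical, and the tool does not provide such a list in this form). It only flags the "--foo --bar" pattern and would also flag a legitimate value that happens to start with "--".

```python
import shlex


def swallowed_options(command, string_opts):
    """Return (option, next_token) pairs where a string option written
    without '=' is directly followed by another '--option' token."""
    tokens = shlex.split(command)   # split the way a POSIX shell would
    hits = []
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur in string_opts and nxt.startswith("--"):
            hits.append((cur, nxt))
    return hits


# Shortened form of the failing command from this report.
cmd = ("pt-archiver --limit=1000 --optimize --dry-run --purge "
       "--primary-key-only --where='h_id > 900000'")
print(swallowed_options(cmd, {"--optimize"}))
```

For the command above this reports the pair ("--optimize", "--dry-run"), which is exactly the combination that caused rows to be deleted in this bug.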

tags: added: all-tools option-parsing
removed: pt-archiver