pt-table-checksum ignores its default and explicit --recursion-method

Bug #953141 reported by Fernando Ipar
This bug affects 2 people
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Fix Released
Importance: Medium
Assigned to: Brian Fraser

Bug Description

Here is the invocation:

./pt-table-checksum --ask-pass --lock-wait-time=120 u=<user>

The servers are 5.0 (5.0.84sp1-enterprise-gpl-log)

The master's slaves are discovered properly, but when it tries to discover the slaves of the first found slave, it attempts to run SHOW SLAVE HOSTS (some output edited to remove user names and IP addresses):

# MasterSlave:2559 8866 Looking for slaves on h=localhost,p=...,u=<user> using methods processlist hosts
# MasterSlave:2566 8866 Finding slaves with _find_slaves_by_processlist
# VersionParser:1677 8866 5.0.84sp1-enterprise-gpl-log parses to 005000084
# VersionParser:1677 8866 4.1.2 parses to 004001002
# VersionParser:1688 8866 005000084 ge 4.1.2 : 1
# MasterSlave:2633 8866 DBI::db=HASH(0x12d98a50) SHOW GRANTS FOR CURRENT_USER()
# MasterSlave:2663 8866 DBI::db=HASH(0x12d98a50) SHOW PROCESSLIST
# DSNParser:78 8866 Parsing h=<Slave84>
# DSNParser:96 8866 Finding value for S
# DSNParser:106 8866 Copying value for S from defaults
# DSNParser:96 8866 Finding value for F
# DSNParser:106 8866 Copying value for F from defaults
# DSNParser:96 8866 Finding value for A
# DSNParser:106 8866 Copying value for A from defaults
# DSNParser:96 8866 Finding value for P
# DSNParser:106 8866 Copying value for P from defaults
# DSNParser:96 8866 Finding value for p
# DSNParser:102 8866 Copying value for p from previous DSN
# DSNParser:96 8866 Finding value for u
# DSNParser:102 8866 Copying value for u from previous DSN
# DSNParser:96 8866 Finding value for h
# DSNParser:96 8866 Finding value for D
# DSNParser:106 8866 Copying value for D from defaults
# DSNParser:96 8866 Finding value for t
...
(more output suppressed)
...
# MasterSlave:2571 8866 Found 6 slaves
# MasterSlave:2537 8866 Recursing from h=localhost,p=...,u=<user> to h=<Slave84>,p=...,u=<user>
...
(more output suppressed)
...
# MasterSlave:2559 8866 Looking for slaves on h=<Slave84>,p=...,u=<user> using methods processlist hosts
# MasterSlave:2566 8866 Finding slaves with _find_slaves_by_processlist
# VersionParser:1677 8866 5.0.84sp1-enterprise-gpl-log parses to 005000084
# VersionParser:1677 8866 4.1.2 parses to 004001002
# VersionParser:1688 8866 005000084 ge 4.1.2 : 1
# MasterSlave:2633 8866 DBI::db=HASH(0x129db960) SHOW GRANTS FOR CURRENT_USER()
# MasterSlave:2663 8866 DBI::db=HASH(0x129db960) SHOW PROCESSLIST
# MasterSlave:2566 8866 Finding slaves with _find_slaves_by_hosts
# MasterSlave:2600 8866 DBI::db=HASH(0x129db960) SHOW SLAVE HOSTS
DBD::mysql::db selectall_arrayref failed: Access denied; you need the REPLICATION SLAVE privilege for this operation [for Statement "SHOW SLAVE HOSTS"] at ./pt-table-checksum line 2601, <STDIN> line 1.

Fernando Ipar (fipar) wrote :

The same thing happens if --recursion-method=processlist is set explicitly.

Baron Schwartz (baron-xaprb) wrote :

For reference, this is related to Percona customer issue 22055.

tags: added: percona-22055 pt-table-checksum slave-recursion
Changed in percona-toolkit:
status: New → Triaged
summary: - pt-table-checksum uses SHOW SLAVE HOSTS to discover slaves of slaves,
- even when --recursion-method is left at its default of processlist
+ pt-table-checksum ignores its default and explicit --recursion-method

Daniel Nichter (daniel-nichter) wrote :

--recursion-method is magical; see MasterSlave::find_slave_hosts(). It tries multiple methods. The docs for tools say that "processlist" is the default: "The processlist method is the default, because SHOW SLAVE HOSTS is not reliable." That's not entirely true: the tool tries both, it just tries processlist first. If the method is set explicitly via the command line option, it should only try the given method.

tags: added: all-tools
removed: pt-table-checksum
Brian Fraser (fraserbn)
Changed in percona-toolkit:
assignee: nobody → Brian Fraser (fraserbn)
importance: Undecided → Medium

Brian Fraser (fraserbn) wrote :

We are still working out how best to fix this, but getting the behavior that Fernando needed took a surprisingly small change:

=== modified file 'bin/pt-table-checksum'
--- bin/pt-table-checksum 2012-03-02 15:51:28 +0000
+++ bin/pt-table-checksum 2012-03-19 17:30:07 +0000
@@ -2549,8 +2549,7 @@

    my @methods = qw(processlist hosts);
    if ( $method ) {
-      @methods = grep { $_ ne $method } @methods;
-      unshift @methods, $method;
+      @methods = $method;
    }
    else {
       if ( ($dsn->{P} || 3306) != 3306 ) {

To expand a bit on that: currently, if you use '--recursion-method=processlist' or '=hosts', it won't disable the other method; it'll just rearrange the order so that whichever was passed in is tried first. The problem is that, if the method you passed in doesn't return any slaves, the tool falls back to the other one, which you might not have the privileges to use.
The only big drawback to fixing this is that there would no longer be a way to explicitly request the current behavior through --recursion-method.
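
The reordering described above can be reproduced in isolation. The following is only a sketch: the helper names methods_before_patch and methods_after_patch are hypothetical and do not exist in the tool, and the port-based else branch from the real code is left out. It compares the list of methods that would be tried before and after the one-line patch.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the pre-patch logic: the requested
# method is moved to the front, but the other one is still tried.
sub methods_before_patch {
    my ($method) = @_;
    my @methods = qw(processlist hosts);
    if ( $method ) {
        @methods = grep { $_ ne $method } @methods;
        unshift @methods, $method;
    }
    return @methods;
}

# Hypothetical helper mirroring the patched logic: an explicitly
# requested method becomes the only one tried.
sub methods_after_patch {
    my ($method) = @_;
    my @methods = qw(processlist hosts);
    @methods = ($method) if $method;
    return @methods;
}

print join(' ', methods_before_patch('processlist')), "\n";  # processlist hosts
print join(' ', methods_after_patch('processlist')), "\n";   # processlist
```

With the pre-patch logic, 'hosts' stays in the list even when 'processlist' is requested, which is how SHOW SLAVE HOSTS ends up being run on a slave where the processlist method finds nothing.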

Baron Schwartz (baron-xaprb) wrote :

The right solution to this is probably to make --recursion-method an array and set its default value to 'processlist,hosts'. Or will that introduce other subtleties?

Baron Schwartz (baron-xaprb) wrote :

toolkit $ pt-table-checksum --help | grep recursion
  --recursion-method=s Preferred recursion method for discovering
  --recursion-method (No value)

The default value is hard-coded rather than declared the normal way in the tool's docs, which is why --help shows "(No value)". Brian and I think that an array will fix this without a lot of risk, but he'll look into that further.
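
To make the array idea concrete, here is a minimal sketch of how a comma-separated value could be turned into the list of methods to try. It is not the released fix: parse_recursion_methods and @KNOWN_METHODS are hypothetical names, and the default string 'processlist,hosts' is assumed from the proposal above and from the method names in the debug output.

```perl
use strict;
use warnings;

# Assumed set of valid method names, taken from the debug output
# ("using methods processlist hosts").
my @KNOWN_METHODS = qw(processlist hosts);

# Hypothetical parser: an undefined or empty value falls back to the
# proposed default; otherwise only the listed methods are returned,
# in the order given.
sub parse_recursion_methods {
    my ($value) = @_;
    $value = 'processlist,hosts' unless defined $value && length $value;
    $value =~ s/^\s+|\s+$//g;
    my @methods = split /\s*,\s*/, $value;
    for my $m (@methods) {
        die "Unknown recursion method: $m\n"
            unless grep { $_ eq $m } @KNOWN_METHODS;
    }
    return @methods;
}

print join(' ', parse_recursion_methods()), "\n";               # processlist hosts
print join(' ', parse_recursion_methods('processlist')), "\n";  # processlist
```

Under this scheme, passing only 'processlist' would restrict discovery to that method, while passing 'processlist,hosts' would still let someone request the old try-both behavior explicitly.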

Changed in percona-toolkit:
milestone: none → 2.1.3
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: Triaged → In Progress
Changed in percona-toolkit:
status: In Progress → Fix Committed
Changed in percona-toolkit:
status: Fix Committed → Fix Released

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-498
