pt-table-checksum: recursion method default is not correct for clusters
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Percona Toolkit moved to https://jira.percona.com/projects/PT | Fix Released | Undecided | Unassigned | |
Bug Description
The documentation says that the default recursion method is "processlist,
The "cluster" recursion method also seems to be used.
This can be tested by running pt-table-checksum on:
- a 3 node PXC setup which uses MySQL Replication to replicate to another 3 node PXC setup.
- Then run pt-table-checksum against the node which has the master role in the MySQL Replication setup.
Expected result:
pt-table-checksum detects "regular" mysql replication with 1 master and 1 slave.
Actual result:
Cluster setup is detected
"xxx is a cluster node but no other nodes or regular replicas were found. Use --recursion-
According to PT_DEBUG output it does connect to the slave.
I can't upload PT_DEBUG output.
description: | updated |
tags: | added: pt-table-checksum pxc slave-recursion |
Changed in percona-toolkit: | |
status: | New → Confirmed |
milestone: | none → 2.2.3 |
Changed in percona-toolkit: | |
importance: | Undecided → Medium |
Changed in percona-toolkit: | |
milestone: | 2.2.4 → none |
assignee: | Daniel Nichter (daniel-nichter) → nobody |
importance: | Medium → Undecided |
status: | In Progress → Incomplete |
Daniel, given that message, the slave (i.e. the other cluster) was not found; if it had been, it would work like this:
$ ./pt-table-checksum h=127.1,P=12345,u=msandbox,p=msandbox -d mysql
Not checking replica lag on lucid32 because it is a cluster node.
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
06-27T08:44:16 0 0 0 1 0 0.353 mysql.columns_priv
...
It might also die saying "these nodes are in another cluster:" if the slave's cluster name isn't the same as the master's cluster name. Since you can't send PTDEBUG output, could you double-check it with something like "PTDEBUG=1 ... 2>&1 | grep MasterSlave" and then look for lines like:
# MasterSlave:5086 8584 Found 1 slaves
# MasterSlave:5063 8584 Recursing from P=12345,h=127.1,p=...,u=msandbox to P=2900,h=127.0.0.1,p=...,u=msandbox
# MasterSlave:5004 8584 Port number is non-standard; using only hosts method
# MasterSlave:5020 8584 Recursion methods: hosts
# MasterSlave:5030 8584 Connected to P=2900,h=127.0.0.1,p=...,u=msandbox
# MasterSlave:5039 8584 SELECT @@SERVER_ID
# MasterSlave:5041 8584 Working on server ID 2900
# MasterSlave:4973 8584 Found slave: P=2900,h=127.0.0.1,p=...,u=msandbox
Port 12345 is my master (1st cluster), port 2900 is the slave (2nd cluster).
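
If the full PTDEBUG output can't be shared, a small filter can reduce it to the two facts that matter here: which recursion methods were used and which slaves were found. The following is only a rough sketch in Python. The line layout (module:line, PID, message) is assumed from the sample MasterSlave lines quoted above, and the function name summarize_masterslave is made up for illustration; it is not part of Percona Toolkit.

```python
import re

# Matches lines such as "# MasterSlave:5020 8584 Recursion methods: hosts".
# The layout (module:line, PID, message) is an assumption based on the
# sample debug lines in this report, not a documented format.
DEBUG_LINE = re.compile(r"^# MasterSlave:(\d+) (\d+) (.*)$")


def summarize_masterslave(lines):
    """Collect recursion methods and discovered slaves from PTDEBUG lines."""
    summary = {"methods": [], "slaves": []}
    for line in lines:
        match = DEBUG_LINE.match(line.strip())
        if not match:
            continue  # not a MasterSlave debug line
        message = match.group(3)
        if message.startswith("Recursion methods:"):
            methods = message.split(":", 1)[1].split(",")
            summary["methods"] = [m.strip() for m in methods if m.strip()]
        elif message.startswith("Found slave:"):
            # Keep the DSN text exactly as logged, even if it is truncated.
            summary["slaves"].append(message.split(":", 1)[1].strip())
    return summary


# Sample lines copied from the report (DSNs left truncated as posted).
sample = [
    "# MasterSlave:5004 8584 Port number is non-standard; using only hosts method",
    "# MasterSlave:5020 8584 Recursion methods: hosts",
    "# MasterSlave:5041 8584 Working on server ID 2900",
    "# MasterSlave:4973 8584 Found slave: P=2900,",
]
print(summarize_masterslave(sample))
```

An empty "slaves" list would suggest that recursion never reached the second cluster, while a "methods" list containing "cluster" would support the behaviour described in the bug report.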