Percona Toolkit

Comment 8 for bug 978432

Daniel Nichter (daniel-nichter) wrote :

The problem was more than the tool just ignoring the primary key, it was related to the row estimate (which, yes, seems unrelated). The tool does "EXPLAIN SELECT * FROM table WHERE 1=1" to get the row estimate and also the MySQL-chosen index for that query, but that query is not representative of the actual nibbling query (the query that ascends the table, not the one that selects the rows to checksum), so in cases like this, what's a good index for "SELECT * FROM table WHERE 1=1" may not be a good index for the more complex nibbling query.

So the new code does EXPLAIN for the actual nibbling query and if MySQL chooses a different index than the tool originally chose, then the MySQL index is used because MySQL should know best. I know the two indexes (the tool's and MySQL's) can differ when there's a --where clause that the tool can't account for but MySQL can. They will probably also differ in more complex scenarios with various MySQL optimizer tricks that the tool can't account for.

The original pt-table-checksum did this--used MySQL's chosen index--and the new pt-table-checksum does too but in a simplistic manner which doesn't work in cases like this where the PK is not always MySQL's preferred choice.