pt-table-sync does not detect data difference using CRC32 hash
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona Toolkit moved to https://jira.percona.com/projects/PT | Triaged | Undecided | Unassigned |
Bug Description
While syncing a slave to its master, I noticed that some sets of rows were not getting synced. A table on the slave had been loaded incorrectly, so its timestamp field differed from the master's by 7 hours. I ran the following command to do the initial sync:
pt-table-sync --execute h=dbslave01,
The tool reported that it had replaced about half of the rows to fix the time discrepancies, but not all of them. While troubleshooting, I found an anomaly in the checksum calculated by this query, which pt-table-sync runs:
SELECT
/*zappos.
8923 AS chunk_num, COUNT(*) AS cnt, COALESCE(
FROM `zappos`
WHERE (`style_image_id` >= '1429'
AND `style_image_id` < '1435'
) FOR UPDATE ;
Whenever this query runs against an even number of rows that all have the same time error, it calculates the same checksum value on the master and on the slave, so the difference goes undetected. Below is a sample of the data I'm comparing, with the CRC32 value for each row and the cumulative XOR value. As you can see, every second row produces the same cumulative XOR on both servers. Note that other columns also went into the CRC calculation, but I did not include them here since they are verified to be identical on master and slave.
Data from master:
style_image_id | updated_at | CRC32 | XOR
---|---|---|---
1429 | 10/19/2011 9:41 | 3952427013 |
1430 | 10/19/2011 9:41 | 407848744 | 4091152173
1431 | 10/19/2011 9:41 | 2677520393 | 1817034532
1432 | 10/19/2011 9:41 | 2468571109 | 4285454529
1433 | 10/19/2011 9:41 | 1208292545 | 3077295104
1434 | 10/19/2011 9:41 | 3915229246 | 1580622910
Data from slave:
style_image_id | updated_at | CRC32 | XOR
---|---|---|---
1429 | 10/19/2011 16:41 | 2727937137 |
1430 | 10/19/2011 16:41 | 1363346268 | 4091152173
1431 | 10/19/2011 16:41 | 3600546941 | 625081168
1432 | 10/19/2011 16:41 | 3660522385 | 4285454529
1433 | 10/19/2011 16:41 | 17387701 | 4268198004
1434 | 10/19/2011 16:41 | 2689723466 | 1580622910
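The cancellation can be reproduced from the numbers above alone. The following is a minimal sketch, assuming the chunk checksum is a plain XOR of the per-row CRC32 values, as the XOR column suggests. The variable names are mine and are not taken from pt-table-sync.

```python
from itertools import accumulate
from operator import xor

# Per-row CRC32 values copied from the master and slave data above.
master_crc = [3952427013, 407848744, 2677520393, 2468571109, 1208292545, 3915229246]
slave_crc = [2727937137, 1363346268, 3600546941, 3660522385, 17387701, 2689723466]

# XOR of the master CRC and the slave CRC for each row.
row_diff = [m ^ s for m, s in zip(master_crc, slave_crc)]

# Running XOR over the rows, i.e. the "XOR" column in the data above.
master_xor = list(accumulate(master_crc, xor))
slave_xor = list(accumulate(slave_crc, xor))

for n, (d, m, s) in enumerate(zip(row_diff, master_xor, slave_xor), start=1):
    print(f"rows={n} diff={d:#010x} master_xor={m} slave_xor={s} match={m == s}")
```

Every row shows the same master/slave difference (0x490c4474), so the differences cancel in pairs and the running XOR matches whenever an even number of rows has been folded in.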
This problem does not seem to occur with MD5, but CRC32 is the default, so it may affect many other users of the tool.
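Here is a rough sketch of why CRC32 behaves this way while MD5 does not, using synthetic rows. For equal-length inputs, the XOR of two CRC32 values depends only on the XOR of the inputs, so the same 7-hour error yields the same CRC difference in every row. The `row` helper and its `#` separator are hypothetical and do not reflect how pt-table-sync concatenates columns. Aggregating MD5 digests with XOR is also an assumption made only for comparison.

```python
import hashlib
import zlib
from functools import reduce
from operator import xor


def row(style_image_id, updated_at):
    # Hypothetical row serialization, for illustration only.
    return f"{style_image_id}#{updated_at}".encode()


ids = range(1429, 1435)
master = [row(i, "2011-10-19 09:41:00") for i in ids]
slave = [row(i, "2011-10-19 16:41:00") for i in ids]


def crc_agg(rows):
    # XOR of the CRC32 of every row in the chunk.
    return reduce(xor, (zlib.crc32(r) for r in rows), 0)


def md5_agg(rows):
    # XOR of the full 128-bit MD5 of every row (assumed aggregation).
    return reduce(xor, (int(hashlib.md5(r).hexdigest(), 16) for r in rows), 0)


for n in range(1, len(master) + 1):
    crc_same = crc_agg(master[:n]) == crc_agg(slave[:n])
    md5_same = md5_agg(master[:n]) == md5_agg(slave[:n])
    print(f"rows={n} crc32_match={crc_same} md5_match={md5_same}")
```

With CRC32 the aggregates match for every even row count even though each row differs. MD5 has no such linear structure, so its aggregates are not expected to match.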
This is with pt-table-sync version 2.1.2.
In my case the master is MySQL Percona 5.1.56-rel12.7-log and the slave is MySQL 5.5.25a-
I have a pair of 5.5.25a-rel27.1.277.rhel6 servers that show similar symptoms, though I would have logged it under pt-table-checksum 2.1.2 rather than pt-table-sync.