pt-table-checksum doesn't wait for checksum table to replicate
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Toolkit moved to https://jira.percona.com/projects/PT | Fix Released | High | Daniel Nichter | | |
Bug Description
I've observed nondeterministic behavior in this test:
[baron@localhost stabilize-
# Stopping/
1..3
ok 1 - Ran without InnoDB (bug 996110)
ok 2 - 0 exit status (bug 996110)
# Shutting down sandboxes
ok 3 - Sandbox servers
[baron@localhost stabilize-
# Stopping/
1..3
ok 1 - Ran without InnoDB (bug 996110)
not ok 2 - 0 exit status (bug 996110)
# Failed test '0 exit status (bug 996110)'
# in t/pt-table-
# got: '1'
# expected: '0'
# Shutting down sandboxes
ok 3 - Sandbox servers
# Looks like you failed 1 test of 3.
[baron@localhost stabilize-
# Stopping/
1..3
ok 1 - Ran without InnoDB (bug 996110)
ok 2 - 0 exit status (bug 996110)
# Shutting down sandboxes
ok 3 - Sandbox servers
When I saved the output of test 2 to a file in /tmp, I found a lot of the following:
06-04T09:01:26 2 0 0 1 0 0.008 mysql.time_
06-04T09:01:26 Error waiting for the last checksum of table mysql.time_
Check that the replica is running and has the replicate table `percona`
06-04T09:01:26 Error checksumming table mysql.time_
This suggests to me that pt-table-checksum needs to do something smarter: after creating the checksum table, it should wait until that table exists on every replica it has detected before it starts checksumming. In addition, we need to fix the error on line 7533.
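The wait described above could be sketched roughly as follows. This is only an illustration in Python, not the tool's actual Perl code: the function name `wait_for_table_on_replicas`, the `table_exists` callback, and the timeout and interval values are all assumptions made for the example. In practice the callback would ask each replica whether the checksum table is present yet.

```python
import time
from typing import Callable, Iterable, List


def wait_for_table_on_replicas(
    replicas: Iterable[str],
    table_exists: Callable[[str], bool],  # hypothetical per-replica check
    timeout: float = 30.0,                # assumed default, not from the tool
    interval: float = 1.0,                # assumed default, not from the tool
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Poll until table_exists(replica) is true for every replica.

    Returns the replicas that are still missing the table when the
    timeout expires; an empty list means every replica has it.
    """
    pending = list(replicas)
    deadline = clock() + timeout
    while True:
        # Keep only the replicas where the table has not appeared yet.
        pending = [r for r in pending if not table_exists(r)]
        if not pending or clock() >= deadline:
            return pending
        sleep(interval)


if __name__ == "__main__":
    # Simulated replicas: the value is how many polls fail before the
    # table "replicates" to that replica.
    polls_left = {"replica1": 0, "replica2": 2}

    def fake_table_exists(name: str) -> bool:
        if polls_left[name] > 0:
            polls_left[name] -= 1
            return False
        return True

    missing = wait_for_table_on_replicas(
        polls_left, fake_table_exists, timeout=5, interval=0.01
    )
    print("still missing on:", missing)
```

Returning the list of replicas that never got the table, rather than raising, lets the caller decide whether to abort or to report a clearer error than the per-table "Error waiting for the last checksum" messages shown above.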
Related branches
- Daniel Nichter: Approve
Diff: 282 lines (+138/-33)
5 files modified
bin/pt-table-checksum (+40/-2)
lib/PerconaTest.pm (+14/-1)
t/pt-table-checksum/progress.t (+35/-2)
t/pt-table-checksum/samples/dsn-table.sql (+15/-0)
t/pt-table-checksum/throttle.t (+34/-28)
Changed in percona-toolkit: | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → Daniel Nichter (daniel-nichter) |
milestone: | none → 2.1.2 |
tags: | added: breaks-replication pt-table-checksum risk |
summary: | - pt-query-digest doesn't wait for checksum table to replicate |
| + pt-table-checksum doesn't wait for checksum table to replicate |
tags: | added: replication-wait; removed: breaks-replication risk |
tags: | added: breaks-replication replication-lag risk; removed: replication-wait |
Changed in percona-toolkit: | |
status: | In Progress → Fix Committed |
Changed in percona-toolkit: | |
status: | Fix Committed → Fix Released |
This bug is not Fix Committed until lp:~percona-toolkit-dev/percona-toolkit/stabilize-test-suite is merged into lp:percona-toolkit/2.1