I think that this is just a case of our documentation-as-code being confusing to people looking at the source.
=item --max-lag
type: time; default: 1s; group: Throttle
The "default: 1s" in the docs is parsed, and that value is used. An easy way to check that the defaults are all working is to grep the output of --help:
$ pt-table-checksum --help | grep ' --max-lag'
--max-lag=m Pause checksumming until all replicas' lag
--max-lag 1
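To illustrate the idea of docs doubling as the option spec, here is a minimal sketch in Python (the tool itself is Perl, and this is not its actual parser). The function names are hypothetical, and the time suffixes (s, m, h, d, with a bare number meaning seconds) are an assumption about how "type: time" values are interpreted.

```python
import re


def parse_option_attributes(spec_line):
    """Split a line like 'type: time; default: 1s; group: Throttle' into a dict."""
    attributes = {}
    for part in spec_line.split(";"):
        key, _, value = part.partition(":")
        if key.strip():
            attributes[key.strip()] = value.strip()
    return attributes


def time_to_seconds(value):
    """Convert a time value such as '1s' or '2m' to seconds.

    Assumption: suffixes are s, m, h, d, and a bare number means seconds.
    """
    match = re.fullmatch(r"(\d+)([smhd]?)", value)
    if match is None:
        raise ValueError(f"not a time value: {value!r}")
    number, suffix = match.groups()
    multiplier = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}[suffix]
    return int(number) * multiplier


attrs = parse_option_attributes("type: time; default: 1s; group: Throttle")
print(attrs["default"], "->", time_to_seconds(attrs["default"]), "second(s)")
```

Under these assumptions, the "1s" in the docs becomes the 1 that --help reports for --max-lag.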
>replication delays by several hundred or even thousands of seconds during the checksum event, but catches right back up after and going.
This is the expected behavior, according to the --max-lag docs:
Pause checksumming until all replicas' lag is less than this value. After each
checksum query (each chunk), pt-table-checksum looks at the replication lag of
all replicas to which it connects, using Seconds_Behind_Master. If any replica
is lagging more than the value of this option, then pt-table-checksum will sleep
for L<"--check-interval"> seconds, then check all replicas again. If you
specify L<"--check-slave-lag">, then the tool only examines that server for
lag, not all servers.
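The behavior described in that passage can be sketched roughly as follows. This is a Python illustration of the documented logic only, not the tool's code: wait_for_replicas and get_lags are hypothetical names, get_lags stands in for reading Seconds_Behind_Master from each replica, and the default values are placeholders.

```python
import time


def wait_for_replicas(get_lags, max_lag=1.0, check_interval=1.0, sleep=time.sleep):
    """Pause until no replica lags more than max_lag seconds.

    get_lags is a hypothetical callable returning one lag value per replica.
    Returns how many times the caller had to sleep.
    """
    pauses = 0
    # Sleep and re-check for as long as any replica exceeds the threshold.
    while any(lag > max_lag for lag in get_lags()):
        sleep(check_interval)
        pauses += 1
    return pauses


# Simulated readings taken after one chunk: the second replica falls far
# behind, then catches up over two checks.
readings = iter([[0, 300], [0, 40], [0, 0]])
pauses = wait_for_replicas(lambda: next(readings), max_lag=1,
                           check_interval=1, sleep=lambda seconds: None)
print(pauses)  # 2
```

Note that in this sketch the check runs only after a chunk has finished, so a replica can already be well past the threshold by the time the pause begins; the option delays the next chunk rather than capping lag.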
So I don't think that there's a bug here. Feel free to correct me; otherwise I'll close this in a week or so.