--replicate-check doesn't look at @ARGV

Bug #879194 reported by Alfie John
This bug affects 1 person
Affects           Status     Importance  Assigned to  Milestone
Percona Toolkit   Invalid    Undecided   Unassigned
1.0               Won't Fix  Undecided   Unassigned
2.0               Invalid    Undecided   Unassigned

Percona Toolkit moved to https://jira.percona.com/projects/PT

Bug Description

--replicate-check by default will find slaves by:

  - SHOW PROCESSLIST
  - SHOW SLAVE HOSTS

But if I'm not using MySQL's built-in replication, slaves don't show up in the output of either command. However, slaves are already specified at the end of @ARGV. Why can't --replicate-check use these values to connect and do the checking?

The only problem I see is that @ARGV can only specify one level of the replication network and so it can't recurse. But that is all I need for now :)

Revision history for this message
Baron Schwartz (baron-xaprb) wrote : Re: [Bug 879194] [NEW] --replicate-check doesn't look at @ARGV

In the upcoming release of pt-table-checksum, you can use
--recursion-method=dsn=D=<database>,t=<table> to specify a table of
DSNs. Take a look at the code if you wish and see if it does what you
need.

Revision history for this message
Alfie John (alfiejohn) wrote :

In get_cxn_from_dsn_table(), does that mean that each slave in the replication network has a different set of DSNs in its 'dsn_table' table?

Revision history for this message
Alfie John (alfiejohn) wrote :

Not sure if this was intended - since the 'SELECT dsn' doesn't filter out $args{dsn_table_dsn}, it too will be included in the result set (i.e. will think that it is a slave to itself).

tags: added: finding-slaves pt-table-checksum
Changed in percona-toolkit:
importance: Undecided → Medium
Changed in percona-toolkit:
importance: Medium → Undecided
Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Let's just close this. This applies only to pt-table-checksum version 1 and is irrelevant for 2.x and newer.

Changed in percona-toolkit:
status: New → Won't Fix
Revision history for this message
Alfie John (alfiejohn) wrote :

I've looked at v2 and it looks like this might still matter.

If pt-table-checksum is run on master M1 but later M2 is made the master, the 'dsn' table will need to be updated to reflect the change in topology. Otherwise when pt-table-checksum is run on M2, it will see itself in the 'dsn' table as a slave and not see anything mentioned about M1.

But the best option IMHO is to have all databases (including all masters) in the 'dsn' table and have pt-table-checksum filter out the current master from the result set.

Revision history for this message
Alfie John (alfiejohn) wrote :

I just noticed that this is happening in the non-DSN method recurse_to_slaves():

  my @slaves
    = grep { !$_->{master_id} || $_->{master_id} == $id } # Only my slaves.

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Thanks for the added info (and sorry for the abruptness above -- I didn't mean to be rude).

I believe you are confirming that the master's DSN is filtered out as it should be. Did you find any places where it isn't?

Changed in percona-toolkit:
status: Won't Fix → New
Revision history for this message
Alfie John (alfiejohn) wrote :

No need to apologise... concise is good.

> I believe you are confirming that the master's DSN is filtered
> out as it should be. Did you find any places where it isn't?

Take a look at how get_cxn_from_dsn_table() gets slave DSNs:

  "SELECT dsn FROM $dsn_table ORDER BY id"

From what I can see, if the $dsn_table has all of the databases in the replication network, it will also find itself (i.e. the master). Unless I'm mistaken, I think it should be something like (ignoring the parent hierarchy stuff for now):

  $dbh->selectcol_arrayref( qq{
      SELECT dsn
      FROM $dsn_table
      WHERE dsn != ?
      ORDER BY id
    },
    undef,
    $master_dsn,
  );

Hopefully I'm making a bit more sense?
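
To make the proposal concrete, here is a minimal sketch of the same filtering query run against an in-memory SQLite table instead of MySQL. It is written in Python rather than the tool's Perl, and the table name `dsns`, the host names, and the helper `slave_dsns` are all hypothetical. It only shows the effect of an exact string match on the `dsn` column.

```python
import sqlite3

# Hypothetical stand-in for the DSN table; the real one lives in MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dsns (id INTEGER PRIMARY KEY, dsn TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO dsns (id, dsn) VALUES (?, ?)",
    [
        (1, "h=m1.example.com"),
        (2, "h=m2.example.com"),
        (3, "h=s1.example.com"),
    ],
)


def slave_dsns(conn, master_dsn):
    """Return every DSN except the one that exactly equals master_dsn."""
    rows = conn.execute(
        "SELECT dsn FROM dsns WHERE dsn != ? ORDER BY id", (master_dsn,)
    )
    return [row[0] for row in rows]


# The same table serves either master: each one sees the others as slaves.
m1_view = slave_dsns(conn, "h=m1.example.com")
m2_view = slave_dsns(conn, "h=m2.example.com")
print(m1_view)
print(m2_view)
```

With this shape, the table would not need to be edited when M2 takes over from M1, as long as each master's DSN is stored in exactly the form passed to the query.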

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

1.0 is no longer supported, so WontFix in that series, but a fix is still possible in 2.0+ (if this is still an issue; I haven't studied this bug yet).

Changed in percona-toolkit:
status: New → Triaged
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Revisiting this bug: Alfie is correct that the tool selects all DSNs from the DSN table, which may include the master. But I don't think this will cause an issue, because the slave recursion process checks whether the slave's server id has been seen before. If it connects to itself, it will see its own master server id and skip itself.
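
The server-id check described above can be illustrated with a simplified sketch. This is Python, not the tool's Perl, and it is not taken from the pt-table-checksum source. The function `select_slaves`, the host names, and the server ids are hypothetical, and the ids are assumed to have been read from each server already (for example with `SELECT @@server_id`).

```python
def select_slaves(master_server_id, candidates):
    """Keep candidates whose server id has not been seen yet.

    candidates is a list of (dsn, server_id) pairs. The master's own id is
    marked as seen up front, so any DSN that leads back to the master is
    skipped no matter how that DSN is spelled.
    """
    seen = {master_server_id}
    slaves = []
    for dsn, server_id in candidates:
        if server_id in seen:
            continue  # the master itself, or a duplicate entry
        seen.add(server_id)
        slaves.append(dsn)
    return slaves


candidates = [
    ("h=master.example.com", 1),  # master listed in its own DSN table
    ("h=127.0.0.1", 1),           # same server under another spelling
    ("h=slave1.example.com", 2),
    ("h=slave2.example.com", 3),
]
slaves = select_slaves(1, candidates)
print(slaves)
```

Because the comparison uses the id each server reports rather than the DSN text, it does not depend on how the DSN is written.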

The other problem is: there's no easy way to filter out the master dsn in the SELECT, because these DSNs are logically equivalent for the same master:

F=my_masters_defaults_file.cnf
h=master.example.com
h=127.1
h=127.0.0.1

In other words, there's no canonical form of a DSN for a host, which makes filtering the DSN table nontrivial.
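
A small Python sketch (not the tool's Perl) of why a string comparison falls short, assuming the four DSNs above all point at the same master. The helper `parse_dsn` is hypothetical and only splits the comma-separated `key=value` form.

```python
import socket


def parse_dsn(dsn):
    """Split a 'k=v,k=v' DSN string into a dict of its parts."""
    return dict(part.split("=", 1) for part in dsn.split(","))


# The four spellings from the comment, assumed to name the same master.
equivalent = [
    "F=my_masters_defaults_file.cnf",
    "h=master.example.com",
    "h=127.1",
    "h=127.0.0.1",
]

# A filter like "WHERE dsn != ?" removes only the exact spelling it is given.
master_dsn = "h=127.0.0.1"
not_filtered = [d for d in equivalent if d != master_dsn]
print(not_filtered)

# Normalizing numeric addresses reconciles two of the forms...
same_ip = socket.inet_aton("127.1") == socket.inet_aton("127.0.0.1")
print(same_ip)

# ...but a defaults file carries no host key to compare at all.
print("h" in parse_dsn(equivalent[0]))
```

Even with address normalization, the host-name form would need a DNS lookup and the defaults-file form would need the file to be read, so the comparison cannot be done in the SELECT alone.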

So I'm going to close this bug, but if someone can provide a test case to the contrary (proving the problem), then I'll be happy to open the bug again.

Changed in percona-toolkit:
status: Triaged → Invalid
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-887
