--replicate-check doesn't look at @ARGV

Reported by Alfie John on 2011-10-21
This bug affects 1 person
Affects          Importance  Assigned to
Percona Toolkit  Undecided   Unassigned
1.0              Undecided   Unassigned
2.0              Undecided   Unassigned

Bug Description

--replicate-check by default will find slaves by:

  - SHOW PROCESSLIST
  - SHOW SLAVE HOSTS

But if I'm not using MySQL's built-in replication, the slaves don't show up in the output of either command. However, the slaves are already specified at the end of @ARGV. Why can't --replicate-check use these values to connect and do the checking?

The only problem I see is that @ARGV can only specify one level of the replication network and so it can't recurse. But that is all I need for now :)

In the upcoming release of pt-table-checksum, you can use
--recursion-method=dsn=D=<database>,t=<table> to specify a table of
DSNs. Take a look at the code if you wish and see if it does what you
need.

Alfie John (alfiejohn) wrote :

In get_cxn_from_dsn_table(), does that mean that each slave in the replication network has a different set of DSNs in its 'dsn_table' table?

Alfie John (alfiejohn) wrote :

Not sure if this was intended - since the 'SELECT dsn' doesn't filter out $args{dsn_table_dsn}, it too will be included in the result set (i.e. will think that it is a slave to itself).

tags: added: finding-slaves pt-table-checksum
Changed in percona-toolkit:
importance: Undecided → Medium
Changed in percona-toolkit:
importance: Medium → Undecided
Baron Schwartz (baron-xaprb) wrote :

Let's just close this. This applies only to pt-table-checksum version 1 and is irrelevant for 2.x and newer.

Changed in percona-toolkit:
status: New → Won't Fix
Alfie John (alfiejohn) wrote :

I've looked at v2 and it looks like this might still matter.

If pt-table-checksum is run on master M1 but later M2 is made the master, the 'dsn' table will need to be updated to reflect the change in topology. Otherwise when pt-table-checksum is run on M2, it will see itself in the 'dsn' table as a slave and not see anything mentioned about M1.

But the best option IMHO is to have all databases (including all masters) in the 'dsn' table and have pt-table-checksum filter out the current master from the result set.

Alfie John (alfiejohn) wrote :

I just noticed that this is happening in the non-DSN method recurse_to_slaves():

  my @slaves
    = grep { !$_->{master_id} || $_->{master_id} == $id } # Only my slaves.

Baron Schwartz (baron-xaprb) wrote :

Thanks for the added info (and sorry for the abruptness above -- I didn't mean to be rude).

I believe you are confirming that the master's DSN is filtered out as it should be. Did you find any places where it isn't?

Changed in percona-toolkit:
status: Won't Fix → New
Alfie John (alfiejohn) wrote :

No need to apologise... concise is good.

> I believe you are confirming that the master's DSN is filtered
> out as it should be. Did you find any places where it isn't?

Take a look at how get_cxn_from_dsn_table() gets slave DSNs:

  "SELECT dsn FROM $dsn_table ORDER BY id"

From what I can see, if $dsn_table lists all of the databases in the replication network, the query will also return the master itself. Unless I'm mistaken, I think it should be something like this (ignoring the parent hierarchy stuff for now):

  $dbh->selectcol_arrayref( qq{
      SELECT dsn
      FROM $dsn_table
      WHERE dsn != ?
      ORDER BY id
    },
    undef,
    $master_dsn,
  );

Hopefully I'm making a bit more sense?

Daniel Nichter (daniel-nichter) wrote :

1.0 is no longer supported, so WontFix in that series, but a fix is still possible in 2.0+ (if this is still an issue; I haven't studied this bug yet).

Changed in percona-toolkit:
status: New → Triaged

Revisiting this bug, Alfie is correct that the tool selects all DSNs from the DSN table, which may include the master. But I don't think this will cause an issue, because the slave recursion process checks whether each slave's server id has been seen before. If the tool connects to itself, it will see its own master server id and skip itself.
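
To illustrate the "seen before" check described above, here is a minimal sketch. It is not the tool's actual code: the %server_id_for lookup table is a hypothetical stand-in for connecting to each DSN and reading its server id, and the replica host names and the unseen_slaves() helper are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "connect to this DSN and read its server id".
# A real tool would query the server; this sketch uses a fixed lookup table.
my %server_id_for = (
    'h=master.example.com'   => 1,
    'h=replica1.example.com' => 2,
    'h=replica2.example.com' => 3,
);

# Return the DSNs from the DSN table whose server id has not been seen yet.
# The master's own id is marked as seen up front, so a row that points back
# at the master is skipped without comparing DSN strings.
sub unseen_slaves {
    my ($master_dsn, @dsn_rows) = @_;
    my %seen = ( $server_id_for{$master_dsn} => 1 );
    my @slaves;
    for my $dsn (@dsn_rows) {
        my $id = $server_id_for{$dsn};
        next if !defined $id;    # unknown host in this sketch: skip it
        next if $seen{$id}++;    # the master itself, or a duplicate row
        push @slaves, $dsn;
    }
    return @slaves;
}

# The DSN table lists the master and lists one replica twice.
my @slaves = unseen_slaves(
    'h=master.example.com',
    'h=master.example.com',
    'h=replica1.example.com',
    'h=replica2.example.com',
    'h=replica1.example.com',
);
print join(', ', @slaves), "\n";
# prints: h=replica1.example.com, h=replica2.example.com
```

Under these assumptions the master's row is dropped because of its server id, which is why a row for the master in the DSN table would be harmless.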

The other problem is: there's no easy way to filter out the master dsn in the SELECT, because these DSNs are logically equivalent for the same master:

F=my_masters_defaults_file.cnf
h=master.example.com
h=127.1
h=127.0.0.1

In other words, there's no canonical form of a DSN for a host, which makes filtering the DSN table nontrivial.
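
A small sketch of why a string filter such as "WHERE dsn != ?" falls short. The parse_dsn() and survives_string_filter() helpers are hypothetical and simplified (no escaping, no defaults-file handling); they only show that the four DSNs listed above differ as strings even though they are said to reach the same master.

```perl
use strict;
use warnings;

# Split a DSN such as "h=127.1,P=3306" into a hash of its key=value parts.
# Simplified for illustration: escaped commas are not handled.
sub parse_dsn {
    my ($dsn) = @_;
    my %parts;
    for my $pair (split /,/, $dsn) {
        my ($key, $value) = split /=/, $pair, 2;
        $parts{$key} = $value;
    }
    return \%parts;
}

# Roughly what "WHERE dsn != ?" does: a plain string comparison.
sub survives_string_filter {
    my ($master_dsn, @rows) = @_;
    return grep { $_ ne $master_dsn } @rows;
}

# The four DSNs from the list above, all said to reach the same master.
my @same_master = (
    'F=my_masters_defaults_file.cnf',
    'h=master.example.com',
    'h=127.1',
    'h=127.0.0.1',
);

for my $dsn (@same_master) {
    my $host = parse_dsn($dsn)->{h};
    printf "%-32s host part: %s\n", $dsn,
        defined $host ? $host : '(none, taken from the defaults file)';
}

my @left = survives_string_filter('h=127.0.0.1', @same_master);
print scalar(@left), " of ", scalar(@same_master),
    " rows still point at the master\n";
# prints: 3 of 4 rows still point at the master
```

Comparing something the server itself reports, such as its server id, avoids the problem, at the cost of connecting to every DSN first.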

So I'm going to close this bug, but if someone can provide a test case to the contrary (proving the problem), then I'll be happy to open the bug again.

Changed in percona-toolkit:
status: Triaged → Invalid