pt-slave-restart not reliable with GTIDs

Bug #1325871 reported by Stéphane Combaudon
This bug affects 4 people
Affects: Percona Toolkit moved to https://jira.percona.com/projects/PT
Status: Confirmed
Importance: High
Assigned to: Frank Cizmich

Bug Description

I'm using pt-slave-restart 2.2.8.

How to reproduce:

1) Set up a master (A) and a slave (B) using GTID-based replication

2) On the slave in the test database, create a table like this:
mysql> create table t (id int not null auto_increment primary key);

3) On the master in the test database, create the same table, but with if not exists:
mysql> create table if not exists t (id int not null auto_increment primary key);

4) Make the slave the new master:
B> stop slave; reset slave all;
A> change master to master_host='slave', ..., master_auto_position=1;

Replication is now broken on A because of the errant transaction created at step #2.

5) Insert a new row in t on the new master:
B> insert into test.t (id) values (null);

6) Run pt-slave-restart on the slave:
A# pt-slave-restart h=127.0.0.1,u=root,P=13001
2014-06-03T10:39:10 P=13001,h=127.0.0.1,u=root wheezy-relay-bin.000002 408 1050
Use of uninitialized value $gtid_exec_ids in substitution (s///) at /usr/bin/pt-slave-restart line 5070

The output of SHOW SLAVE STATUS shows that replication is still broken and that the offending transaction has not been skipped.

Changed in percona-toolkit:
importance: Undecided → High
assignee: nobody → Frank Cizmich (frank-cizmich)
tags: added: pt-slave-restart

Frank Cizmich (frank-cizmich) wrote :

Thanks for the detailed report Stéphane!
I followed your instructions, but I was unable to reproduce this using MySQL 5.6.
Every time, pt-slave-restart worked and restarted the slave (previously the master).
Are you using MySQL?
Can you run pt-slave-restart with the PTDEBUG environment variable set and post the results?
Thanks!

Kenny Gryp (gryp) wrote :

Hi Stéphane,

can you give me:

- PTDEBUG output
- output of SHOW SLAVE STATUS when it was broken before running pt-slave-restart.

Stéphane Combaudon (stephane-combaudon) wrote :

Hi Kenny,

PTDEBUG output in attachment

Stéphane Combaudon (stephane-combaudon) wrote :

and show slave status output

Kenny Gryp (gryp) wrote :

I see what's going on here...

the master_uuid, which is used to determine the statement to skip, is

                  Master_UUID: bbfcc5e1-ebef-11e3-855d-9cebe8067a3f

However, when we look at executed_gtid_set to find the next entry for that master_uuid, so we can skip that one:

            Executed_Gtid_Set: bc0c09e0-ebef-11e3-855d-9cebe8067a3f:1-2

It fails, because according to SHOW SLAVE STATUS nothing from the master has been executed at all. Executed_Gtid_Set only contains GTIDs from the slave itself.

This makes sense, as there has been no successful transaction from the master_uuid: the very first statement it tries to run already breaks replication.

This is a special case. How should we handle it? Is there any way to know which GTID exactly has failed? (It could be from the master's master, which is documented already)

If we can't, then we need to abort pt-slave-restart, saying the slave has not yet executed any trx from its master (``master_uuid``).

Or maybe we can... Would there be anything wrong in checking ``Retrieved_Gtid_Set`` and seeing that 1-2 from master_uuid was fetched, but not yet executed? Can we be sure?
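
To illustrate the failure described above, here is a minimal Python sketch of looking up the master's UUID in Executed_Gtid_Set. It is not the tool's actual Perl code; the function names `parse_gtid_set` and `next_gtid_for` are hypothetical, and the "last executed + 1" rule is an assumption about how the next transaction would be chosen. The UUIDs are the ones from the SHOW SLAVE STATUS output quoted above.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-3:5,uuid2:7' into {uuid: [(start, end), ...]}."""
    parsed = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        ranges = parsed.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            ranges.append((int(start), int(end or start)))
    return parsed


def next_gtid_for(executed_text, source_uuid):
    """Return 'uuid:N' for the transaction after the last executed one,
    or None when the UUID never appears in the executed set."""
    intervals = parse_gtid_set(executed_text).get(source_uuid.lower())
    if not intervals:
        return None  # nothing from this source was ever executed
    last = max(end for _, end in intervals)
    return f"{source_uuid.lower()}:{last + 1}"


MASTER_UUID = "bbfcc5e1-ebef-11e3-855d-9cebe8067a3f"
EXECUTED = "bc0c09e0-ebef-11e3-855d-9cebe8067a3f:1-2"

# The master's UUID has no entry, so there is nothing to increment.
print(next_gtid_for(EXECUTED, MASTER_UUID))
```

With these values the lookup yields nothing for the master's UUID, which is consistent with the uninitialized-value warning in the report.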

I do hope we all agree that GTIDs are broken when it comes to visibility :) (not to mention the visibility when using multiple slave threads)

Stéphane Combaudon (stephane-combaudon) wrote :

This is indeed a special case, but I think it's a simple one...

Here you can use gtid_subtract() to see which transactions have been executed on the master and have not been executed on the slave. And then you know which one to skip.

Also I believe (not 100% sure though) that using gtid_subtract() would allow you to skip transactions even if you have several levels of replication and without using the --master-uuid option.
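
As a rough model of what this comparison computes, the sketch below reimplements GTID set subtraction in Python. On a real server you would call the built-in `GTID_SUBTRACT(set1, set2)` SQL function instead; the Python helpers here (`parse_gtid_set`, `subtract_intervals`, `gtid_subtract`) are hypothetical names for illustration only. The `RETRIEVED` value is an assumption based on the remark above that 1-2 from master_uuid was fetched; it does not appear in the posted output.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-3:5,uuid2:7' into {uuid: [(start, end), ...]}."""
    parsed = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        ranges = parsed.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            ranges.append((int(start), int(end or start)))
    return parsed


def subtract_intervals(a, b):
    """Return the parts of intervals `a` that are not covered by `b`."""
    result = []
    for start, end in sorted(a):
        cur = start
        for b_start, b_end in sorted(b):
            if b_end < cur or b_start > end:
                continue  # no overlap with what is left of this interval
            if b_start > cur:
                result.append((cur, b_start - 1))
            cur = max(cur, b_end + 1)
            if cur > end:
                break
        if cur <= end:
            result.append((cur, end))
    return result


def gtid_subtract(set1, set2):
    """GTIDs that are in set1 but not in set2, as {uuid: [(start, end), ...]}."""
    left, right = parse_gtid_set(set1), parse_gtid_set(set2)
    result = {}
    for uuid, intervals in left.items():
        remaining = subtract_intervals(intervals, right.get(uuid, []))
        if remaining:
            result[uuid] = remaining
    return result


MASTER_UUID = "bbfcc5e1-ebef-11e3-855d-9cebe8067a3f"
RETRIEVED = MASTER_UUID + ":1-2"  # assumed value, see lead-in
EXECUTED = "bc0c09e0-ebef-11e3-855d-9cebe8067a3f:1-2"

# Transactions fetched from the master but not yet executed on the slave.
print(gtid_subtract(RETRIEVED, EXECUTED))
```

The first interval of the single remaining UUID points at the transaction that would need to be skipped, without having to look up master_uuid in Executed_Gtid_Set at all.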

Kenny Gryp (gryp) wrote :

Stéphane, thanks, I didn't know about this option.

So it looks like we could apply GTID_SUBTRACT() to the whole Retrieved_Gtid_Set, then check whether the GTIDs that have not been executed come from only 1 uuid. If so, we can skip that one.
In that case --master-uuid is not useful and we can know which one to skip.

However... if there were multiple uuids that have new transactions, I think we still don't know which one to skip.

Stéphane Combaudon (stephane-combaudon) wrote :

Kenny, you're right.

There are at least 2 scenarios where you can have multiple master uuids in Retrieved_gtid_set:

- When parallel replication is enabled. I'm not sure how to handle this situation. Documentation says it's not supported and that may be acceptable for now.
- When there are errant transactions. They are quite dangerous anyway, so at a minimum the tool should mention their existence and suggest to retry when they are skipped. Ideally we could create a pt-errant-trx-killer tool because finding errant transactions is not always easy.

Kenny Gryp (gryp) wrote :

I could not find any way to deal with parallel replication in pt-slave-restart. That's why I made sure that it is not supported by pt-slave-restart and documented it carefully in its limitations.

And yes, errant transactions are a potential problem, but sometimes in multi-tier replication, writes to the intermediate slave happen as well, and then it's valid to have different UUIDs.
So instead of warning on them, I implemented it so that by default the transaction from master_uuid is skipped, and tried to document it properly.

Do you think it would be better to warn that there are multiple UUIDs, and then require the user of the tool to specify --i-know-I-have-multiple-uuids-and-will-take-the-risk-to-use-uuid=000-01051125abc-4161... ?

Stéphane Combaudon (stephane-combaudon) wrote :

I agree that if there are writes to the intermediate slave, it's not possible to know which transaction you should skip.

One solution may be to have 2 modes:
- safe mode which will not skip any transaction if there are several candidates. The tool will simply exit with an appropriate error message
- unsafe mode where you skip all candidate transactions.

That's basically what you were suggesting but without the need to specify a master-uuid?
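
The two modes could be sketched roughly as follows, in Python for illustration only. `choose_gtids_to_skip`, `AmbiguousSkipError`, and `empty_transaction_sql` are hypothetical names, not part of pt-slave-restart. The sketch assumes "candidate" means the first not-yet-executed transaction of each source UUID, and that `pending` has already been computed (for example with GTID_SUBTRACT()). Skipping is modeled as the usual GTID technique of injecting an empty transaction with SET GTID_NEXT.

```python
class AmbiguousSkipError(Exception):
    """Raised in safe mode when more than one transaction could be the culprit."""


def choose_gtids_to_skip(pending, unsafe=False):
    """pending maps source UUID -> sorted [(start, end), ...] not yet executed.

    Safe mode: return the single candidate, or raise if there are several.
    Unsafe mode: return the first pending GTID of every UUID.
    """
    candidates = [
        f"{uuid}:{intervals[0][0]}"
        for uuid, intervals in sorted(pending.items())
        if intervals
    ]
    if len(candidates) > 1 and not unsafe:
        raise AmbiguousSkipError(
            "several candidate transactions: " + ", ".join(candidates)
        )
    return candidates


def empty_transaction_sql(gtid):
    """Statements that commit an empty transaction under the given GTID."""
    return [
        f"SET GTID_NEXT='{gtid}'",
        "BEGIN",
        "COMMIT",
        "SET GTID_NEXT='AUTOMATIC'",
    ]


one_source = {"bbfcc5e1-ebef-11e3-855d-9cebe8067a3f": [(1, 2)]}
for gtid in choose_gtids_to_skip(one_source):
    print("; ".join(empty_transaction_sql(gtid)))
```

In safe mode the caller would catch `AmbiguousSkipError`, print the message, and exit without touching replication.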

Your thoughts?

Kenny Gryp (gryp) wrote :

I'm good with that proposal. It is quite safe. Not sure what Daniel and the others of PTDEV think.
For unsafe mode, I would recommend requiring the user to specify a specific UUID, rather than skipping all of them.

As I'm thinking now, what about this process:
- an environment with multiple 'masters'
- replication is broken
- we pick uuid number1 and skip it.
- start replication
- replication breaks immediately, no positions were changed
- undo skipping of that transaction from number1 (can we do this somehow?)
- move on to uuid number2, skip it
- start replication
- that seems to work fine.... we're good..

Maybe I should spend some time looking into whether that is possible (it might take a while, as my schedule is often completely full).

Stéphane Combaudon (stephane-combaudon) wrote :

I'm not sure requiring the user to give the master uuid is a good idea: if you already know that, skipping the right transaction is very easy and you don't need pt-slave-restart.

IMO, you use pt-slave-restart if you don't know how to identify the transaction that breaks replication or if you have too many errors to skip them manually. In both cases, potentially having to specify a master uuid is a hassle.

In unsafe mode, I think the tool should try to skip the 1st transaction of each master uuid until replication can restart. Then it's unsafe because the tool may skip more transactions than needed.
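
A minimal sketch of that loop, in Python for illustration only. `skip_until_running` and its two callbacks are hypothetical: `skip(gtid)` stands for whatever mechanism skips one transaction, and `is_running()` stands for a check that replication has restarted. The "replica" below is simulated in memory so the example runs on its own.

```python
def skip_until_running(pending, skip, is_running):
    """Skip the first pending transaction of each UUID until replication runs.

    pending maps source UUID -> sorted [(start, end), ...] not yet executed.
    Returns (list of skipped GTIDs, whether replication is running at the end).
    """
    skipped = []
    for uuid in sorted(pending):
        if is_running():
            break
        gtid = f"{uuid}:{pending[uuid][0][0]}"
        skip(gtid)
        skipped.append(gtid)
    return skipped, is_running()


# Simulated replica: replication only resumes once "bbbb:1" has been skipped.
already_skipped = []
pending = {"aaaa": [(5, 6)], "bbbb": [(1, 1)]}

result = skip_until_running(
    pending,
    skip=already_skipped.append,
    is_running=lambda: "bbbb:1" in already_skipped,
)
print(result)  # (['aaaa:5', 'bbbb:1'], True)
```

In this simulated run, aaaa:5 is skipped even though only bbbb:1 was breaking replication, which is exactly why the mode would be labeled unsafe.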

Nilnandan Joshi (nilnandan-joshi) wrote :

As per discussion with Frank, pt-slave-restart is unsafe, so it is being deprecated and will be removed from Percona Toolkit 3.0.

Changed in percona-toolkit:
status: New → Invalid
Changed in percona-toolkit:
status: Invalid → Won't Fix

Matt Griffin (mattgriffin) wrote :

Frank: Let's discuss this.

Changed in percona-toolkit:
status: Won't Fix → Confirmed

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PT-375
