wsrep_mysql_replication_bundle cannot be safely used with replication (SLAVE STOP, shutting down mysqld, etc.)

Bug #1169329 reported by Jay Janssen on 2013-04-15
This bug affects 3 people
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status tracked in 5.6
Series: 5.6 | Status: Confirmed | Importance: High | Assigned to: Unassigned

Bug Description

We really need this feature to be more bulletproof to assist with migrating from standard async replication to PXC:

[4/15/13 10:54:19 AM] Jay Janssen: how scary is wsrep_mysql_replication_bundle ?
[4/15/13 10:57:48 AM] Alexey Yurchenko / Codership: Jay, what is it?
[4/15/13 10:58:03 AM] Jay Janssen: https://bugs.launchpad.net/codership-mysql/+bug/1048492
[4/15/13 10:58:12 AM] Jay Janssen: looks like Seppo wrote it
[4/15/13 10:58:29 AM] Jay Janssen: I have an issue with a cluster as a slave that can't catch up in replication
[4/15/13 11:03:35 AM] Jay Janssen: and this seems like the only hope
[4/15/13 11:03:41 AM] Jay Janssen: just wondering how risky enabling this is
[4/15/13 11:10:24 AM] Alexey Yurchenko / Codership: it was tested with one customer and proved to work. overall it is very simple: several transactions are concatenated into one, so crashes are unlikely.
[4/15/13 11:10:46 AM] Jay Janssen: is there any issue with SLAVE STOP and so-forth?
[4/15/13 11:10:50 AM] Jay Janssen: should that "do the right thing"?
[4/15/13 11:10:55 AM] Jay Janssen: https://bugs.launchpad.net/codership-mysql/+bug/1048492/comments/2
[4/15/13 11:12:13 AM] Jay Janssen: or, maybe better -- how do I send a "void transaction"?
[4/15/13 11:13:18 AM] Alexey Yurchenko / Codership: well, that says it all... A void transaction? maybe create/drop a schema?
[4/15/13 11:13:50 AM] Jay Janssen: so if I stop slave, then I need to send # void transactions == this setting to ensure the buffer is flushed?
[4/15/13 11:13:54 AM] Alexey Yurchenko / Codership: But you also must make sure that the problem is in replication throughput.
[4/15/13 11:14:00 AM] Jay Janssen: well, it sure seems to be
[4/15/13 11:14:49 AM] Jay Janssen: getting over 2k replication ops per second with the async replication enabled
[4/15/13 11:14:54 AM] Jay Janssen: nothing seems to be bottlenecking
[4/15/13 11:15:01 AM] Jay Janssen: already set FLATC to 0, etc.
[4/15/13 11:15:27 AM] Jay Janssen: IST on a node can support up to 13k or so TPS in the replication thread
[4/15/13 11:15:45 AM] Jay Janssen: can you confirm this? "so if I stop slave, then I need to send # void transactions == this setting to ensure the buffer is flushed?"
[4/15/13 11:16:05 AM] Jay Janssen: just want to ensure I know what I'm getting into
[4/15/13 11:17:11 AM] Alexey Yurchenko / Codership: yes, I believe you're getting it right.
[4/15/13 11:17:26 AM] Alexey Yurchenko / Codership: is it a WAN cluster?
[4/15/13 11:17:41 AM] Jay Janssen: no
[4/15/13 11:18:02 AM] Jay Janssen: LAN, 2k tps single threaded = about .5ms commit latency
[4/15/13 11:18:06 AM] Jay Janssen: which is probably about right
[4/15/13 11:20:25 AM] Seppo Jaakola / Codership: wsrep_mysql_replication_bundle is experimental still, the open issue is that channel must be manually flushed by idle transactions
[4/15/13 11:21:18 AM] Jay Janssen: those must come from the master?
[4/15/13 11:21:58 AM] Seppo Jaakola / Codership: yes
[4/15/13 11:22:02 AM] Jay Janssen: oy
[4/15/13 11:22:13 AM] Jay Janssen: which is kinda tricky if we're way behind the master
[4/15/13 11:24:06 AM] Jay Janssen: Seppo: what do you typically use for a void trx?
[4/15/13 11:26:12 AM] Seppo Jaakola / Codership: I never needed to use such... This feature was implemented for one customer, and the idea is to complete the implementation for next release, so that uncommitted bundles will never remain
[4/15/13 11:26:44 AM] Seppo Jaakola / Codership: any activity, which will create mysql replication events will do
[4/15/13 11:27:09 AM] Jay Janssen: this only applies to writesets coming in via async replication, right?
[4/15/13 11:27:38 AM] Seppo Jaakola / Codership: yes, but they are not write sets, but replication events
[4/15/13 11:28:32 AM] Seppo Jaakola / Codership: slave thread in G node will wait until the n'th replication event, and then will commit one bundle. This commit will create one huge write set for G replication
[4/15/13 11:28:54 AM] Jay Janssen: what happens if I SLAVE STOP -- will the bundle go away?
[4/15/13 11:29:33 AM] Seppo Jaakola / Codership: donno, probably the pending transactions will vanish
[4/15/13 11:30:16 AM] Seppo Jaakola / Codership: with good luck, they will be committed, which would solve this issue...
[4/15/13 11:30:40 AM] Jay Janssen: hmm, so with RBR perhaps I can just rollback slave to exec_master_log_pos - <#val of wsrep_mysql_replication_bundle> to be safe
[4/15/13 11:32:11 AM] Seppo Jaakola / Codership: this SLAVE STOP behavior is easy to test. Just send one replication event and stop slave, and check if G nodes see anything about the event
[4/15/13 11:32:26 AM] Jay Janssen: not easy to test on a live system
[4/15/13 11:32:39 AM] Seppo Jaakola / Codership: indeed
[4/15/13 11:32:44 AM] Jay Janssen: but here we are
[4/15/13 11:46:10 AM] Seppo Jaakola / Codership: quick experiment shows that stop slave will not cause pending replication events to be committed
[4/15/13 11:46:46 AM] Seppo Jaakola / Codership: however, they are not lost either, when slave is started and more events are replicated, all pending events will be committed
[4/15/13 11:47:53 AM] Seppo Jaakola / Codership: treating DDL has strange bug: any DDL coming from replication stream will commit all events and disable replication event bundling in the future. This condition will hold until slave is stopped and started again
[4/15/13 11:48:48 AM] Seppo Jaakola / Codership: so, you can flush the replication bundle by issuing e.g. "create database dummy" in the master
[4/15/13 11:51:46 AM] Jay Janssen: thanks, that is helpful
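
The behavior described in the chat can be summarized as a toy model. This is only a sketch of what was reported above, not PXC code: the class BundlingSlave, its methods, and the event strings are hypothetical names made up for illustration, and bundle_size merely stands in for wsrep_mysql_replication_bundle. It models three observations: events are held until the n'th one arrives and are then committed as one bundle, stopping and starting the slave neither commits nor loses pending events, and a DDL event commits everything pending and disables bundling until the slave is restarted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BundlingSlave:
    """Toy model of the slave thread's bundling as described in the chat.

    Hypothetical illustration only; none of these names exist in PXC.
    """
    bundle_size: int  # stands in for wsrep_mysql_replication_bundle
    pending: List[str] = field(default_factory=list)
    committed: List[List[str]] = field(default_factory=list)
    bundling_enabled: bool = True

    def apply(self, event: str, is_ddl: bool = False) -> None:
        """Receive one replication event from the async master."""
        self.pending.append(event)
        if is_ddl:
            # Reported quirk: DDL commits all pending events and turns
            # bundling off until the slave is stopped and started again.
            self._commit()
            self.bundling_enabled = False
        elif not self.bundling_enabled or len(self.pending) >= self.bundle_size:
            self._commit()

    def restart(self) -> None:
        """STOP SLAVE followed by START SLAVE.

        Pending events are neither committed nor lost; bundling resumes.
        """
        self.bundling_enabled = True

    def _commit(self) -> None:
        """Commit everything pending as a single bundle (one write set)."""
        if self.pending:
            self.committed.append(self.pending)
            self.pending = []


if __name__ == "__main__":
    slave = BundlingSlave(bundle_size=3)
    for i in range(4):
        slave.apply(f"trx{i}")
    print("committed:", slave.committed, "pending:", slave.pending)
    slave.apply("create database dummy", is_ddl=True)
    print("committed:", slave.committed, "pending:", slave.pending)
```

In this model the fourth transaction stays pending until either enough further events arrive from the master or a DDL such as the "create database dummy" suggestion flushes it, which is the risk discussed above when the master goes quiet or the slave is far behind.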

So there are 2 other issues: (from bugs I have marked duplicate)

 SQL_SLAVE_SKIP_COUNTER=1 sticks permanently with wsrep_mysql_replication_bundle set

 PXC node as async slave has issues updating exec_master_log_pos and relay_master_log_file

All of them can be traced to wsrep_mysql_replication_bundle being set.

https://gist.github.com/ronin13/3e944f259e7be2bb4d32

might fix it.

The fix takes into account

a) Slave SQL Skip counter

SQL_SLAVE_SKIP_COUNTER was failing because, in apply_event_and_update_pos, the WSREP code was overwriting the skip reason with Log_event::EVENT_SKIP_IGNORE even when it was Log_event::EVENT_SKIP_COUNT.
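
A simplified model of why that overwrite could make the skip counter stick. This is a hedged sketch in Python, not the server's C++ code: SkipReason, reason_after_bundling and apply_event are hypothetical names, and the model assumes that the skip counter is decremented only when the reason is still the "count" reason by the time it is checked. The fixed=True branch shows one possible way of preserving the reason; it is not claimed to be what the linked gist does.

```python
from enum import Enum


class SkipReason(Enum):
    """Stand-ins for the Log_event skip reasons named in the report."""
    NOT = "apply the event"
    IGNORE = "skip, leave the skip counter alone"  # EVENT_SKIP_IGNORE
    COUNT = "skip and count it"                    # EVENT_SKIP_COUNT


def reason_after_bundling(reason: SkipReason, bundling_skips: bool,
                          fixed: bool) -> SkipReason:
    """Model of the bundling check that may rewrite the skip reason."""
    if not bundling_skips:
        return reason
    if fixed and reason is SkipReason.COUNT:
        # Assumed fix: do not clobber a counter-driven skip.
        return reason
    # Reported bug: the reason is overwritten unconditionally.
    return SkipReason.IGNORE


def apply_event(skip_counter: int, bundling_skips: bool, fixed: bool) -> int:
    """Return the skip counter after one event has been processed."""
    reason = SkipReason.COUNT if skip_counter > 0 else SkipReason.NOT
    reason = reason_after_bundling(reason, bundling_skips, fixed)
    if reason is SkipReason.COUNT:
        # Assumption: only a counter-driven skip decrements the counter.
        skip_counter -= 1
    return skip_counter


if __name__ == "__main__":
    print("buggy:", apply_event(1, bundling_skips=True, fixed=False))
    print("fixed:", apply_event(1, bundling_skips=True, fixed=True))
```

Under these assumptions the buggy path leaves the counter at 1 after the event is skipped, so the next event is skipped again, which matches the "sticks permanently" symptom.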

Keeping the "SQL_SLAVE_SKIP_COUNTER=1 sticks permanently with wsrep_mysql_replication_bundle set" issue separate and unduplicating it.

Przemek (pmalkowski) wrote :

I can confirm another problem: when GTID is used, a slave with wsrep_mysql_replication_bundle > 0 fails with this error:

2015-10-19 12:10:16 20849 [ERROR] Slave SQL: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is '12446bf7-3219-11e5-9434-080027079e3d:3707'. Error_code: 1837
2015-10-19 12:10:16 20849 [Warning] Slave: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is '12446bf7-3219-11e5-9434-080027079e3d:3707'. Error_code: 1837
2015-10-19 12:10:16 20849 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'binlog.000013' position 10657452

I have used this setting:

percona1 mysql> show variables like 'wsrep_mysql%';
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| wsrep_mysql_replication_bundle | 50    |
+--------------------------------+-------+
1 row in set (0.01 sec)

percona1 mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version          | @@version_comment                                                                           |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And normal sysbench test:
[root@db1 ~]# sysbench --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=test1 --mysql-user=root --oltp-table-size=6000 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-tables-count=4 run
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
...

tags: added: i60821
Przemek (pmalkowski) wrote :

Btw, in my test case the async master was also running 5.6.26, and binlog_format was ROW.

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-956
