wsrep_mysql_replication_bundle cannot be safely used with replication (SLAVE STOP, shutting down mysqld, etc.)

Bug #1169329 reported by Jay Janssen
This bug affects 3 people
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: tracked in 5.6
Series 5.6: Status Confirmed, Importance High, Assigned to Unassigned

Bug Description

We really need this feature to be more bullet-proof to help migrations from standard async replication to PXC:

[4/15/13 10:54:19 AM] Jay Janssen: how scary is wsrep_mysql_replication_bundle ?
[4/15/13 10:57:48 AM] Alexey Yurchenko / Codership: Jay, what is it?
[4/15/13 10:58:03 AM] Jay Janssen: https://bugs.launchpad.net/codership-mysql/+bug/1048492
[4/15/13 10:58:12 AM] Jay Janssen: looks like Seppo wrote it
[4/15/13 10:58:29 AM] Jay Janssen: I have an issue with a cluster as a slave that can't catch up in replication
[4/15/13 11:03:35 AM] Jay Janssen: and this seems like the only hope
[4/15/13 11:03:41 AM] Jay Janssen: just wondering how risky enabling this is
[4/15/13 11:10:24 AM] Alexey Yurchenko / Codership: it was tested with one customer and proved to work. overall it is very simple: several transactions are concatenated into one, so crashes are unlikely.
[4/15/13 11:10:46 AM] Jay Janssen: is there any issue with SLAVE STOP and so-forth?
[4/15/13 11:10:50 AM] Jay Janssen: should that "do the right thing"?
[4/15/13 11:10:55 AM] Jay Janssen: https://bugs.launchpad.net/codership-mysql/+bug/1048492/comments/2
[4/15/13 11:12:13 AM] Jay Janssen: or, maybe better -- how do I send a "void transaction"?
[4/15/13 11:13:18 AM] Alexey Yurchenko / Codership: well, that says it all... A void transaction? maybe create/drop a schema?
[4/15/13 11:13:50 AM] Jay Janssen: so if I stop slave, then I need to send # void transactions == this setting to ensure the buffer is flushed?
[4/15/13 11:13:54 AM] Alexey Yurchenko / Codership: But you also must make sure that the problem is in replication throughput.
[4/15/13 11:14:00 AM] Jay Janssen: well, it sure seems to be
[4/15/13 11:14:49 AM] Jay Janssen: getting over 2k replication ops per second with the async replication enabled
[4/15/13 11:14:54 AM] Jay Janssen: nothing seems to be bottlenecking
[4/15/13 11:15:01 AM] Jay Janssen: already set FLATC to 0, etc.
[4/15/13 11:15:27 AM] Jay Janssen: IST on a node can support up to 13k or so TPS in the replication thread
[4/15/13 11:15:45 AM] Jay Janssen: can you confirm this? "so if I stop slave, then I need to send # void transactions == this setting to ensure the buffer is flushed?"
[4/15/13 11:16:05 AM] Jay Janssen: just want to ensure I know what I'm getting into
[4/15/13 11:17:11 AM] Alexey Yurchenko / Codership: yes, I believe you're getting it right.
[4/15/13 11:17:26 AM] Alexey Yurchenko / Codership: is it a WAN cluster?
[4/15/13 11:17:41 AM] Jay Janssen: no
[4/15/13 11:18:02 AM] Jay Janssen: LAN, 2k tps single threaded = about .5ms commit latency
[4/15/13 11:18:06 AM] Jay Janssen: which is probably about right
[4/15/13 11:20:25 AM] Seppo Jaakola / Codership: wsrep_mysql_replication_bundle is experimental still, the open issue is that channel must be manually flushed by idle transactions
[4/15/13 11:21:18 AM] Jay Janssen: those must come from the master?
[4/15/13 11:21:58 AM] Seppo Jaakola / Codership: yes
[4/15/13 11:22:02 AM] Jay Janssen: oy
[4/15/13 11:22:13 AM] Jay Janssen: which is kinda tricky if we're way behind the master
[4/15/13 11:24:06 AM] Jay Janssen: Seppo: what do you typically use for a void trx?
[4/15/13 11:26:12 AM] Seppo Jaakola / Codership: I never needed to use such... This feature was implemented for one customer, and the idea is to complete the implementation for next release, so that uncommitted bundles will never remain
[4/15/13 11:26:44 AM] Seppo Jaakola / Codership: any activity, which will create mysql replication events will do
[4/15/13 11:27:09 AM] Jay Janssen: this only applies to writesets coming in via async replication, right?
[4/15/13 11:27:38 AM] Seppo Jaakola / Codership: yes, but they are not write sets, but replication events
[4/15/13 11:28:32 AM] Seppo Jaakola / Codership: slave thread in G node will wait until the n'th replication event, and then will commit one bundle. This commit will create one huge write set for G replication
[4/15/13 11:28:54 AM] Jay Janssen: what happens if I SLAVE STOP -- will the bundle go away?
[4/15/13 11:29:33 AM] Seppo Jaakola / Codership: donno, probably the pending transactions will vanish
[4/15/13 11:30:16 AM] Seppo Jaakola / Codership: with good luck, they will be committed, which would solve this issue...
[4/15/13 11:30:40 AM] Jay Janssen: hmm, so with RBR perhaps I can just rollback slave to exec_master_log_pos - <#val of wsrep_mysql_replication_bundle> to be safe
[4/15/13 11:32:11 AM] Seppo Jaakola / Codership: this SLAVE STOP behavior is easy to test. Just send one replication event and stop slave, and check if G nodes see anything about the event
[4/15/13 11:32:26 AM] Jay Janssen: not easy to test on a live system
[4/15/13 11:32:39 AM] Seppo Jaakola / Codership: indeed
[4/15/13 11:32:44 AM] Jay Janssen: but here we are
[4/15/13 11:46:10 AM] Seppo Jaakola / Codership: quick experiment shows that stop slave will not cause pending replication events to be committed
[4/15/13 11:46:46 AM] Seppo Jaakola / Codership: however, they are not lost either, when slave is started and more events are replicated, all pending events will be committed
[4/15/13 11:47:53 AM] Seppo Jaakola / Codership: DDL handling has a strange bug: any DDL coming from the replication stream will commit all pending events and disable replication event bundling from then on. This condition will hold until slave is stopped and started again
[4/15/13 11:48:48 AM] Seppo Jaakola / Codership: so, you can flush the replication bundle by issuing e.g. "create database dummy" in the master
[4/15/13 11:51:46 AM] Jay Janssen: thanks, that is helpful
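
The bundling behavior described in the chat above can be summarized in a toy model. This is only a sketch of the reported semantics, not the server code: the class name BundlingSlave and all of its fields are hypothetical, bundle_size stands in for wsrep_mysql_replication_bundle, and committing the DDL as its own write set is an assumption (the chat only says that DDL commits everything pending and turns bundling off until the slave is restarted).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BundlingSlave:
    """Toy model of an async slave thread that bundles replication events."""

    bundle_size: int                      # stands in for wsrep_mysql_replication_bundle
    running: bool = True
    bundling_enabled: bool = True         # switched off by DDL until the next restart
    pending: List[str] = field(default_factory=list)
    committed: List[List[str]] = field(default_factory=list)  # one inner list per write set

    def _commit_pending(self) -> None:
        # Turn everything buffered so far into a single write set.
        if self.pending:
            self.committed.append(self.pending)
            self.pending = []

    def apply(self, event: str, is_ddl: bool = False) -> None:
        if not self.running:
            raise RuntimeError("slave is stopped")
        if is_ddl:
            # Reported quirk: DDL flushes the bundle and disables bundling.
            # Assumption: the DDL itself goes out as a separate write set.
            self._commit_pending()
            self.committed.append([event])
            self.bundling_enabled = False
            return
        self.pending.append(event)
        # Commit on the n'th event, or immediately if bundling is disabled.
        if not self.bundling_enabled or len(self.pending) >= self.bundle_size:
            self._commit_pending()

    def stop(self) -> None:
        # Reported behavior: pending events are neither committed nor lost.
        self.running = False

    def start(self) -> None:
        # Restarting the slave re-enables bundling after a DDL disabled it.
        self.running = True
        self.bundling_enabled = True
```

With bundle_size=3, applying two events and then calling stop() leaves committed empty and both events in pending; after start(), a third event commits all three as one write set. A DDL such as "create database dummy" flushes whatever is pending, which is the manual flush suggested in the chat.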

Tags: i60821
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

So there are two other issues (from bugs I have marked as duplicates):

 SQL_SLAVE_SKIP_COUNTER=1 sticks permanently with wsrep_mysql_replication_bundle set

 PXC node as async slave has issues updating exec_master_log_pos and relay_master_log_file

All of them can be traced to wsrep_mysql_replication_bundle being set.

Raghavendra D Prabhu (raghavendra-prabhu) wrote :

https://gist.github.com/ronin13/3e944f259e7be2bb4d32

might fix it.

The fix takes into account

a) Slave SQL Skip counter

It was failing with SQL_SLAVE_SKIP_COUNTER because, in apply_event_and_update_pos, the WSREP code was overwriting the skip reason with Log_event::EVENT_SKIP_IGNORE even when it was Log_event::EVENT_SKIP_COUNT.
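
To illustrate the kind of guard this implies, here is a minimal sketch in Python rather than the server's C++. The enum member names mirror Log_event's skip reasons; the functions and the wsrep_skip flag are hypothetical stand-ins for whatever condition the WSREP branch actually checks, and the linked gist remains the reference for the real change.

```python
from enum import Enum


class SkipReason(Enum):
    # Member names mirror the skip reasons in MySQL's Log_event class.
    EVENT_SKIP_NOT = 0
    EVENT_SKIP_IGNORE = 1
    EVENT_SKIP_COUNT = 2


def reason_before_fix(reason: SkipReason, wsrep_skip: bool) -> SkipReason:
    """Reported behavior: the WSREP branch overwrites any reason with IGNORE."""
    # wsrep_skip is a hypothetical flag for "the WSREP code wants to skip".
    return SkipReason.EVENT_SKIP_IGNORE if wsrep_skip else reason


def reason_after_fix(reason: SkipReason, wsrep_skip: bool) -> SkipReason:
    """Guarded variant: a counter-based skip is left untouched."""
    if wsrep_skip and reason is not SkipReason.EVENT_SKIP_COUNT:
        return SkipReason.EVENT_SKIP_IGNORE
    return reason
```

If the skip counter is only decremented when the reason is EVENT_SKIP_COUNT, as I believe is the case upstream, then the overwrite in the first function would explain why SQL_SLAVE_SKIP_COUNTER=1 never goes back to 0.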

Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Keeping the "SQL_SLAVE_SKIP_COUNTER=1 sticks permanently with wsrep_mysql_replication_bundle set" issue separate. Unduplicating it.

Przemek (pmalkowski) wrote :

I can confirm another problem: when GTID is used, a slave with wsrep_mysql_replication_bundle > 0 fails with this error:

2015-10-19 12:10:16 20849 [ERROR] Slave SQL: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is '12446bf7-3219-11e5-9434-080027079e3d:3707'. Error_code: 1837
2015-10-19 12:10:16 20849 [Warning] Slave: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is '12446bf7-3219-11e5-9434-080027079e3d:3707'. Error_code: 1837
2015-10-19 12:10:16 20849 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'binlog.000013' position 10657452

I have used this setting:

percona1 mysql> show variables like 'wsrep_mysql%';
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| wsrep_mysql_replication_bundle | 50    |
+--------------------------------+-------+
1 row in set (0.01 sec)

percona1 mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version          | @@version_comment                                                                           |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And a normal sysbench test:
[root@db1 ~]# sysbench --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=test1 --mysql-user=root --oltp-table-size=6000 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-tables-count=4 run
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
...

Przemek (pmalkowski) wrote :

Btw, in my test case the async master was also running 5.6.26 and binlog_format was ROW.

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-956
