Regression in GTID consistency when parallel applying used for async replication

Bug #1681831 reported by Przemek
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned |
5.7 | Fix Released | Undecided | Krunal Bauskar |

Bug Description

After upgrading from PXC 5.7.16 to 5.7.17, running a parallel async slave leads to inconsistent GTIDs across the cluster peers. Events replicated by the async slave node are assigned the Galera cluster's UUID instead of the original async master's UUID. This only happens when slave_parallel_workers > 1.

The test case is simple: a two-node PXC cluster in which one of the nodes is a slave of a standalone master. Run a couple of updates (on different data sets) on the PXC nodes and on the async master.
Example tests:

node1> select @@version,@@version_comment\G
*************************** 1. row ***************************
        @@version: 5.7.16-10-57-log
@@version_comment: Percona XtraDB Cluster (GPL), Release rel10, Revision bec0879, WSREP version 27.19, wsrep_27.19
1 row in set (0.00 sec)

node1> show global variables like 'slave_par%';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| slave_parallel_type    | DATABASE |
| slave_parallel_workers | 5        |
+------------------------+----------+
2 rows in set (0.00 sec)

master> show master status\G
*************************** 1. row ***************************
             File: mysql-binlog.000001
         Position: 1399
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: e893fd74-1090-11e7-9df2-0242ac150002:1-5
1 row in set (0.00 sec)

node1> show global variables like 'gtid_executed'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-4,
e893fd74-1090-11e7-9df2-0242ac150002:1-5
1 row in set (0.00 sec)

node2> show global variables like 'gtid_executed'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-4,
e893fd74-1090-11e7-9df2-0242ac150002:1-5
1 row in set (0.00 sec)

node1> show status like 'wsrep_cluster_state_uuid';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | a275e154-1077-11e7-b76e-029d26a140ed |
+--------------------------+--------------------------------------+
1 row in set (0.01 sec)

-- same test in 5.7.17

master> show master status\G
*************************** 1. row ***************************
             File: mysql-binlog.000001
         Position: 901
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: e893fd74-1090-11e7-9df2-0242ac150002:1-3
1 row in set (0.00 sec)

node1> select @@version,@@version_comment\G
*************************** 1. row ***************************
        @@version: 5.7.17-11-57-log
@@version_comment: Percona XtraDB Cluster (GPL), Release rel11, Revision e2a7fdd, WSREP version 27.20, wsrep_27.20
1 row in set (0.00 sec)

node1> show global variables like 'gtid_executed'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-4,
e893fd74-1090-11e7-9df2-0242ac150002:1-3
1 row in set (0.00 sec)

node2> show global variables like 'gtid_executed'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-7
1 row in set (0.00 sec)
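
The divergence in the 5.7.17 output can also be checked mechanically. The following is a minimal Python sketch, not part of the original report, that parses the two gtid_executed values shown above and lists, per source UUID, the transactions one node has that the other lacks. The helper names parse_gtid_set and diff_gtid_sets are hypothetical, and the parser only handles the simple uuid:first-last form that appears in this report.

```python
def parse_gtid_set(text):
    """Parse a gtid_executed string into {uuid: set of transaction numbers}."""
    result = {}
    # Entries are comma-separated and may be wrapped across lines.
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            start = int(first)
            end = int(last) if last else start
            numbers.update(range(start, end + 1))
    return result


def diff_gtid_sets(a, b):
    """Return {uuid: sorted numbers} present in a but missing from b."""
    missing = {}
    for uuid, numbers in a.items():
        extra = numbers - b.get(uuid, set())
        if extra:
            missing[uuid] = sorted(extra)
    return missing


# Values copied from the 5.7.17 output in this report.
node1 = parse_gtid_set(
    "5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-4,\n"
    "e893fd74-1090-11e7-9df2-0242ac150002:1-3"
)
node2 = parse_gtid_set("5d8a1eab-ef88-ee18-4891-fd62d95ebf12:1-7")

only_on_node1 = diff_gtid_sets(node1, node2)
only_on_node2 = diff_gtid_sets(node2, node1)
print("only on node1:", only_on_node1)
print("only on node2:", only_on_node2)
```

With the values from this report, node1 holds the master's transactions 1-3 under the master UUID, while node2 instead holds three extra transactions (5-7) under the cluster-derived UUID. On a live server, MySQL's built-in GTID_SUBTRACT() function can perform the same comparison directly.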

Tags: i181114
Krunal Bauskar (krunal-bauskar) wrote :

commit 13a2ed432a23ed51203df9d158a0316ea0f427c6
Author: Krunal Bauskar <email address hidden>
Date: Wed May 3 11:01:21 2017 +0530

    - PXC#816: Regression in GTID consistency when parallel applying used for async replication

      - GTID event generated from MASTER is processed by SLAVE.
      - SLAVE can process this event using master-thread if slave-parallel-workers=0
        or assign it to a slave worker if slave-parallel-workers > 0.
      - Starting with 5.7, MySQL has stopped writing this event to bin-log but PXC needs
        this event for proper creation of write-set, so PXC has logic to cache this event.
      - Hook to cache this event was wrongly placed, so it was triggered only if
        GTID event is being processed by master-thread (that is, slave-parallel-workers=0).

      Fix:
      - Corrected hook such that the thread that is processing this event will
        cache the event.
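
The effect of the misplaced hook described in this commit message can be pictured with a toy model. The Python sketch below is purely illustrative and is not PXC source code: the function gtid_for_write_set, its parameters, and the placeholder string "cluster-uuid:next" are all hypothetical. It assumes that "master-thread" in the commit message means the main applier thread, and that a write-set without a cached source GTID ends up with a cluster-generated one, as the bug description reports.

```python
def gtid_for_write_set(source_gtid, parallel_workers, hook_fixed):
    """Return the GTID a toy write-set would carry for one replicated event."""
    # With workers == 0 the main applier thread handles the event itself;
    # otherwise the event is handed to a worker thread.
    applying_thread = "main" if parallel_workers == 0 else "worker"

    if hook_fixed:
        # Fixed behaviour: whichever thread applies the event caches it.
        cached = source_gtid
    else:
        # Regressed behaviour: the hook only fired on the main-thread path.
        cached = source_gtid if applying_thread == "main" else None

    # Without a cached source GTID the toy model falls back to a
    # cluster-generated GTID, mirroring the symptom in the bug description.
    return cached if cached is not None else "cluster-uuid:next"


master_gtid = "e893fd74-1090-11e7-9df2-0242ac150002:1"
for workers in (0, 5):
    for fixed in (False, True):
        print(workers, fixed, gtid_for_write_set(master_gtid, workers, fixed))
```

In this model only the combination of parallel workers and the unfixed hook loses the master's GTID, which matches the observation that node2 recorded the replicated events under the cluster UUID.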

Changed in percona-xtradb-cluster:
status: New → Fix Committed
Changed in percona-xtradb-cluster:
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-816
