MSR replication - Galera clash

Bug #1676464 reported by Przemek on 2017-03-27
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
  Status: New, Importance: Undecided, Assigned to: Unassigned
Affects: Percona XtraDB Cluster 5.7
  Status: Fix Committed, Importance: Undecided, Assigned to: Unassigned

Bug Description

Setting up multi-source replication on a PXC node results in a permanent 'System lock' state for the second channel's SQL thread.

Test case:
* set up 2 standalone MySQL 5.7 instances
* set up 1 standalone PXC 5.7 node, with the wsrep provider enabled
* set up replication from both standalone MySQL instances, using two channels, to the PXC node
* restart the slave PXC node
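The two channels in the last step can be configured with statements along these lines (host names and credentials are placeholders, not taken from the report):

```sql
-- On the PXC node; source hosts and the repl user are hypothetical
CHANGE MASTER TO MASTER_HOST='mysql1', MASTER_USER='repl',
  MASTER_PASSWORD='replpass', MASTER_AUTO_POSITION=1
  FOR CHANNEL 'c3-c1';
CHANGE MASTER TO MASTER_HOST='mysql2', MASTER_USER='repl',
  MASTER_PASSWORD='replpass', MASTER_AUTO_POSITION=1
  FOR CHANNEL 'c3-c2';
START SLAVE FOR CHANNEL 'c3-c1';
START SLAVE FOR CHANNEL 'c3-c2';
```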

The slave node gets permanently blocked: it cannot be stopped gracefully, nor can the locked SQL thread be killed.

Example results:
mysql> pager egrep "Running|Channel|Err"
PAGER set to 'egrep "Running|Channel|Err"'
mysql> show slave status\G
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
                   Last_Error:
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
                 Channel_Name: c3-c1
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
                   Last_Error:
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
      Slave_SQL_Running_State: System lock
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
                 Channel_Name: c3-c2
2 rows in set (0.00 sec)

mysql> show processlist;
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+-----------+---------------+
| 1 | system user | | NULL | Sleep | 617 | wsrep: applier idle | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 617 | wsrep: aborter idle | NULL | 0 | 0 |
| 3 | system user | | NULL | Connect | 616 | Waiting for master to send event | NULL | 0 | 0 |
| 4 | system user | | NULL | Connect | 616 | Slave has read all relay log; waiting for more updates | NULL | 0 | 0 |
| 5 | system user | | NULL | Connect | 616 | Waiting for master to send event | NULL | 0 | 0 |
| 6 | system user | | NULL | Connect | 616 | System lock | NULL | 0 | 0 |
| 9 | root | localhost | NULL | Query | 0 | starting | show processlist | 0 | 0 |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+-----------+---------------+
7 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.replication_connection_status\G
*************************** 1. row ***************************
             CHANNEL_NAME: c3-c1
               GROUP_NAME:
              SOURCE_UUID: f257ee64-0e06-11e7-b890-0242ac110002
                THREAD_ID: 55
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 12
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-27 14:18:26
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: c3-c2
               GROUP_NAME:
              SOURCE_UUID: e893fd74-1090-11e7-9df2-0242ac150002
                THREAD_ID: 57
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 12
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-27 14:18:26
 RECEIVED_TRANSACTION_SET: 125ad4e4-ef6f-ee18-6575-649192f28012:1-2
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.threads where name like '%slave_sql'\G
*************************** 1. row ***************************
          THREAD_ID: 56
               NAME: thread/sql/slave_sql
               TYPE: FOREGROUND
     PROCESSLIST_ID: 4
   PROCESSLIST_USER: NULL
   PROCESSLIST_HOST: NULL
     PROCESSLIST_DB: NULL
PROCESSLIST_COMMAND: Connect
   PROCESSLIST_TIME: 589
  PROCESSLIST_STATE: Slave has read all relay log; waiting for more updates
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: 1
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: NULL
       THREAD_OS_ID: 16813
*************************** 2. row ***************************
          THREAD_ID: 58
               NAME: thread/sql/slave_sql
               TYPE: FOREGROUND
     PROCESSLIST_ID: 6
   PROCESSLIST_USER: NULL
   PROCESSLIST_HOST: NULL
     PROCESSLIST_DB: NULL
PROCESSLIST_COMMAND: Connect
   PROCESSLIST_TIME: 589
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: 1
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: NULL
       THREAD_OS_ID: 16815
2 rows in set (0.00 sec)

mysql> show status like 'ws%';
...
| wsrep_local_state_comment | Synced |
...
| wsrep_cluster_status | Primary |
...
| wsrep_provider_version | 3.20(r7e383f7) |
| wsrep_ready | ON |
+------------------------------+--------------------------------------+
60 rows in set (0.00 sec)

mysql> stop slave for channel 'c3-c1';
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave for channel 'c3-c2';
... hangs

Kenn Takara (kenn-takara) wrote :

Tried to reproduce this on 5.7.16 and it looks OK. Are the replication users on the channels the same or different?

Kenn Takara (kenn-takara) wrote :

I've noticed a possible workaround:
(1) start the node with 'skip-slave-start' in the config
(2) run "START SLAVE"
This appears to start the slave threads with no issues.
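Concretely, the workaround amounts to something like the following (file path and section are the usual MySQL defaults, assumed here):

```
# /etc/my.cnf on the PXC node
[mysqld]
skip-slave-start
```

After restarting mysqld with this option, run START SLAVE (or START SLAVE FOR CHANNEL '...' per channel) manually once the node reports Synced.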

Krunal Bauskar (krunal-bauskar) wrote :

Przemek,

Can you check Kenn's comments?

Kenn Takara (kenn-takara) wrote :

This is due to a bug in the startup code. The two slave SQL threads wait on a condition variable for wsrep_ready. However, when wsrep_ready changes, the code uses mysql_cond_signal(), which wakes up a SINGLE thread. So one slave thread is signalled and continues, while the other remains waiting on the condition variable, which will never be signalled again.

The solution is to use mysql_cond_broadcast() to wake up all threads waiting on the wsrep_ready condition variable.
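The signal-vs-broadcast distinction can be sketched with Python's threading.Condition as a stand-in for mysql_cond_signal()/mysql_cond_broadcast(); the names and channel labels below are illustrative, not PXC source:

```python
import threading
import time

wsrep_ready = False
cond = threading.Condition()
woken = []  # records which "slave SQL threads" got past the wait

def slave_sql_thread(channel):
    # Each channel's applier blocks until wsrep_ready becomes true.
    with cond:
        while not wsrep_ready:
            cond.wait()
        woken.append(channel)

t1 = threading.Thread(target=slave_sql_thread, args=("c3-c1",))
t2 = threading.Thread(target=slave_sql_thread, args=("c3-c2",))
t1.start()
t2.start()
time.sleep(0.5)  # let both threads block on the condition variable

with cond:
    wsrep_ready = True
    # Buggy behaviour: cond.notify() (mysql_cond_signal) wakes only ONE
    # waiter, leaving the other channel stuck in "System lock" forever.
    # The fix (mysql_cond_broadcast) corresponds to notify_all():
    cond.notify_all()

t1.join()
t2.join()
print(sorted(woken))  # → ['c3-c1', 'c3-c2']: both channels proceed
```

With notify() instead of notify_all(), one of the two join() calls would never return, mirroring the hung second channel in the report.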

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-826
