PXC node as async slave has issues updating exec_master_log_pos and relay_master_log_file
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | |
5.5 | Invalid | Undecided | Unassigned |
5.6 | Invalid | Undecided | Unassigned |
Bug Description
This is the second PXC customer I've seen this on, so it is clearly a PXC/codership-mysql bug:
So, basically I have a 3 node cluster with 1 node acting as an async slave of some other master. In this case the master is 5.5.24 PS, and the PXC cluster is Percona-
The symptoms are:
- slave threads are clearly running
- replication traffic on cluster confirms this
- pt-heartbeat from async master is updating
However, these variables in SHOW SLAVE STATUS get stuck at some point and stop updating. This doesn't happen immediately after starting the slave:
mysql> select * from percona.heartbeat; show slave status\G select sleep( 10 ); select * from percona.heartbeat; show slave status\G
+------
| ts | server_id | file | position | relay_master_
+------
| 2013-04-
+------
1 row in set (0.00 sec)
*******
+-------------+
| sleep( 10 ) |
+-------------+
| 0 |
+-------------+
1 row in set (10.03 sec)
+------
| ts | server_id | file | position | relay_master_
+------
| 2013-04-
+------
1 row in set (0.00 sec)
*******
1 row in set (0.00 sec)
Clearly the heartbeat table (generated by pt-heartbeat on the master) is updating, but SHOW SLAVE STATUS is not for:
Relay_Master_
Exec_Master_Log_Pos
Relay_Log_File
Relay_Log_Pos
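The manual check above (sample the heartbeat and SHOW SLAVE STATUS, sleep, sample again) could be scripted roughly as follows. This is only a sketch: the helper names `parse_vertical_output` and `stuck_fields` are made up for illustration, and the caller is assumed to have captured the mysql client's vertical output and the heartbeat `ts` values as plain strings. Only the two position fields are compared, since the file-name fields can legitimately stay the same over a short interval.

```python
import re

# Position fields from the list above. The file-name fields are left out
# because they can stay unchanged between two close samples on a healthy slave.
POSITION_FIELDS = ("Exec_Master_Log_Pos", "Relay_Log_Pos")


def parse_vertical_output(text):
    """Parse vertical mysql client output ("Name: value" rows) into a dict."""
    status = {}
    for line in text.splitlines():
        # Rows look like "   Exec_Master_Log_Pos: 12345"; the row-separator
        # lines made of asterisks do not match and are skipped.
        match = re.match(r"^\s*(\w+):\s?(.*)$", line)
        if match:
            status[match.group(1)] = match.group(2).strip()
    return status


def stuck_fields(before, after, heartbeat_before, heartbeat_after):
    """Return position fields that did not change although the heartbeat did.

    An empty list is returned when the heartbeat itself did not advance,
    because then there is no evidence that the slave applied anything.
    """
    if heartbeat_before == heartbeat_after:
        return []
    return [
        name
        for name in POSITION_FIELDS
        if name in before and name in after and before[name] == after[name]
    ]
```

A non-empty result from `stuck_fields` would correspond to the symptom described here: the heartbeat row keeps advancing while the reported positions stay fixed.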
Additionally, I can see that while it's very unlikely that I'm on Relay_Log_File: tabasco-
I've seen this twice now on real production servers. Both times were cases where there was a big replication lag to catch up from.
This seems similar to https://bugs.launchpad.net/percona-server/+bug/860910, though that was fixed in PS 5.5.17.