Semi-sync seems broken because of group commit in Percona Server 5.5
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Percona Server moved to https://jira.percona.com/projects/PS | Invalid | Undecided | Unassigned | |
| 5.1 | Invalid | Undecided | Unassigned | |
| 5.5 | Triaged | Medium | Unassigned | |
| 5.6 | Invalid | Undecided | Unassigned | |
Bug Description
In function MYSQL_BIN_
quoted code in log.cc:
if (RUN_HOOK(
    (leader->thd, log_file_name, log_file.
{
After the leader thread has committed all transactions, the other threads are woken up and after_commit is called.
quoted code:
int Trans_delegate:
{
  Trans_param param;
  bool is_real_trans= (all || thd->transactio

  param.flags = is_real_trans ? TRANS_IS_REAL_TRANS : 0;

  Trans_binlog_info *log_info=
    my_pthread_

  param.log_file= log_info ? log_info->log_file : 0;
  param.log_pos= log_info ? log_info->log_pos : 0;

  int ret= 0;
  FOREACH_
Here log_info is NULL if the current thread is not the leader of the group commit, so param.log_file and param.log_pos are both 0 for the follower threads and they never wait for the ACK from the slave.
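To illustrate the effect described above, here is a minimal single-threaded sketch of the situation. It is a model, not server code: `BinlogInfo`, `Session`, `would_wait_for_ack` and `count_waiters` are hypothetical names, and the per-thread slot that after_commit reads is modelled as a plain pointer on each session. The `publish_to_followers` flag shows one possible direction for a fix (giving every thread in the group a position) and is an assumption, not necessarily what the attached patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for Trans_binlog_info: the flushed binlog position.
struct BinlogInfo {
    std::string log_file;
    std::uint64_t log_pos;
};

// Hypothetical stand-in for a session. The pointer models the
// thread-specific slot that after_commit reads; it starts out empty.
struct Session {
    const BinlogInfo* binlog_info = nullptr;
};

// Mirrors the "log_info ? ... : 0" pattern from the quoted code: a thread
// with no recorded position has nothing to wait for.
bool would_wait_for_ack(const Session& s) {
    const char* log_file = s.binlog_info ? s.binlog_info->log_file.c_str() : nullptr;
    std::uint64_t log_pos = s.binlog_info ? s.binlog_info->log_pos : 0;
    return log_file != nullptr && log_pos != 0;
}

// Simulates one group commit where queue[0] is the leader. Only the leader
// gets a position unless publish_to_followers is set (assumed fix direction).
// Returns how many threads would wait for the slave ACK.
int count_waiters(std::vector<Session>& queue, const BinlogInfo& flushed,
                  bool publish_to_followers) {
    assert(!queue.empty());
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (i == 0 || publish_to_followers) {
            queue[i].binlog_info = &flushed;
        }
    }
    int waiters = 0;
    for (const Session& s : queue) {
        waiters += would_wait_for_ack(s) ? 1 : 0;
    }
    return waiters;
}
```

With a queue of three sessions, only the leader waits when the position is recorded for the leader alone; all three wait once the position is made visible to every member of the group.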
zhai weixiang (zhaiwx1987) wrote : | #1 |
tags: | added: contribution |
zhai weixiang (zhaiwx1987) wrote : | #2 |
I found another bug: after_flush should be called before signal_update. Otherwise the binlog dump thread may be woken up first and fail to mark the packet header (the mark that tells the slave the master needs an ACK).
quoted code:(trx_
else
{
  signal_update();
}

if (RUN_HOOK(
    (leader->thd, log_file_name, log_file.
{
  sql_print_
  for (current= queue; current != NULL; current= current->next)
  {
    if (!current->error)
    {
So the attached file does not fix this bug completely. I'll rewrite the patch based on the latest Percona Server, 5.5.34.
zhai weixiang (zhaiwx1987) wrote : | #3 |
patch based on Percona Server 5.5.34
zhai weixiang (zhaiwx1987) wrote : | #4 |
Sorry, comment #2 is incorrect.
It's OK to call after_flush after signal_update because LOCK_log is held by the leader thread.
(It was a problem for me only because I have split LOCK_log into separate locks for binlog write and dump.)
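The locking argument in this comment can be sketched roughly as follows. This is a simplified model under assumptions, not the server code: `lock_log`, `update_cond`, `updated` and `needs_ack` are hypothetical stand-ins for LOCK_log, the update condition, the new-binlog-data state, and whatever after_flush records for the dump thread. The point is only that a waiter woken by the signal cannot proceed until the leader releases the mutex, by which time the hook has already run.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns what the modelled dump thread observes for needs_ack once it
// wakes up. The leader signals BEFORE running the hook, as in the quoted
// code, but does both while holding the same mutex.
bool dump_thread_sees_ack_flag() {
    std::mutex lock_log;                   // models LOCK_log
    std::condition_variable update_cond;   // models the update signal
    bool updated = false;                  // "new binlog data is available"
    bool needs_ack = false;                // state recorded by the hook
    bool seen = false;

    std::thread dump([&] {
        std::unique_lock<std::mutex> guard(lock_log);
        update_cond.wait(guard, [&] { return updated; });
        // The wait can only return after re-acquiring lock_log, which the
        // leader still holds until its critical section ends.
        seen = needs_ack;
    });

    {
        std::lock_guard<std::mutex> guard(lock_log);
        updated = true;
        update_cond.notify_all();   // models signal_update()
        needs_ack = true;           // models the after_flush hook
    }

    dump.join();
    return seen;
}

// Repeats the experiment to exercise different thread interleavings.
bool holds_for(int rounds) {
    assert(rounds > 0);
    for (int i = 0; i < rounds; ++i) {
        if (!dump_thread_sees_ack_flag()) return false;
    }
    return true;
}
```

If the signal and the hook were protected by different locks, as in the split-lock setup mentioned above, this guarantee would no longer hold and the ordering would matter.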
The patch is missing an MTR test case, but it hardly makes sense to keep this report in "New" status. Let the developers check and (hopefully) use this contribution.
Zhai -
Have you checked MariaDB 5.5 where our group commit implementation is taken from? Are they affected? Have they fixed this?
zhai weixiang (zhaiwx1987) wrote : | #7 |
I haven't checked MariaDB carefully; I only read the related code quickly. I guess MariaDB is affected too. I hope someone from MariaDB can help me verify it.
Below is a simple test case.
source include/
source include/
source include/
source include/
#Install semisync plugin on slave
connection slave;
--replace_regex /\.dll/.so/
eval INSTALL PLUGIN rpl_semi_sync_slave SONAME '$SEMISYNC_
select @@global.
show variables like '%semi%';
SET GLOBAL rpl_semi_
STOP SLAVE;
START SLAVE;
#Install semisync plugin on master
connection master;
--replace_regex /\.dll/.so/
eval INSTALL PLUGIN rpl_semi_
SET GLOBAL rpl_semi_
SET GLOBAL rpl_semi_
--sleep 2
CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=innodb;
connect(
connect(
connection master;
SET GLOBAL rpl_semi_
connection slave;
#Stop the slave; all transactions on the master will then wait for the slave to restart
STOP SLAVE;
connection con1;
#leader of group commit
SET DEBUG_SYNC= "commit_
con2_queued";
send INSERT INTO t1 VALUES (1);
connection con2;
#follower of group commit
SET DEBUG_SYNC= "now WAIT_FOR con1_waiting";
SET DEBUG_SYNC= "commit_
send INSERT INTO t1 VALUES (2);
connection master;
# we expect both con1 and con2 to be waiting for the ACK
--sleep 1
SHOW PROCESSLIST;
connection slave;
START SLAVE;
--sleep 1
connection master;
SHOW PROCESSLIST;
#cleanup
DROP TABLE t1;
UNINSTALL PLUGIN rpl_semi_
connection slave;
UNINSTALL PLUGIN rpl_semi_
--source include/rpl_end.inc
Shahriyar Rzayev (rzayev-sehriyar) wrote : | #8 |
Percona now uses JIRA for bug reports so this bug report is migrated to: https:/
A simple patch based on the logic of MySQL 5.6. It can be applied to Percona Server 5.5.18, which we are using now.