Semi-sync replication performance degrades with high number of threads
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL Server | Unknown | Unknown | |
Percona Server moved to https://jira.percona.com/projects/PS | Status tracked in 5.7 | | |
5.5 | Invalid | Undecided | Unassigned |
5.6 | Fix Released | Medium | Unassigned |
5.7 | Triaged | Medium | Unassigned |
Bug Description
* Taken verbatim from the MySQL bug report filed by Rene Cannao
Description:
Based on our experience (issues in production, easily reproducible in a testing environment), when semi-sync replication is enabled mysqld is able to handle around 6k TRX/s when the number of threads running is relatively low (100-200 connections), but throughput degrades when the number of threads running goes beyond a certain threshold. With 3000 threads running, throughput is no more than 300 TRX/s.
While at a low number of threads running the bottleneck seems to be network RTT, at a high number of threads network activity drops, and we also notice a constantly high number of context switches.
Combining the output of pt-pmp (I think the relevant part is what is listed below) with the semi-sync source code, we believe there is high contention on LOCK_binlog_ when comparing the binlog coordinates of the ACK from the slave(s) with the binlog coordinates of each thread that issued a commit, together with what seems to be an inefficient way of waking up threads.
2459 pthread_
541 pthread_
...
1 __lll_lock_
Each transaction thread waiting for an ACK does the following (simplified):
- hold a lock;
- in a loop ( while (is_on()) ):
  - compare the ACK position with its own position;
  - if the ACK is still behind, wait on a condition variable.
In reportReplyBinlog, if there is at least one thread waiting for an ACK, it sends a broadcast to all the threads.
This also creates contention on LOCK_log in MYSQL_BIN_
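The wait loop and the broadcast described above can be sketched as follows. This is only a toy model in Python, not the server's code: the class `AckTracker`, its method names, and the integer "positions" standing in for binlog coordinates are all hypothetical.

```python
import threading


class AckTracker:
    """Toy model of the ACK wait loop; all names here are hypothetical."""

    def __init__(self):
        self._cond = threading.Condition()  # one lock plus one condition variable
        self._acked_pos = 0                 # highest position acknowledged so far
        self._waiters = 0                   # threads currently blocked in wait()
        self._enabled = True                # stands in for is_on()
        self.wakeups = 0                    # how many times a waiter was woken

    def wait_for_ack(self, my_pos):
        with self._cond:                          # hold the lock
            while self._enabled:                  # while (is_on())
                if self._acked_pos >= my_pos:     # compare ACK position with own
                    return True
                self._waiters += 1
                self._cond.wait()                 # ACK still behind: sleep
                self._waiters -= 1
                self.wakeups += 1
            return False

    def report_reply(self, acked_pos):
        with self._cond:
            self._acked_pos = max(self._acked_pos, acked_pos)
            if self._waiters > 0:                 # at least one thread is waiting
                self._cond.notify_all()           # broadcast to every waiter


def run_demo(n_threads):
    """Start n_threads committers at positions 1..n, then ACK them one by one."""
    tracker = AckTracker()
    results = [None] * n_threads

    def worker(i):
        results[i] = tracker.wait_for_ack(i + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for pos in range(1, n_threads + 1):
        tracker.report_reply(pos)
    for t in threads:
        t.join()
    return results
```

Calling `run_demo(3)` returns once every waiter has seen its position acknowledged. The `wakeups` counter depends on thread scheduling, so it varies between runs.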
What seems to be a design flaw is that all the threads wake up. With few threads running it is likely that they will all return to the application, but with a lot of threads running perhaps only a few will return to the application and most of them will go back to waiting on the same condition variable. This causes a lot of context switches, and the CPU gets so busy performing them that replication makes little progress.
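To get a feel for how quickly the wasted wake-ups grow, here is a rough counting model. It assumes a worst case that the report does not state: a fixed batch of waiters at distinct positions, one ACK per position, and every ACK waking every thread that is still waiting. The function name is illustrative.

```python
def broadcast_wakeups(positions, acks):
    """Count (useful, wasted) wake-ups when each ACK wakes every waiter.

    A wake-up is useful if the thread's position is covered by the ACK,
    and wasted if the thread has to go back to sleep.
    """
    waiting = sorted(positions)
    useful = wasted = 0
    for ack in acks:
        if not waiting:
            break                      # nobody waiting, so no broadcast
        released = [p for p in waiting if p <= ack]
        waiting = [p for p in waiting if p > ack]
        useful += len(released)        # these threads return to the application
        wasted += len(waiting)         # these threads wake up and sleep again
    return useful, wasted


for n in (100, 3000):
    useful, wasted = broadcast_wakeups(range(1, n + 1), range(1, n + 1))
    print(f"{n} waiters: {useful} useful wake-ups, {wasted} wasted wake-ups")
```

Under these assumptions the wasted wake-ups for n waiters come to n(n-1)/2, so they grow quadratically while the useful ones grow linearly.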
Attached is the output of pt-pmp resulting from the following command:
pt-pmp --iterations=1 --save-
When this was executed, mysqld was processing the traffic generated by sysbench in a write intensive workload running 3000 threads.
How to repeat:
Setup semi-sync replication.
Run a write intensive workload against the master with few threads. Ex:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-
Run a write intensive workload against the master with a lot of threads. Ex:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-
As a comparison, run the above workload without semi-sync enabled. Run this to verify that the bottleneck is semi-sync.
Suggested fix:
A few spin loops before suspending the thread with pthread_
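The idea of spinning briefly before blocking can be sketched as below. This is an illustration only, with hypothetical names (`SpinThenWait`, `spins`) and integer positions in place of binlog coordinates. The unlocked read in the spin phase relies on CPython's global interpreter lock; in C or C++ it would need an atomic variable. Spinning in Python also gives no real speed-up, so the sketch only shows the control flow.

```python
import threading


class SpinThenWait:
    """Check the ACK position a bounded number of times, then block."""

    def __init__(self, spins=100):
        self._cond = threading.Condition()
        self._acked_pos = 0
        self._spins = spins            # hypothetical tuning knob

    def wait_for_ack(self, my_pos):
        # Spin phase: poll without taking the lock or sleeping.
        for _ in range(self._spins):
            if self._acked_pos >= my_pos:
                return "spin"
        # Fallback: block on the condition variable as before.
        with self._cond:
            while self._acked_pos < my_pos:
                self._cond.wait()
        return "blocked"

    def report_reply(self, acked_pos):
        with self._cond:
            self._acked_pos = max(self._acked_pos, acked_pos)
            self._cond.notify_all()
```

A thread whose ACK arrives during the spin phase returns `"spin"` and never sleeps. Any other thread falls back to the blocking path and returns `"blocked"`.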
tags: | added: upstream |
Verified with PS 5.6.
---------- With --num-threads=30 and normal replication
nilnandan@desktop:~$ sysbench --max-requests=0 --max-time=900 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox21087.sock --mysql-db=dbtest --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=30 run
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 30
Random number generator seed is 0 and will be ignored
Threads started!
OLTP test statistics:
..
transactions: 1980700 (2200.77 per sec.)
read/write requests: 7922800 (8803.08 per sec.)
other operations: 3961400 (4401.54 per sec.)
...
nilnandan@desktop:~$
---------- With --num-threads=30 and semi-sync replication
nilnandan@desktop:~$ sysbench --max-requests=0 --max-time=900 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox21087.sock --mysql-db=dbtest --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=30 run
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 30
Random number generator seed is 0 and will be ignored
Threads started!
OLTP test statistics:
...
transactions: 1463881 (1626.51 per sec.)
read/write requests: 5855524 (6506.03 per sec.)
other operations: 2927762 (3253.02 per sec.)
...
nilnandan@desktop:~$
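For the two 30-thread runs above, the cost of semi-sync can be expressed as a relative drop in transactions per second. A minimal sketch of that arithmetic, using only the two figures reported above (the function name is illustrative):

```python
def relative_drop(baseline_tps, measured_tps):
    """Return the throughput lost relative to the baseline, as a percentage."""
    return (baseline_tps - measured_tps) / baseline_tps * 100.0


normal_tps = 2200.77     # 30 threads, normal replication
semi_sync_tps = 1626.51  # 30 threads, semi-sync replication
drop = relative_drop(normal_tps, semi_sync_tps)
print(f"{drop:.1f}% fewer transactions per second with semi-sync")
```

With these figures the drop comes out at roughly 26%.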
---------- With --num-threads=300 and normal replication
nilnandan@desktop:~$ sysbench --max-requests=0 --max-time=900 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox21087.sock --mysql-db=dbtest --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=300 run
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 300
Random number generator seed is 0 and will be ignored
Threads started!
OLTP test statistics:
...
transactions: 2425613 (2694.89 per sec.)
read/write requests: 9702452 (10779.54 per sec.)
other operations: 4851226 (5389.77 per sec.)
..
---------- With --num-threads=300 and semi-sync replication
nilnandan@desktop:~$ sysbench --max-requests=0 --max-time=900 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox21087.sock --mysql-db=dbtest --db-driver=mysql --db-ps-mode=disable --oltp-...