XtraDB does not scale well on a 12-core system for a specific all-in-memory SELECT
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL Server | Unknown | Unknown | |
Percona Server (moved to https://jira.percona.com/projects/PS) | Fix Released | Medium | Unassigned |
5.1 | Won't Fix | Medium | Unassigned |
5.5 | Won't Fix | Medium | Unassigned |
5.6 | Fix Released | Medium | Unassigned |
Bug Description
For a query joining two tables on a long character primary key (char(32) in utf8, i.e. up to 96 bytes per key value), we see that on 12 cores it does not scale even to 10 threads. Tested with the recent 5.5.28-rel29.3:
CREATE TABLE `incident` (
`sys_id` char(32) NOT NULL DEFAULT '',
`category` varchar(40) DEFAULT NULL,
PRIMARY KEY (`sys_id`),
KEY `incident_category` (`category`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into incident(sys_id) values (rand()*1000000);
insert into incident(sys_id) select rand()*1000000 from incident; -- run 13 times, doubling the table to 2^13 = 8192 rows
CREATE TABLE `task` (
`sys_id` char(32) NOT NULL DEFAULT '',
`u_root_cause` char(32) DEFAULT NULL,
`u_business_service` char(32) DEFAULT NULL, -- column name truncated in the original; type assumed to match u_root_cause
`u_business_duration` char(32) DEFAULT NULL, -- column name truncated in the original; type assumed to match u_root_cause
PRIMARY KEY (`sys_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into task(sys_id) values (rand()*1000000); -- tail truncated in the original; assumed to mirror the incident load
insert into task(sys_id) select rand()*1000000 from task; -- run 13 times, doubling the table to 2^13 = 8192 rows
insert into task(sys_id) select sys_id from incident limit 100;
update incident set category = left(sys_id, 2); -- right-hand side truncated in the original; hypothetical value that yields a handful of categories
show table status like 'task'\G
show table status like 'incident'\G
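Assuming each doubling insert above was run exactly 13 times and no duplicate keys were generated, incident should end up with 2^13 = 8192 rows and task with 8192 + 100 = 8292 (the SHOW TABLE STATUS calls above presumably confirm this):
select count(*) from incident; -- expected: 8192
select count(*) from task;     -- expected: 8292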
The problematic query is:
select count(*), category from task inner join incident on task.sys_id = incident.sys_id group by category; -- tail truncated in the original; reconstructed from the description
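Given the schema, the expected plan (not captured in the report) is a full scan of task with an eq_ref primary-key lookup into incident for every row, so all threads hammer the same hot set of index pages; this can be verified with:
explain select count(*), category from task inner join incident on task.sys_id = incident.sys_id group by category\G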
Here is what we get from mysqlslap (the server was started with --no-defaults in this case, but all data fit well into memory, and further tuning does NOT seem to help it scale better):
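The shell prompts below are truncated, so the exact mysqlslap command line is lost; a plausible invocation for the first run (schema name, user, and iteration count are assumptions; --only-print can be used to preview what it would execute) would be:
mysqlslap -uroot --create-schema=test --iterations=3 --concurrency=1 \
  --number-of-queries=1000 \
  --query="select count(*), category from task inner join incident on task.sys_id = incident.sys_id group by category"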
[valerii.
Benchmark
Average number of seconds to run all queries: 35.766 seconds
Minimum number of seconds to run all queries: 35.488 seconds
Maximum number of seconds to run all queries: 35.828 seconds
Number of clients running queries: 1
Average number of queries per client: 1000
[valerii.
Benchmark
Average number of seconds to run all queries: 22.655 seconds
Minimum number of seconds to run all queries: 20.245 seconds
Maximum number of seconds to run all queries: 23.943 seconds
Number of clients running queries: 2
Average number of queries per client: 500
[valerii.
Benchmark
Average number of seconds to run all queries: 12.668 seconds
Minimum number of seconds to run all queries: 10.320 seconds
Maximum number of seconds to run all queries: 14.053 seconds
Number of clients running queries: 4
Average number of queries per client: 250
[valerii.
Benchmark
Average number of seconds to run all queries: 9.055 seconds
Minimum number of seconds to run all queries: 8.533 seconds
Maximum number of seconds to run all queries: 9.676 seconds
Number of clients running queries: 8
Average number of queries per client: 125
[valerii.
Benchmark
Average number of seconds to run all queries: 9.934 seconds
Minimum number of seconds to run all queries: 9.503 seconds
Maximum number of seconds to run all queries: 10.310 seconds
Number of clients running queries: 10
Average number of queries per client: 100
Here the total time to run 1000 SELECTs already went up.
[valerii.
Benchmark
Average number of seconds to run all queries: 11.015 seconds
Minimum number of seconds to run all queries: 10.520 seconds
Maximum number of seconds to run all queries: 11.889 seconds
Number of clients running queries: 12
Average number of queries per client: 83
At this point we see the following waits:
mysql> show engine innodb mutex;
+--------+---------------------------+-----------------+
| Type   | Name                      | Status          |
+--------+---------------------------+-----------------+
| InnoDB | &log_sys->mutex           | os_waits=1      |
| InnoDB | &kernel_mutex             | os_waits=1      |
| InnoDB | combined &block->mutex    | os_waits=118933 |
| InnoDB | &dict_operation_lock      | os_waits=…      |
| InnoDB | &log_sys->checkpoint_lock | os_waits=…      |
+--------+---------------------------+-----------------+
5 rows in set (0.00 sec)
So in the case of Percona Server/XtraDB it seems to be all about the combined &block->mutex.
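One way to cross-check which mutexes dominate, assuming performance_schema is compiled in and enabled (it is off by default in 5.5), is the global wait summary; the LIKE pattern below is the generic InnoDB mutex prefix, not something captured in this report:
select event_name, count_star, sum_timer_wait
from performance_schema.events_waits_summary_global_by_event_name
where event_name like 'wait/synch/mutex/innodb/%'
order by sum_timer_wait desc
limit 10;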
As expected, with more threads throughput degrades further:
[valerii.
Benchmark
Average number of seconds to run all queries: 14.353 seconds
Minimum number of seconds to run all queries: 13.935 seconds
Maximum number of seconds to run all queries: 14.467 seconds
Number of clients running queries: 24
Average number of queries per client: 41
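Converting the averages above into throughput (1000 queries divided by the average seconds to run them all) makes the plateau obvious:

Clients | Avg seconds | Throughput (queries/s)
---|---|---
1 | 35.766 | ~28
2 | 22.655 | ~44
4 | 12.668 | ~79
8 | 9.055 | ~110
10 | 9.934 | ~101
12 | 11.015 | ~91
24 | 14.353 | ~70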
The question is: why does it not scale even to 10 threads, when simpler queries scale to 24 and even 48 threads on this same box with these same tables?
See the upstream bug (where the reported wait information is different): http://
tags: added: upstream
tags: added: innodb
Perhaps some PMP (poor man's profiler) stack traces would be useful to see here.
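For reference, the classic poor man's profiler one-liner (from poormansprofiler.org; the pidof lookup assumes a single local mysqld):
gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof mysqld) | \
awk '
  BEGIN { s = "" }
  /^Thread/ { if (s != "") print s; s = "" }      # new thread: emit the previous stack
  /^#/     { s = (s == "" ? $4 : s "," $4) }      # append the frame function name
  END      { if (s != "") print s }
' | sort | uniq -c | sort -rn | head               # most frequent stacks first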