XtraDB does not scale well on a 12-core system for a specific all-in-memory SELECT

Bug #1099526 reported by Valerii Kravchuk on 2013-01-14
Affects                                                         Status        Importance  Assigned to
MySQL Server                                                    Unknown       Unknown
Percona Server (moved to https://jira.percona.com/projects/PS)  Fix Released  Medium      Unassigned
  5.1                                                           Won't Fix     Medium      Unassigned
  5.5                                                           Won't Fix     Medium      Unassigned
  5.6                                                           Fix Released  Medium      Unassigned

Bug Description

For a query joining two tables on a long character primary key, we see that on 12 cores it does not scale even to 10 threads. Tested with the recent 5.5.28-rel29.3:

CREATE TABLE `incident` (
`sys_id` char(32) NOT NULL DEFAULT '',
`category` varchar(40) DEFAULT NULL,
PRIMARY KEY (`sys_id`),
KEY `incident_category` (`category`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into incident(sys_id) values (rand()*1000000);
insert into incident(sys_id) select rand()*1000000 from incident; -- 13 times

CREATE TABLE `task` (
`sys_id` char(32) NOT NULL DEFAULT '',
`u_root_cause` char(32) DEFAULT NULL,
`u_business_impact_description` mediumtext,
`u_business_impact_category` mediumtext,
PRIMARY KEY (`sys_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into task(sys_id) values(rand()*1000000);
insert into task(sys_id) select rand()*1000000 from task; -- 13 times

insert into task(sys_id) select sys_id from incident limit 100;

update incident set category=rand()*100000;

show table status like 'task'\G
show table status like 'incident'\G
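For reference, the self-doubling inserts above grow each table geometrically; a quick back-of-the-envelope check of the expected row counts (assuming the fractional random sys_id values do not collide on the primary key):

```python
# Expected row counts from the setup above, assuming no primary-key
# collisions among the random sys_id values (rand()*1000000 produces
# fractional values, so duplicates are unlikely at this scale).
def doubled_rows(initial_rows, doublings):
    # Each "insert into t ... select ... from t" doubles the table.
    return initial_rows * 2 ** doublings

incident_rows = doubled_rows(1, 13)    # 1 seed row, doubled 13 times
task_rows = doubled_rows(1, 13) + 100  # plus 100 sys_ids copied from incident

print(incident_rows, task_rows)  # 8192 8292
```

So both tables hold roughly 8K rows, and only 100 sys_id values overlap between them, which is why the join result is small even though the scan is not.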

The problematic query is:

select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category;

Here is what we get from mysqlslap (the server was started with --no-defaults in this case, but all data fits comfortably in memory, and further tuning does NOT seem to help it scale any better):

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=1 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 35.766 seconds
        Minimum number of seconds to run all queries: 35.488 seconds
        Maximum number of seconds to run all queries: 35.828 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1000

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=2 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 22.655 seconds
        Minimum number of seconds to run all queries: 20.245 seconds
        Maximum number of seconds to run all queries: 23.943 seconds
        Number of clients running queries: 2
        Average number of queries per client: 500

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 12.668 seconds
        Minimum number of seconds to run all queries: 10.320 seconds
        Maximum number of seconds to run all queries: 14.053 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 9.055 seconds
        Minimum number of seconds to run all queries: 8.533 seconds
        Maximum number of seconds to run all queries: 9.676 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=10 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 9.934 seconds
        Minimum number of seconds to run all queries: 9.503 seconds
        Maximum number of seconds to run all queries: 10.310 seconds
        Number of clients running queries: 10
        Average number of queries per client: 100

At this point the total time to run 1000 SELECTs has already gone up.

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=12 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 11.015 seconds
        Minimum number of seconds to run all queries: 10.520 seconds
        Maximum number of seconds to run all queries: 11.889 seconds
        Number of clients running queries: 12
        Average number of queries per client: 83

At this point we see the following waits:

mysql> show engine innodb mutex;
+--------+---------------------------+-----------------+
| Type   | Name                      | Status          |
+--------+---------------------------+-----------------+
| InnoDB | &log_sys->mutex           | os_waits=1      |
| InnoDB | &kernel_mutex             | os_waits=1      |
| InnoDB | combined &block->mutex    | os_waits=118933 |
| InnoDB | &dict_operation_lock      | os_waits=2      |
| InnoDB | &log_sys->checkpoint_lock | os_waits=113    |
+--------+---------------------------+-----------------+
5 rows in set (0.00 sec)

So, in the case of Percona Server/XtraDB, it seems to be all about the combined &block->mutex.

Unsurprisingly, with more threads throughput degrades further:

[valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=24 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count(*), category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
Benchmark
        Average number of seconds to run all queries: 14.353 seconds
        Minimum number of seconds to run all queries: 13.935 seconds
        Maximum number of seconds to run all queries: 14.467 seconds
        Number of clients running queries: 24
        Average number of queries per client: 41
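Converting the averages above into aggregate throughput makes the scaling plateau explicit; a quick sketch using the numbers reported by mysqlslap in the runs above:

```python
# Aggregate throughput (queries/second) for the 1000-query workload at
# each concurrency level, using the average times mysqlslap reported.
avg_seconds = {1: 35.766, 2: 22.655, 4: 12.668, 8: 9.055,
               10: 9.934, 12: 11.015, 24: 14.353}

for clients, secs in avg_seconds.items():
    print(f"{clients:2d} clients: {1000 / secs:6.1f} queries/s")
```

Throughput peaks at 8 clients (about 110 queries/s, roughly a 4x speedup over 1 client on 12 cores) and declines from there.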

The question is: why does it not scale even to 10 threads, when simpler queries scale to 24 and even 48 threads on this same box with these same tables?

See the upstream bug, http://bugs.mysql.com/bug.php?id=68079, for more details (the waits info there is different, and it includes a dump to load for a proper comparison).

Perhaps some PMP stack traces would be useful here.


While running with 10 threads and --no-defaults --innodb_sync_spin_loops=200:

[valerii.kravchuk@cisco1 ~]$ ./pt-pmp -i 10 | more
Fri Jan 25 03:04:24 PST 2013
    100 libaio::??,os_aio_linux_collect,os_aio_linux_handle,fil_aio_wait,io_handler_thread,start_thread,clone
     10 sigwait,signal_hand,start_thread,clone
     10 select,os_thread_sleep,srv_master_thread,start_thread,clone
     10 select,os_thread_sleep,srv_LRU_dump_restore_thread,start_thread,clone
     10 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,mutex_spin_wait,pfs_mutex_enter_func,buf_page_release,mtr_memo_slot_release,mtr_memo_pop_all,mtr_commit,row_search_for_mysql,ha_innobase::index_read,join_read_key,sub_select,evaluate_join_record,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
     10 pthread_cond_wait,os_cond_wait,os_event_wait_low,srv_purge_thread,start_thread,clone
     10 pthread_cond_timedwait,os_cond_wait_timed,os_event_wait_time_low,srv_monitor_thread,start_thread,clone
     10 pthread_cond_timedwait,os_cond_wait_timed,os_event_wait_time_low,srv_lock_timeout_thread,start_thread,clone
     10 pthread_cond_timedwait,os_cond_wait_timed,os_event_wait_time_low,srv_error_monitor_thread,start_thread,clone
     10 poll,handle_connections_sockets,mysqld_main,__libc_start_main,_start
      8 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,mutex_spin_wait,mutex_enter_func,pfs_mutex_enter_func,buf_page_get_mutex_enter,buf_page_get_gen,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_try_search_shortcut_for_mysql,row_search_for_mysql,ha_innobase::index_read,join_read_key,sub_select,evaluate_join_record,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      5 rec_init_offsets,rec_get_offsets_func,page_cur_search_with_match,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_try_search_shortcut_for_mysql,row_search_for_mysql,ha_innobase::index_read,join_read_key,sub_select,evaluate_join_record,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      4 ut_delay,mutex_spin_wait,mutex_enter_func,pfs_mutex_enter_func,buf_page_get_mutex_enter,buf_page_get_gen,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_try_search_shortcut_for_mysql,row_search_for_mysql,ha_innobase::index_read,join_read_key,sub_select,evaluate_join_record,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      4 my_utf8_uni,my_strnncollsp_utf8,innobase_mysql_cmp,cmp_whole_field,cmp_dtuple_rec_with_match,page_cur_search_with_match,btr_cur_search_to_nth_level,...
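For readers unfamiliar with the output format: pt-pmp collapses identical gdb stack traces across all sampled threads and prefixes each unique trace with its occurrence count. A minimal sketch of that aggregation step (the sample frames below are made up, not taken from this run):

```python
from collections import Counter

# Each stack is a comma-joined chain of frames, innermost first,
# just as in the pt-pmp output above; these samples are invented.
stacks = [
    "pthread_cond_wait,os_cond_wait,start_thread,clone",
    "pthread_cond_wait,os_cond_wait,start_thread,clone",
    "poll,handle_connections_sockets,mysqld_main,_start",
]

# Count identical traces and print the most frequent first.
for stack, count in Counter(stacks).most_common():
    print(f"{count:7d} {stack}")
```

So in the dump above, "10 pthread_cond_wait,...,mtr_commit,..." means all 10 client threads were repeatedly sampled waiting on the same block mutex path.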


tags: added: upstream
tags: added: innodb

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1297
