Upstream fix for http://bugs.mysql.com/bug.php?id=70768 causes RW lock creations and destructions at a high rate during ibuf merge

Bug #1279671 reported by Laurynas Biveinis on 2014-02-13
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server
Status tracked in 5.6
5.1
Undecided
Unassigned
5.5
Undecided
Unassigned
5.6
Medium
Laurynas Biveinis

Bug Description

MySQL 5.6.16 fixed http://bugs.mysql.com/bug.php?id=70768 by changing dict_table_t stat-related member locking. Instead of a static array of 64 shared latches, to eliminate contention on these, each dict_table_t object got its own dynamically created and destroyed latch.

The short-lived private dict_table_t objects such as those used in ibuf merge got these dynamic rwlatches too. Each (non-priority) rwlatch creation performs heap memory allocations, acquires rw_lock_list_mutex, creates two OS events, which in turn acquire os_sync_mutex.

Parts of a Perf report on the preliminary PS-5.6.16 branch.

- 14,88% mysqld mysqld [.] ut_delay(unsigned long)
   - ut_delay(unsigned long)
      - 90,02% mutex_spin_wait(void*, bool, char const*, unsigned long)
         - 47,73% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
         - 46,52% rw_lock_free_func(rw_lock_t*) - dict_mem_table_free(dict_table_t*)
               + 92,34% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,54% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,11% ibuf_rec_get_volume_func(unsigned char const*)
...
- 6,35% mysqld libpthread-2.12.so [.] pthread_mutex_lock
   - pthread_mutex_lock
      - 35,31% os_event_create()
         - 60,64% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
         - 39,36% mutex_create_func(ib_mutex_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 19,28% sync_array_reserve_cell(sync_array_t*, void*, unsigned long, char const*, unsigned long, unsigned long*)
         - 99,16% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,51% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 49,26% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 92,81% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,48% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,71% ibuf_rec_get_volume_func(unsigned char const*)
         + 0,82% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
      - 18,17% os_fast_mutex_free_func(pthread_mutex_t*)
         - 67,53% os_event_free(os_event*)
            - 59,21% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 91,27% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,96% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 1,77% ibuf_rec_get_volume_func(unsigned char const*)
            - 40,79% mutex_free_func(ib_mutex_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 91,62% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,85% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 1,53% ibuf_rec_get_volume_func(unsigned char const*)
         - 32,47% dict_mem_index_free(dict_index_t*)
            + 89,70% ibuf_dummy_index_free(dict_index_t*)
            + 8,57% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
            + 1,73% ibuf_rec_get_volume_func(unsigned char const*)
      - 7,98% os_event_free(os_event*)
         - 53,98% rw_lock_free_func(rw_lock_t*)
            - dict_mem_table_free(dict_table_t*)
               + 91,23% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 7,07% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,69% ibuf_rec_get_volume_func(unsigned char const*)
         - 46,02% mutex_free_func(ib_mutex_t*)
            - dict_mem_table_free(dict_table_t*)
               + 91,32% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,99% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,69% ibuf_rec_get_volume_func(unsigned char const*)
      - 7,17% sync_array_wait_event(sync_array_t*, unsigned long)
         - 98,83% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,14% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 49,35% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 92,97% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,19% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,84% ibuf_rec_get_volume_func(unsigned char const*)
         + 1,14% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
      - 6,76% os_fast_mutex_init_func(pthread_mutex_t*)
           dict_mem_index_create(char const*, char const*, unsigned long, unsigned long, unsigned long)
         + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 3,89% mutex_signal_object(ib_mutex_t*)
         - 81,44% rw_lock_free_func(rw_lock_t*)
            - dict_mem_table_free(dict_table_t*)
               + 92,90% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,37% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 0,73% ibuf_rec_get_volume_func(unsigned char const*)
         - 18,19% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 0,76% sync_array_free_cell(sync_array_t*, unsigned long)
         - 87,03% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,19% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 46,32% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 93,59% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 5,77% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,64% ibuf_rec_get_volume_func(unsigned char const*)
            + 1,88% pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long)
            + 0,87% mtr_commit(mtr_t*)
            + 0,53% _ZL20pfs_mutex_enter_funcP10ib_mutex_tPKcm.clone.2
         + 12,46% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
- 3,68% mysqld mysqld [.] mutex_spin_wait(void*, bool, char const*, unsigned long)
   - mutex_spin_wait(void*, bool, char const*, unsigned long)
      - 48,40% rw_lock_create_func(rw_lock_t*, char const*)
           dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
         + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 46,88% rw_lock_free_func(rw_lock_t*)
         - dict_mem_table_free(dict_table_t*)
            + 92,22% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
            + 6,71% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
            + 1,07% ibuf_rec_get_volume_func(unsigned char const*)

The above causes a performance regression on MySQL 5.6.16 merge in TPS of RW Sysbench (I/O bound (!)): about 7% drop at 512 threads, 20% at 1024 threads.

The same reasoning as for dict_table_t::stats_latch probably applies to dict_table_t::autoinc_mutex too, which is not a regression but a pre-existing inefficiency. Considering fixing at the same time.

description: updated
tags: added: innodb performance upstream
description: updated
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.