replication delay too much on tokudb partition table

Bug #1740459 reported by HolmerD
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
New
Undecided
Unassigned

Bug Description

Hi:
my slave is delay too much, the TPS and QPS very low, memory, io(fusion-io), cpu everything is ok.
analye the latest six binlog files, found 99% binlog event is abount table `abtest_monitor_fac`(partition table), event type is: Write_rows, Update_rows.

table message:
*************************** 1. row ***************************
       Table: abtest_monitor_fac
Create Table: CREATE TABLE `abtest_monitor_fac` (
  `day` int(11) NOT NULL COMMENT '20161022',
  `time` bigint(11) NOT NULL COMMENT '201610221100',
  `hash_id` bigint(20) unsigned NOT NULL COMMENT 'logic id',
  `pv` bigint(20) NOT NULL COMMENT 'pv`',
  `used_time` bigint(11) NOT NULL COMMENT '耗时 微秒',
  `uv` bigint(20) NOT NULL COMMENT 'uv`',
  `last_update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '最后修改时间',
  PRIMARY KEY (`time`,`hash_id`),
  KEY `idx_day` (`day`),
  KEY `idx_last_update_time` (`last_update_time`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8
/*!50500 PARTITION BY RANGE COLUMNS(`time`)
(PARTITION p20161028 VALUES LESS THAN (20161029000000) ENGINE = TokuDB,
........
PARTITION p20171230 VALUES LESS THAN (20171231000000) ENGINE = TokuDB) */

+--------------------+----------------------+----------+
| table_name | PARTITION_EXPRESSION | p_count |
+--------------------+----------------------+----------+
| abtest_monitor_fac | `time` | 429 |
+--------------------+----------------------+----------+

+--------------------+--------+-------------+------------------+------------------+
| table_name | engine | table_rows | data_gb | index_gb |
+--------------------+--------+-------------+------------------+------------------+
| abtest_monitor_fac | TokuDB | 16723416943 | 763.579869242385 | 673.313777318224 |
+--------------------+--------+-------------+------------------+------------------+

production environment: single master slave
version: Percona-Server-5.6.33

lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Stepping: 4
CPU MHz: 3000.195
BogoMIPS: 5999.11
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-39

cpu used:
12:36:16 PM PID %usr %system %guest %CPU CPU Command
12:36:17 PM 5220 100.00 3.00 0.00 100.00 10 mysqld
12:36:18 PM 5220 100.00 3.00 0.00 100.00 0 mysqld
12:36:19 PM 5220 100.00 3.00 0.00 100.00 2 mysqld
12:36:20 PM 5220 100.00 2.00 0.00 100.00 1 mysqld
12:36:21 PM 5220 100.00 2.00 0.00 100.00 11 mysqld
12:36:22 PM 5220 100.00 3.00 0.00 100.00 10 mysqld
12:36:23 PM 5220 100.00 4.00 0.00 100.00 1 mysqld
12:36:24 PM 5220 100.00 4.00 0.00 100.00 11 mysqld
12:36:25 PM 5220 100.00 2.00 0.00 100.00 13 mysqld
12:36:26 PM 5220 100.00 5.00 0.00 100.00 10 mysqld
12:36:27 PM 5220 100.00 2.00 0.00 100.00 0 mysqld
12:36:28 PM 5220 100.00 3.00 0.00 100.00 10 mysqld
12:36:29 PM 5220 100.00 11.00 0.00 100.00 12 mysqld
12:36:30 PM 5220 100.00 5.00 0.00 100.00 0 mysqld

memory used:
             total used free shared buffers cached
Mem: 126 123 2 0 0 48
-/+ buffers/cache: 74 51
Swap: 31 0 31

perf record message:
# Overhead sys usr Samples Command Shared Object
# ........ ........ ........ ............ ....... ......................................................................................... ..................
#
    15.89% 0.00% 15.89% 40054 mysqld /apps/svr/Percona-Server-5.6.33-rel79.0-Linux.x86_64.ssl101/bin/mysqld 0x4f5245
             |
             --- bitmap_get_next_set
                |
                |--50.36%-- ha_partition::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                | Rows_log_event::do_apply_event(Relay_log_info const*)
                | slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                | handle_slave_worker
                | start_thread
                |
                |--48.26%-- ha_partition::rnd_end()
                | Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                | Rows_log_event::do_apply_event(Relay_log_info const*)
                | slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                | handle_slave_worker
                | start_thread
                 --1.38%-- [...]

     7.30% 0.00% 7.30% 18401 mysqld /apps/svr/Percona-Server-5.6.33-rel79.0-Linux.x86_64.ssl101/lib/mysql/plugin/ha_tokudb.so 0x5a48e
             |
             --- ha_tokudb::index_init(unsigned int, bool)
                |
                |--98.76%-- ha_tokudb::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | ha_partition::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                | Rows_log_event::do_apply_event(Relay_log_info const*)
                | slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                | handle_slave_worker
                | start_thread
                |
                 --1.24%-- handler::ha_rnd_init(bool)
                           ha_partition::rnd_init(bool)
                           handler::ha_rnd_init(bool)
                           Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                           Rows_log_event::do_apply_event(Relay_log_info const*)
                           slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                           handle_slave_worker
                           start_thread

     6.21% 0.00% 6.21% 15657 mysqld /apps/svr/Percona-Server-5.6.33-rel79.0-Linux.x86_64.ssl101/lib/mysql/plugin/ha_tokudb.so 0xaf438
             |
             --- toku_db_cursor_internal(__toku_db*, __toku_db_txn*, __toku_dbc*, unsigned int, int)
                |
                |--98.99%-- toku_db_cursor(__toku_db*, __toku_db_txn*, __toku_dbc**, unsigned int)
                | ha_tokudb::index_init(unsigned int, bool)
                | ha_tokudb::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | ha_partition::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                | Rows_log_event::do_apply_event(Relay_log_info const*)
                | slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                | handle_slave_worker
                | start_thread
                |
                |--0.84%-- ha_tokudb::index_init(unsigned int, bool)
                | ha_tokudb::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | ha_partition::rnd_init(bool)
                | handler::ha_rnd_init(bool)
                | Rows_log_event::do_index_scan_and_update(Relay_log_info const*)
                | Rows_log_event::do_apply_event(Relay_log_info const*)
                | slave_worker_exec_job(Slave_worker*, Relay_log_info*)
                | handle_slave_worker
                | start_thread
                 --0.17%-- [...]

Is this a bug?
And how to fix it?

HolmerD (dong.dg)
information type: Private Security → Public
HolmerD (dong.dg)
information type: Public → Public Security
information type: Public Security → Public
tags: added: performance tokudb
Revision history for this message
HolmerD (dong.dg) wrote :

This bug fixed in 5.6.38, 5.7.20.
https://github.com/mysql/mysql-server/commit/9414c7035020a1915db56101aa7929d8b4069d83#diff-8c78bfc758be72b7df2306f1d9984c47

If slave_rows_search_algorithms is set to TABLE_SCAN,HASH_SCAN there is a performance improvement.

set global slave_rows_search_algorithms='TABLE_SCAN,HASH_SCAN';

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3770

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.