TokuDB engine enabled causes extra disk writes and general slowdown for the replication thread

Bug #1550352 reported by Przemek on 2016-02-26
This bug affects 3 people
Affects: MySQL Server (Status: Unknown, Importance: Unknown)
Affects: Percona Server (status tracked in 5.7)
  5.6: Importance Medium, Unassigned
  5.7: Importance Medium, Unassigned

Bug Description

There is a pretty weird problem: when the TokuDB engine is enabled, replication is much slower even if no table actually uses it.
I've noted over two times slower binlog apply speed for changes against a generic sysbench table using the InnoDB engine, but a customer reported around a four times difference in another environment.
What is odd is that the problem is only reproducible with log_bin=OFF. With log_bin=ON (log_slave_updates=OFF), the apply speed is the same as on an instance without the TokuDB engine enabled.

I can also observe around two times higher disk activity during the test with the TokuDB engine enabled vs disabled (or enabled but with log_bin=OFF vs ON).

Below are some test examples. The disk device used was not in use by any other process at the time. Testing against TokuDB tables gives similar results: applying is much slower with log_bin=OFF and produces more disk writes.

1 - InnoDB tables, TokuDB engine enabled, log_bin=OFF

slave1 [localhost] {msandbox} ((none)) > select @@version,@@version_comment;
+-------------+------------------------------------------------------+
| @@version | @@version_comment |
+-------------+------------------------------------------------------+
| 5.6.28-76.1 | Percona Server (GPL), Release 76.1, Revision 5759e76 |
+-------------+------------------------------------------------------+
1 row in set (0.00 sec)

slave1 [localhost] {msandbox} ((none)) > show variables like '%engin%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| default_storage_engine | InnoDB |
| default_tmp_storage_engine | InnoDB |
| enforce_storage_engine | |
| storage_engine | InnoDB |
+----------------------------+--------+
4 rows in set (0.00 sec)

slave1 [localhost] {msandbox} (test) > show engines;
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
...
| TokuDB | DEFAULT | Percona TokuDB Storage Engine with Fractal Tree(tm) Technology | YES | YES | YES |
...

slave1 [localhost] {msandbox} ((none)) > select TABLE_SCHEMA,TABLE_NAME, ENGINE from information_schema.tables where ENGINE="TokuDB";
Empty set (0.02 sec)

-- test
2016-02-26 06:20:46 47483 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000009' at position 674081141, relay log './mysql-relay.000029' position: 16274542
2016-02-26 06:31:36 47483 [Note] Slave SQL thread stopped because it reached its UNTIL position 4
+--------------------------------+
| SUBTIME('06:31:36','06:20:46') |
+--------------------------------+
| 00:10:50 |
+--------------------------------+
$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 06:20:44 EST 2016
18347313 880475096

$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 06:33:01 EST 2016
19672748 899370584

19672748-18347313=1325435 (wIOPS)
899370584-880475096=18895488 (sectors written)
9226.3 MB written
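The arithmetic above can be sketched as a small script (an assumption about the layout: on Linux, /proc/diskstats field 8 is writes completed and field 10 is sectors written, with one sector = 512 bytes):

```shell
# Delta calculation for test 1, using the two samples shown above.
before_writes=18347313; before_sectors=880475096   # sample before the run
after_writes=19672748;  after_sectors=899370584    # sample after the run

echo "$(( after_writes - before_writes )) write ops"
awk -v s=$(( after_sectors - before_sectors )) \
    'BEGIN { printf "%.1f MB written\n", s * 512 / 1048576 }'
# -> 1325435 write ops
# -> 9226.3 MB written
```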

2 - InnoDB tables, TokuDB enabled, log_bin=ON

-- test
2016-02-26 06:36:26 396 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000009' at position 674081141, relay log './mysql-relay.000029' position: 16274542
2016-02-26 06:40:29 396 [Note] Slave SQL thread stopped because it reached its UNTIL position 4

+--------------------------------+
| SUBTIME('06:40:29','06:36:26') |
+--------------------------------+
| 00:04:03 |
+--------------------------------+

$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 06:36:24 EST 2016
19693970 902132188

$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 06:41:43 EST 2016
20216815 911836779
20216815-19693970=522845 (wIOPS)
911836779-902132188=9704591 (sectors written)
4738.57 MB written

3 - InnoDB tables, TokuDB disabled, log_bin=OFF

2016-02-26 07:08:58 4915 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000009' at position 674081141, relay log './mysql-relay.000029' position: 16274542
2016-02-26 07:12:58 4915 [Note] Slave SQL thread stopped because it reached its UNTIL position 4
+--------------------------------+
| SUBTIME('07:12:58','07:08:58') |
+--------------------------------+
| 00:04:00 |
+--------------------------------+

$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 07:08:56 EST 2016
20784700 927085891

$ date; grep nvme0n1 /proc/diskstats |awk '{print $8,$10}'
Fri Feb 26 07:21:13 EST 2016
21306069 936745984
21306069-20784700=521369 (wIOPS)
936745984-927085891=9660093 (sectors written)
4716.84 MB written

Jervin R (revin) wrote :

@przemek Can you measure I/O activity on tc.log when log_bin=OFF?

George Ormond Lorch III (gl-az) wrote :

This is likely because MySQL uses the TC_LOG_MMAP transaction coordinator instead of the MYSQL_BIN_LOG transaction coordinator when TokuDB is enabled (there are now two transactional engines in the server), and TC_LOG_MMAP does more fsyncs.

Vadim already observed this phenomenon in his past testing, and it is sort of a known issue. It seems the MMAP coordinator is not very well optimized.
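A hedged workaround sketch following from test 2 above (the file path and exact option values here are illustrative, not taken from this report): keeping the binary log enabled makes the server pick the binlog-based transaction coordinator instead of TC_LOG_MMAP.

```shell
# Illustrative my.cnf fragment, written to a local example file (not /etc):
cnf=./my.cnf.example
cat > "$cnf" <<'EOF'
[mysqld]
log_bin           = mysql-bin   # binlog on -> MYSQL_BIN_LOG coordinator
log_slave_updates = OFF         # as in test 2; slave does not re-log events
EOF
grep -c '^log' "$cnf"   # sanity check: two log_* settings written
```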

Jervin R (revin) wrote :

@George, indeed, that is what I suspect as well based on the tests.

tags: added: upstream