How about compressing binlog?

Bug #1510426 reported by choury
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Percona Server (moved to https://jira.percona.com/projects/PS) | Triaged | Wishlist | Unassigned |
5.5 | Triaged | Wishlist | Unassigned |
5.6 | Triaged | Wishlist | Unassigned |

Bug Description

Hi All,

I'm a DBA at Tencent Inc. We use MySQL extensively as the database for our business.

In our business (mainly games), a lot of binlog is often generated in a short time. For example, a game called "QQ Dancer" has one million players online at the same time. We use more than 200 machines as its database servers, and together they generate about 1.6 TB of binlog per hour. This is with MIXED mode; it would be even larger in ROW mode, and we have more than 10 games like this one. Such a large amount of binlog not only takes up disk space, it also consumes a lot of network bandwidth and makes long-distance backup very difficult. We searched for a solution and only found these:
https://bugs.mysql.com/bug.php?id=48396
https://bugs.mysql.com/bug.php?id=46435
I don't know why this has gone unimplemented for so long. Given this, we came up with the idea of compressing binlog events as they are generated, and we have already implemented it.

The solution is as follows:
We added new event types for the compressed versions of existing events:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from the corresponding uncompressed events. One of their constructors and their write function are overridden to uncompress and compress, respectively. Everything else is exactly the same. The format of these events is described by this picture:
http://i.imgur.com/4Kf80Tr.png
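
The picture is not reproduced here, so the following Python sketch only illustrates the general idea of compressing on write and uncompressing on read. The 4-byte length prefix, the function names, and the use of zlib are assumptions made for illustration; they are not taken from the patch's actual event format.

```python
import struct
import zlib

# Hypothetical layout (NOT the patch's real on-disk format):
#   4 bytes, little-endian: length of the uncompressed payload
#   N bytes: zlib-compressed payload


def compress_payload(payload: bytes) -> bytes:
    """Roughly what an overridden write function would do."""
    return struct.pack("<I", len(payload)) + zlib.compress(payload)


def uncompress_payload(blob: bytes) -> bytes:
    """Roughly what an overridden constructor would do when reading."""
    (orig_len,) = struct.unpack_from("<I", blob, 0)
    payload = zlib.decompress(blob[4:])
    if len(payload) != orig_len:
        raise ValueError("length mismatch after decompression")
    return payload


# A repetitive statement, chosen only because it compresses well.
query = b"UPDATE player SET score = score + 1 WHERE id = 42; " * 20
blob = compress_payload(query)
restored = uncompress_payload(blob)
```

Storing the uncompressed length up front lets the reader validate (or preallocate for) the restored payload, which is one common design choice for this kind of framing.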

On the slave, the IO thread uncompresses these events and converts them to their uncompressed counterparts when it receives them from the master.
So the SQL and worker threads can stay unchanged.
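
A minimal sketch of that conversion step, in Python for brevity. The `Event` class, the `convert_on_receive` function, and the use of zlib are hypothetical; the uncompressed type names are the standard MySQL ones, and the compressed names follow the list above (with the ROWS spelling for the delete event).

```python
import zlib
from dataclasses import dataclass

# Compressed event type -> uncompressed counterpart.
UNCOMPRESSED_TYPE = {
    "QUERY_COMPRESSED_EVENT": "QUERY_EVENT",
    "WRITE_ROWS_COMPRESSED_EVENT": "WRITE_ROWS_EVENT",
    "UPDATE_ROWS_COMPRESSED_EVENT": "UPDATE_ROWS_EVENT",
    "DELETE_ROWS_COMPRESSED_EVENT": "DELETE_ROWS_EVENT",
}


@dataclass
class Event:
    type_name: str
    body: bytes


def convert_on_receive(event: Event) -> Event:
    """Hypothetical IO-thread hook: return an uncompressed event.

    Events that are not compressed pass through untouched, so the
    SQL and worker threads only ever see ordinary event types.
    """
    plain_type = UNCOMPRESSED_TYPE.get(event.type_name)
    if plain_type is None:
        return event
    return Event(plain_type, zlib.decompress(event.body))


received = Event("WRITE_ROWS_COMPRESSED_EVENT", zlib.compress(b"row image bytes"))
relayed = convert_on_receive(received)
```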

We also added two options for this feature: "log_bin_compress" and "log_bin_compress_min_len". The former switches binlog compression on or off; the latter is the minimum length of an SQL statement (in statement mode) or a record (in row mode) for it to be compressed. The logic can be described by the following pseudocode:

 if binlog_format == statement {
     if log_bin_compress == true and query_len >= log_bin_compress_min_len
         create a Query_compressed_log_event;
     else
         create a Query_log_event;
 }
 if binlog_format == row {
     if log_bin_compress == true and record_len >= log_bin_compress_min_len
         create a Write_rows_compressed_log_event (for INSERT);
     else
         create a Write_rows_log_event (for INSERT);
 }
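
The same decision can be written as a small runnable Python function. This is only a sketch: the function name is hypothetical, the UPDATE and DELETE class names are extrapolated from the event types listed earlier, and the threshold of 256 in the example is an arbitrary value, not a default from the patch.

```python
# Row-event class name prefixes per statement kind. Only the INSERT case
# appears in the pseudocode; UPDATE and DELETE are assumed by analogy.
ROW_EVENT_PREFIX = {
    "INSERT": "Write_rows",
    "UPDATE": "Update_rows",
    "DELETE": "Delete_rows",
}


def choose_event_class(
    binlog_format: str,
    length: int,
    *,
    log_bin_compress: bool,
    log_bin_compress_min_len: int,
    statement: str = "INSERT",
) -> str:
    """Return the name of the log event class that would be created."""
    compress = log_bin_compress and length >= log_bin_compress_min_len
    if binlog_format == "statement":
        return "Query_compressed_log_event" if compress else "Query_log_event"
    if binlog_format == "row":
        prefix = ROW_EVENT_PREFIX[statement]
        suffix = "_compressed_log_event" if compress else "_log_event"
        return prefix + suffix
    raise ValueError(f"unsupported binlog_format: {binlog_format!r}")


# Example: a 1000-byte query with compression enabled and a 256-byte threshold.
chosen = choose_event_class(
    "statement", 1000, log_bin_compress=True, log_bin_compress_min_len=256
)
```

Short statements and records stay uncompressed because the compression overhead can outweigh the savings on small payloads.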

The complete change for Percona Server 5.6 can be found at:
https://github.com/choury/percona-server/commit/bdf5a83164ff19a5017cde6507427c0b5bc70645
We have tested it on some of our games for months, and the result is clear: the amount of binlog is reduced by 42% to 70%. We would be very glad if you could accept our patch.
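
For a rough sense of scale, the short Python calculation below applies that 42% to 70% range to the 1.6 TB per hour figure quoted earlier. Whether the range applies to that particular game is an assumption; the report does not say which games were measured.

```python
HOURLY_BINLOG_TB = 1.6      # figure quoted earlier for one game
REDUCTIONS = (0.42, 0.70)   # reported range of reduction


def remaining_tb(total_tb: float, reduction: float) -> float:
    """Binlog volume left after compression, in TB per hour."""
    return total_tb * (1.0 - reduction)


# Rounded to three decimals to hide floating-point noise.
remaining = [round(remaining_tb(HOURLY_BINLOG_TB, r), 3) for r in REDUCTIONS]
```

Under that assumption, roughly 0.93 TB per hour would remain at the low end of the range and roughly 0.48 TB per hour at the high end.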

If you have any other questions, please don't hesitate to reply to me!

Tags: contribution
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-2473
