Memory leak in multi-source replication when binlog_rows_query_log_events is enabled

Bug #1667307 reported by Przemek on 2017-02-23
Affects          Status        Importance  Assigned to
MySQL Server     Unknown       Unknown
Percona Server   (moved to https://jira.percona.com/projects/PS; status tracked in 5.7)
  5.5            Invalid       Undecided   Unassigned
  5.6            Invalid       Undecided   Unassigned
  5.7            Fix Released  High        Vlad Lesin

Bug Description

This is reproducible on both MySQL Community and Percona Server 5.7.17.
The slave SQL thread leaks memory when replicating over two channels from a master-master pair. Disabling binlog_rows_query_log_events appears to avoid the problem.

Restarting the SQL thread does not release the memory.

Steps to reproduce with MySQL Sandbox are below. The more sysbench requests are run, the larger the memory usage, which never seems to stop growing.

$ ./multi_source.sh percona5.7.17 mysql ALL-MASTERS
installing node 1
installing node 2
installing node 3
installing node 4
group directory installed in $HOME/sandboxes/multi_msb_percona5_7_17
...
# Setting topology ALL-MASTERS
# node node1
...

cd multi_msb_percona5_7_17

node1 [localhost] {msandbox} ((none)) > stop slave for channel 'node3'; reset slave all for channel 'node3';
node1 [localhost] {msandbox} ((none)) > stop slave for channel 'node4'; reset slave all for channel 'node4';

node2 [localhost] {msandbox} ((none)) > stop slave for channel 'node3'; reset slave all for channel 'node3';
node2 [localhost] {msandbox} ((none)) > stop slave for channel 'node4'; reset slave all for channel 'node4';

node3 [localhost] {msandbox} ((none)) > stop slave for channel 'node4'; reset slave all for channel 'node4';

node4 [localhost] {msandbox} ((none)) > stop slave for channel 'node3'; reset slave all for channel 'node3';
node4 [localhost] {msandbox} ((none)) > stop slave for channel 'node2'; reset slave all for channel 'node2';
node4 [localhost] {msandbox} ((none)) > stop slave for channel 'node1'; reset slave all for channel 'node1';

$ for i in {1..3}; do echo "binlog_rows_query_log_events=1" >> node$i/my.sandbox.cnf ; done
$ for i in {1..3}; do echo "performance-schema-instrument='memory%=ON' " >> node$i/my.sandbox.cnf ; done
$ for i in {1..3}; do echo "log_slave_updates=1" >> node$i/my.sandbox.cnf ; done
$ ./restart_all

$ ./use_all "SELECT * FROM performance_schema.replication_connection_status\G";
# server: 1:
*************************** 1. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME:
              SOURCE_UUID: 00014419-2222-2222-2222-222222222222
                THREAD_ID: 26
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 1
 LAST_HEARTBEAT_TIMESTAMP: 2017-02-23 08:50:23
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
# server: 2:
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME:
              SOURCE_UUID: 00014418-1111-1111-1111-111111111111
                THREAD_ID: 26
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 2
 LAST_HEARTBEAT_TIMESTAMP: 2017-02-23 08:50:15
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
# server: 3:
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME:
              SOURCE_UUID: 00014418-1111-1111-1111-111111111111
                THREAD_ID: 26
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 1
 LAST_HEARTBEAT_TIMESTAMP: 2017-02-23 08:50:07
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME:
              SOURCE_UUID: 00014419-2222-2222-2222-222222222222
                THREAD_ID: 29
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 1
 LAST_HEARTBEAT_TIMESTAMP: 2017-02-23 08:50:07
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00

$ ./use_all "select * from sys.memory_global_total";
# server: 1:
total_allocated
332.57 MiB
# server: 2:
total_allocated
332.56 MiB
# server: 3:
total_allocated
332.64 MiB
# server: 4:
total_allocated
332.18 MiB

$ ps aux|grep multi_msb_percona5_7_17|grep .sock|awk '{print "vsz: "$5/1024" MB","rss: "$6/1024" MB",$19}'
vsz: 712.836 MB rss: 245.57 MB --port=14418
vsz: 712.836 MB rss: 230.875 MB --port=14419
vsz: 712.836 MB rss: 237.129 MB --port=14420

node1 [localhost] {msandbox} ((none)) > create database sbtest1;
Query OK, 1 row affected (0.00 sec)

$ sysbench --num-threads=16 --max-requests=300000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=sbtest1 --mysql-user=msandbox --mysql-password=msandbox --oltp-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox14418.sock --test=/usr/share/doc/sysbench/tests/db/oltp.lua prepare
sysbench 0.5: multi-threaded system evaluation benchmark

Creating table 'sbtest1'...
Inserting 1000000 records into 'sbtest1'
...

$ ps aux|grep multi_msb_percona5_7_17|grep .sock|awk '{print "vsz: "$5/1024" MB","rss: "$6/1024" MB",$19}'
vsz: 756.836 MB rss: 383.094 MB --port=14418
vsz: 752.836 MB rss: 376.918 MB --port=14419
vsz: 1176.84 MB rss: 754.984 MB --port=14420

$ sysbench --num-threads=16 --max-requests=300000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=sbtest1 --mysql-user=msandbox --mysql-password=msandbox --oltp-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox14418.sock --test=/usr/share/doc/sysbench/tests/db/oltp.lua run
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
...

$ ./use_all "select * from sys.memory_global_total";
# server: 1:
total_allocated
360.05 MiB
# server: 2:
total_allocated
343.64 MiB
# server: 3:
total_allocated
1.35 GiB

$ ps aux|grep multi_msb_percona5_7_17|grep .sock|awk '{print "vsz: "$5/1024" MB","rss: "$6/1024" MB",$19}'
vsz: 852.898 MB rss: 421 MB --port=14418
vsz: 752.836 MB rss: 375.461 MB --port=14419
vsz: 1988.84 MB rss: 1615.76 MB --port=14420

node3 [localhost] {msandbox} ((none)) > select event_name, high_number_of_bytes_used/1024/1024 from performance_schema.memory_summary_global_by_event_name order by high_number_of_bytes_used desc limit 10;
+-----------------------------------------------------------------------------+-------------------------------------+
| event_name | high_number_of_bytes_used/1024/1024 |
+-----------------------------------------------------------------------------+-------------------------------------+
| memory/sql/Log_event | 1041.87703991 |
| memory/innodb/buf_buf_pool | 133.25000000 |
| memory/innodb/log0log | 32.08451843 |
| memory/sql/XID | 19.00152588 |
| memory/performance_schema/events_statements_history_long | 13.65661621 |
| memory/performance_schema/events_statements_history_long.tokens | 9.76562500 |
| memory/performance_schema/events_statements_history_long.sqltext | 9.76562500 |
| memory/performance_schema/events_statements_summary_by_digest.tokens | 9.76562500 |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name | 9.11914063 |
| memory/performance_schema/table_handles | 9.06250000 |
+-----------------------------------------------------------------------------+-------------------------------------+
10 rows in set (0.00 sec)

node3 [localhost] {msandbox} ((none)) > select thread_id tid, user, current_count_used ccu, current_allocated ca, current_avg_alloc caa, current_max_alloc cma, total_allocated from sys.memory_by_thread_by_current_bytes limit 10;
+-----+--------------------------+---------+------------+-----------+------------+-----------------+
| tid | user | ccu | ca | caa | cma | total_allocated |
+-----+--------------------------+---------+------------+-----------+------------+-----------------+
| 28 | sql/slave_sql | 3810486 | 1.01 GiB | 285 bytes | 1.00 GiB | 17.68 GiB |
| 1 | sql/main | 3306 | 192.17 MiB | 59.52 KiB | 133.25 MiB | 227.30 MiB |
| 27 | sql/slave_sql | 392833 | 26.10 MiB | 70 bytes | 17.83 MiB | 27.50 GiB |
| 33 | msandbox@localhost | 83 | 774.75 KiB | 9.33 KiB | 256.00 KiB | 2.81 MiB |
| 26 | sql/slave_io | 542 | 614.54 KiB | 1.13 KiB | 528.01 KiB | 10.71 MiB |
| 29 | sql/slave_io | 533 | 611.96 KiB | 1.15 KiB | 528.01 KiB | 10.71 MiB |
| 25 | innodb/dict_stats_thread | 32 | 1.62 KiB | 52 bytes | 960 bytes | 6.61 MiB |
| 24 | innodb/buf_dump_thread | 2 | 64 bytes | 32 bytes | 64 bytes | 224.79 KiB |
| 30 | sql/signal_handler | 0 | 0 bytes | 0 bytes | 0 bytes | 0 bytes |
| 32 | sql/compress_gtid_table | 0 | 0 bytes | 0 bytes | 0 bytes | 0 bytes |
+-----+--------------------------+---------+------------+-----------+------------+-----------------+
10 rows in set (0.10 sec)

node3 [localhost] {msandbox} ((none)) > stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

node3 [localhost] {msandbox} ((none)) > select * from sys.memory_global_total;
+-----------------+
| total_allocated |
+-----------------+
| 1.35 GiB |
+-----------------+
1 row in set (0.00 sec)

node3 [localhost] {msandbox} ((none)) > select @@innodb_buffer_pool_size/1024/1024;
+-------------------------------------+
| @@innodb_buffer_pool_size/1024/1024 |
+-------------------------------------+
| 128.00000000 |
+-------------------------------------+
1 row in set (0.01 sec)

Przemek (pmalkowski) wrote :

This has already been reported as bug 1594964; marking as duplicate.

Przemek (pmalkowski) wrote :

I think this bug is actually not a duplicate of lp:1594964, although both reference the same upstream report.

The memory leak in this report happens only on a multi-source slave, and the leak shows up in the memory/sql/Log_event area. The bug in lp:1594964, by contrast, is reproducible in a simple master-master scenario, and its leak shows up in the memory/sql/thd::main_mem_root area. The two issues look similar and may even be related in the code, since some of the triggering conditions are the same, but the additional conditions that trigger the leaks differ.

tags: added: upstream
Przemek (pmalkowski) wrote :

More specific corresponding upstream bug report: https://bugs.mysql.com/bug.php?id=85371

Vlad Lesin (vlad-lesin) wrote :

What is happening here?

1) When binlog_rows_query_log_events is enabled, a Rows_query_log_event is created and written to the binary log on the master before the corresponding rows events.

2) The slave reads this event from the relay log and applies it; see Rows_query_log_event::do_apply_event(). When the event is applied, the event itself is stored in Relay_log_info::rows_query_ev:

int Rows_query_log_event::do_apply_event(Relay_log_info const *rli)
{
  DBUG_ENTER("Rows_query_log_event::do_apply_event");
  DBUG_ASSERT(rli->info_thd == thd);
  /* Set query for writing Rows_query log event into binlog later.*/
  thd->set_query(m_rows_query, strlen(m_rows_query));

  DBUG_ASSERT(rli->rows_query_ev == NULL);

  const_cast<Relay_log_info*>(rli)->rows_query_ev= this;
  /* Tell worker not to free the event */
  worker= NULL;
  DBUG_RETURN(0);
}

3) Normally, Relay_log_info::rows_query_ev is freed in Relay_log_info::cleanup_context(), with the following stack trace:

#0 Rows_query_log_event::~Rows_query_log_event (this=0x7fff9401df00, __in_chrg=<optimized out>, __vtt_parm=<optimized out>)
    at ./sql/log_event.h:3795
#1 0x0000000001958d1e in Relay_log_info::cleanup_context (this=0x7fffa40413d0, thd=0x7fff94000950, error=false)
    at ./sql/rpl_rli.cc:1777
#2 0x00000000018d984b in rows_event_stmt_cleanup (rli=0x7fffa40413d0, thd=0x7fff94000950) at ./sql/log_event.cc:11297
#3 0x00000000018d9500 in Rows_log_event::do_apply_event (this=0x7fff94012fc0, rli=0x7fffa40413d0)
    at ./sql/log_event.cc:11175
#4 0x00000000018c00c7 in Log_event::apply_event (this=0x7fff94012fc0, rli=0x7fffa40413d0) at ./sql/log_event.cc:3371
#5 0x0000000001938e0f in apply_event_and_update_pos (ptr_ev=0x7ffff139e8a0, thd=0x7fff94000950, rli=0x7fffa40413d0)
    at ./sql/rpl_slave.cc:4750
#6 0x000000000193a5d7 in exec_relay_log_event (thd=0x7fff94000950, rli=0x7fffa40413d0) at ./sql/rpl_slave.cc:5265
#7 0x00000000019418eb in handle_slave_sql (arg=0x7fffa4139800) at ./sql/rpl_slave.cc:7461
#8 0x0000000001e6a679 in pfs_spawn_thread (arg=0x7fffa413d640) at ./storage/perfschema/pfs.cc:2188
#9 0x00007ffff71616ba in start_thread (arg=0x7ffff139f700) at pthread_create.c:333
#10 0x00007ffff65f682d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

But if we look at Rows_log_event::do_apply_event(), we see that rows_event_stmt_cleanup() is called at the end of the function:

int Rows_log_event::do_apply_event(Relay_log_info const *rli)
{
...
end:
  if (get_flags(STMT_END_F))
  {
    if((error= rows_event_stmt_cleanup(rli, thd)))
    {
      ...
    }
    ...
  }
...
}

4) However, when the statement must be skipped because it has already been applied, Rows_log_event::do_apply_event() returns before the rows_event_stmt_cleanup() call, so the parked event is never freed. See the following code:

int Rows_log_event::do_apply_event(Relay_log_info const *rli)
{
...
    enum_gtid_statement_status state= gtid_pre_statement_checks(thd);
    if (state == GTID_STATEMENT_EXECUTE)
    {
      if (gtid_pre_statement_post_implicit_commit_checks(thd))
        state= GTID_STATEMENT_CANCEL;
    }

    if (state == GTID_STATEMENT_CANCEL)
    {
      uint error= thd->get_stmt_da()->mysql_errno();
      DBUG_ASSERT(error != 0);
      rli->report(ERROR_LEVEL, error,
                  "Error executing row event: '%s'",
                 ...


Vlad Lesin (vlad-lesin) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1066
