Execution of stored procedures with subqueries leaks memory

Bug #1380985 reported by Vojtech Kurka on 2014-10-14
30
This bug affects 5 people
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
High
Vlad Lesin
5.1
Invalid
Undecided
Unassigned
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
High
Vlad Lesin

Bug Description

We have this problem:

with 50GB innodb buffer pool the server eats about 56GB of memory (RES) with the production workload – that’s OK. But then the memory consumption is increasing and after 3 days of runtime, it uses about 76GB of memory (RES) and the machine starts swapping out. I have to restart mysqld every 3 days.

We have the exact version of PerconaServer on the slave (without production workload, just the slave thread) and it doesn’t have this problem (54 GB mem RES after 7 days).

This problem is the same for a few recent versions of Percona server, not just the last one.

Server gets just normal OLTP traffic + mysqldump and xtrabackup nightly.
We DO use a few InnoDB tables with compression and we use some MyISAM tables with range partitioning.
TokuDB is not installed.

-- ----------------------------------------------

mysql> select version();
+-----------------+
| version() |
+-----------------+
| 5.6.21-69.0-log |
+-----------------+

CentOS release 6.4 (Final)
Linux xxxxx 2.6.32-358.14.1.el6.x86_64 #1 SMP Tue Jul 16 23:51:20 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

my.cnf:

[mysqld]

user=mysql
datadir=/home/users/our_project/mysql/data
tmpdir=/home/users/our_project/mysql/data
innodb_data_home_dir=/home/users/our_project/mysql/innodb
innodb_log_group_home_dir=/home/users/our_project/mysql/innodb

log-error=/var/log/mysqld.log

character-set-server=UTF8
collation_server=utf8_general_ci
init_connect='SET NAMES utf8'

sql-mode="STRICT_ALL_TABLES"

query_cache_type=0

local_infile=0
lock_wait_timeout=5
skip-name-resolve
explicit_defaults_for_timestamp=0

max_binlog_size=1000M
expire_logs_days=3
binlog-format=ROW
binlog_row_image=minimal
log-slave-updates
binlog_cache_size=512000

binlog-ignore-db=temp
replicate-wild-ignore-table=temp.%
replicate-wild-ignore-table=mysql.%

slave_type_conversions=ALL_NON_LOSSY

sync_binlog=1
sync_relay_log=100

relay-log-recovery=1
relay_log_info_repository=TABLE

max_connections=1000
max_connect_errors=10000

back_log=3000
thread_cache_size=100
max_allowed_packet=134217728

myisam-recover-options=FORCE
delay-key-write=OFF

table_open_cache=10000
table_definition_cache=2000

loose-innodb=FORCE
innodb_file_per_table
innodb_file_format=Barracuda
innodb_flush_method=O_DIRECT
innodb_open_files=10000
innodb_stats_on_metadata=0

innodb_log_file_size=256M
innodb_max_dirty_pages_pct=30
innodb_io_capacity=2000
innodb_io_capacity_max=10000

innodb_strict_mode=1
innodb_buffer_pool_size=50G
innodb_buffer_pool_dump_at_shutdown=1
innodb_buffer_pool_load_at_startup=1
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=64

innodb_purge_threads=1

innodb_stats_persistent=1
innodb_stats_persistent_sample_pages=500

innodb_checksum_algorithm=crc32

innodb_flush_neighbors=0

tmp_table_size=100M
max_heap_table_size=100M

event_scheduler=1

server-id=1
log-bin=/home/users/klarka/mysql/binlog/klarka1-bin
master-info-file=/home/users/klarka/mysql/info/klarka1-master.info

relay-log=/home/users/klarka/mysql/relay/klarka1-relay-bin
relay-log-info-file=/home/users/klarka/mysql/info/klarka1-relay.info

Thank you for the problem report. Please, send the outputs of:

show global status;
show engine innodb status\G

Also I wonder if memory use changes notably after you execute FLUSH TABLES statement.

Changed in percona-server:
status: New → Incomplete
Vojtech Kurka (vojtech-kurka) wrote :

 VIRT RES
 69.9G 59.4G
 68.2G 57.2G // after FLUSH TABLES

Vojtech Kurka (vojtech-kurka) wrote :
Changed in percona-server:
status: Incomplete → New
Vojtech Kurka (vojtech-kurka) wrote :

Today:

mysql> show status like 'uptime';
+---------------+--------+
| Variable_name | Value |
+---------------+--------+
| Uptime | 170481 |
+---------------+--------+

VIRT RES
 70.7G 60.7G
 70.3G 59.9G // after FLUSH TABLES

Vojtech Kurka (vojtech-kurka) wrote :

Today:

mysql> show status like 'uptime';
+---------------+--------+
| Variable_name | Value |
+---------------+--------+
| Uptime | 255891 |
+---------------+--------+

VIRT RES
 74.3G 66.2G
 72.1G 61.5G // after FLUSH TABLES

So, this is mostly related to a table cache content. It may also be related to temporary tables in memory, with these settings:

tmp_table_size=100M
max_heap_table_size=100M

Next time, please, try to capture the output of

show engine innodb status\G

just before and immediately after FLUSH TABLES. I want to check if we get any changes in InnoDB memory allocation.

Vojtech Kurka (vojtech-kurka) wrote :

On Friday, I set most of the variables to the default and restarted mysqld. These are the stats:

Monday
70.1G 63.0G
69.0G 60.6G // after FLUSH TABLES

Wednesday
71.9G 63.7G
71.2G 62.3G // after FLUSH TABLES

I'll attach the innodb status output.

Vojtech Kurka (vojtech-kurka) wrote :
Vojtech Kurka (vojtech-kurka) wrote :
Vojtech Kurka (vojtech-kurka) wrote :

Current config:

[mysqld]

user=mysql
datadir=/home/users/klarka/mysql/data
tmpdir=/home/users/klarka/mysql/data
innodb_data_home_dir=/home/users/klarka/mysql/innodb
innodb_log_group_home_dir=/home/users/klarka/mysql/innodb

log-error=/var/log/mysqld.log

character-set-server=UTF8
collation_server=utf8_general_ci
init_connect='SET NAMES utf8'

sql-mode="STRICT_ALL_TABLES"

local_infile=0
lock_wait_timeout=5
skip-name-resolve
explicit_defaults_for_timestamp=0

expire_logs_days=3
binlog-format=ROW
binlog_row_image=minimal
log-slave-updates
binlog_cache_size=512000

binlog-ignore-db=temp
replicate-wild-ignore-table=temp.%
replicate-wild-ignore-table=mysql.%

slave_type_conversions=ALL_NON_LOSSY

sync_binlog=1
sync_relay_log=100

relay-log-recovery=1
relay_log_info_repository=TABLE

max_connections=1000
max_connect_errors=10000

back_log=3000
thread_cache_size=100
max_allowed_packet=134217728

myisam-recover-options=FORCE
delay-key-write=OFF

loose-innodb=FORCE
innodb_file_format=Barracuda
innodb_flush_method=O_DIRECT

innodb_log_file_size=256M
innodb_max_dirty_pages_pct=30
innodb_io_capacity=10000
innodb_io_capacity_max=20000

innodb_strict_mode=1
innodb_buffer_pool_size=50G
innodb_buffer_pool_dump_at_shutdown=1
innodb_buffer_pool_load_at_startup=1
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=64

innodb_stats_persistent_sample_pages=500
innodb_checksum_algorithm=crc32
innodb_flush_neighbors=0

event_scheduler=1

server-id=1
log-bin=/home/users/project/mysql/binlog/project-bin
master-info-file=/home/users/project/mysql/info/project-master.info

relay-log=/home/users/project/mysql/relay/project-relay-bin
relay-log-info-file=/home/users/project/mysql/info/project-relay.info

Vojtech Kurka (vojtech-kurka) wrote :

One week later:

Wed 29 Oct

VIRT RES
99.4G 90.2G
91.8G 81.4G // after FLUSH TABLES

Nick Lott (zick) wrote :

We have a very similar sounding problem, with similar, though lower, memory and settings.

We've tried to isolate the cause by process of elimination but failed so far. We've reduced memory settings so that various reckoners (mysqltuner etc) say max mem usage is about 60% of available physical ram, but usage continues to creep up until after 3-4 days it starts to swap and we must restart mysql before the oom killer gets it.

I suspect this is an upstream problem (or set of problems) given the number of people reporting similar things for mysql 5.6, but I though I would let you know you're not alone. And it is very frustrating.

tags: added: i48579
Peiran Song (peiran-song) wrote :

i48579 reported a potential memory leak on Percona server 5.6.21. The workload was primarily stored procedures. Massif output from a test server was provided in the issue, which didn't show obvious problem. It would be helpful if any developer would give it a check as well.

Peiran Song (peiran-song) wrote :
Download full text (4.8 KiB)

For i48579, the massif output showed that add_derived_key()/update_derived_keys() function had an increased memory consumption for the most part. Thus, the customer changed a query in the stored procedure to use temporary table instead of subquery(using derived table). The stress test on a small test server showed stable memory usage for over 12+ hours.

The max used memory by this function was 30M in the massif output. Due to the small scale of the test, we know now that this function is part of the problem but unknown yet if it is the main problem.

Here is the related massif output for reference:

| | | ->17.34% (29,906,400B) 0x88509D: add_derived_key(List<Derived_key>&, Field*, unsigned long long) (table.cc:6153)
| | | | ->11.57% (19,951,200B) 0x885342: TABLE_LIST::update_derived_keys(Field*, Item**, unsigned int) (table.cc:6217)
| | | | | ->11.57% (19,951,200B) 0x9B64DE: add_key_field(Key_field**, unsigned int, Item_func*, Field*, bool, Item**, unsigned int, unsigned long long, st_sargable_param**) (sql_optimizer.cc:4885)
| | | | | ->11.57% (19,951,200B) 0x9B6BD8: add_key_equal_fields(Key_field**, unsigned int, Item_func*, Item_field*, bool, Item**, unsigned int, unsigned long long, st_sargable_param**) (sql_optimizer.cc:5068)
| | | | | ->11.57% (19,951,200B) 0x9B76F3: add_key_fields(JOIN*, Key_field**, unsigned int*, Item*, unsigned long long, st_sargable_param**) (sql_optimizer.cc:5280)
| | | | | ->11.57% (19,951,200B) 0x9B7F5D: update_ref_and_keys(THD*, Mem_root_array<Key_use, true>*, st_join_table*, unsigned int, Item*, COND_EQUAL*, unsigned long long, st_select_lex*, st_sargable_param**) (sql_optimizer.cc:5891)
| | | | | ->11.57% (19,951,200B) 0x9B9243: make_join_statistics(JOIN*, TABLE_LIST*, Item*, Mem_root_array<Key_use, true>*, bool) (sql_optimizer.cc:3329)
| | | | | ->11.57% (19,951,200B) 0x9BF50B: JOIN::optimize() (sql_optimizer.cc:363)
| | | | | ->11.57% (19,951,200B) 0x7FF5E2: mysql_execute_select(THD*, st_select_lex*, bool) (sql_select.cc:1086)
| | | | | ->11.57% (19,951,200B) 0x7FFFE9: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, SQL_I_List<st_order>*, SQL_I_List<st_order>*, Item*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:1221)
| | | | | ->11.57% (19,951,200B) 0x8001E5: handle_select(THD*, select_result*, unsigned long) (sql_select.cc:110)
| | | | | ->11.57% (19,951,200B) 0x7C63BA: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5103)
| | | | | ->11.57% (19,951,200B) 0x7C78E5: mysql_execute_command(THD*) (sql_parse.cc:2649)
| | | | | ->11.57% (19,951,200B) 0x789EF8: mysql_open_cursor(THD*, select_result*, Server_side_cursor**) (sql_cursor.cc:118)
| | | | | ->11.57% (19,951,200B) 0x725EFD: sp_cursor::open(THD*) (sp_rcontext.cc:493)
| | | | | ->11.57% (19,951,200B) 0x97FAB8: sp_instr_cpush::exec_core(THD*, unsigned int*) (sp_instr.cc:1471)
| | | | | ->11.57% (19,951,200B) 0x981657: sp_lex_instr::reset_lex_and_exec_core(THD*, unsigned int*, b...

Read more...

Peiran Song (peiran-song) wrote :

Test data:
-----------------------
Download at: https://www.dropbox.com/s/wi4dsk8kaw96i1m/testcase_48579.tar.gz?dl=0

Query type: stored procedure with queries that use derived table(s).

Test result:
--------------------

1) with stock MySQL 5.6.21, MySQL 5.6.22
Steady memory increase at 3~5M per minute

2) with PS MySQL 5.6.21
Steady memory increase at 20M per minute

Attach two massif output for reference.

Peiran Song (peiran-song) wrote :
Peiran Song (peiran-song) wrote :

Forgot to acknowledge that test data is provided by customer of i48579.

Vlad Lesin (vlad-lesin) wrote :
Download full text (17.5 KiB)

This is indeed memory leak. I have repeat it with the test case provided by the customer.

The analyzes is based on this code https://github.com/percona/percona-server/commit/a68f179005da7c11864f406d5c7fb0df48a60b69 .

============================<1>=================================
The instruction (sp_instr_cpush) mem_root is set to persistent memory root (main_mem_root) during SP compilation

bool sp_head::add_instr(THD *thd, sp_instr *instr)
{
...
  /*
    Memory root of every instruction is designated for permanent
    transformations (optimizations) made on the parsed tree during
    the first execution. It points to the memory root of the
    entire stored procedure, as their life span is equal.
  */
  instr->mem_root= get_persistent_mem_root();
...
}

(gdb) bt
#0 sp_instr_cpush::sp_instr_cpush (this=0x7fffc0059a10, ip=4, ctx=0x7fffc0047640, cursor_lex=0x7fffc0053800, cursor_query=..., cursor_idx=0)
    at ./sql/sp_instr.h:1209
#1 0x00000000008d3c43 in MYSQLparse (YYTHD=0x1a6ea70) at ./sql/sql_yacc.yy:3277
#2 0x00000000007e2e58 in parse_sql (thd=0x1a6ea70, parser_state=0x7ffff40a7b50, creation_ctx=0x7fffc0005890)
    at ./sql/sql_parse.cc:8935
#3 0x0000000000724731 in sp_compile (thd=0x1a6ea70, defstr=0x7ffff40a7de0, sql_mode=1073741824, creation_ctx=0x7fffc0005890)
    at ./sql/sp.cc:774
#4 0x0000000000724c54 in db_load_routine (thd=0x1a6ea70, type=SP_TYPE_PROCEDURE, name=0x7ffff40a9690, sphp=0x7ffff40a98b8, sql_mode=1073741824,
    params=0x7fffc0005158 "\n IN p_user_id BIGINT, \n OUT p_attributes VARCHAR(32767)", returns=0xf89290 "",
    body=0x7fffc00051a0 "mainpgm: BEGIN \n\nDECLARE v_continue_flag INT;\nDECLARE c_delim_pipe VARCHAR(10) DEFAULT '|';\nDECLARE c_delim_equal VARCHAR(10) DEFAULT '=';\nDECLARE v_not_found_attribute VARCH"..., chistics=..., definer=0x7fffc0005830 "sp_user@%",
    created=20150309002751, modified=20150309002751, creation_ctx=0x7fffc0005890) at ./sql/sp.cc:907
#5 0x0000000000724470 in db_find_routine (thd=0x1a6ea70, type=SP_TYPE_PROCEDURE, name=0x7ffff40a9690, sphp=0x7ffff40a98b8)
    at ./sql/sp.cc:688
#6 0x00000000007282f9 in sp_cache_routine (thd=0x1a6ea70, type=SP_TYPE_PROCEDURE, name=0x7ffff40a9690, lookup_only=false, sp=0x7ffff40a98b8)
    at ./sql/sp.cc:2077
#7 0x000000000072812a in sp_cache_routine (thd=0x1a6ea70, rt=0x7fffc0004d90, lookup_only=false, sp=0x7ffff40a98b8)
    at ./sql/sp.cc:2029
#8 0x0000000000767283 in open_and_process_routine (thd=0x1a6ea70, prelocking_ctx=0x1a70fc0, rt=0x7fffc0004d90,
    prelocking_strategy=0x7ffff40a9ac0, has_prelocking_list=false, ot_ctx=0x7ffff40a99d0, need_prelocking=0x7ffff40a9970,
    routine_modifies_data=0x7ffff40a996f) at ./sql/sql_base.cc:4481
#9 0x00000000007688ba in open_tables (thd=0x1a6ea70, start=0x7ffff40a9a40, counter=0x7ffff40a9a54, flags=0, prelocking_strategy=0x7ffff40a9ac0)
    at ./sql/sql_base.cc:5299
#10 0x000000000076975c in open_and_lock_tables (thd=0x1a6ea70, tables=0x0, derived=true, flags=0, prelocking_strategy=0x7ffff40a9ac0)
    at ./sql/sql_base.cc:5882
#11 0x000000000075b7b4 in open_and_lock_tables (thd=0x1a6ea70, tables=0x0, derived=true, flags=0)
    at ./sql/sql_base.h:477
#12 0...

tags: added: upstream
summary: - Possible memory leak
+ Execution of stored procedures with subqueries leaks memory
Vojtech Kurka (vojtech-kurka) wrote :

I confirm that the fix works! Thank you guys, it's awesome that you make mysql better.

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-826

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.