Querying GLOBAL_TEMPORARY_TABLES crashes if temp-table owning threads execute new queries

Bug #1581949 reported by Arturas Moskvinas
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Fix Released
High
Laurynas Biveinis
5.6
Fix Released
High
Laurynas Biveinis
5.7
Fix Released
High
Laurynas Biveinis

Bug Description

One of our production DB server was constantly crashing. Service is extensively using temporary tables in sessions and randomly server would crash if we were selecting information from GLOBAL_TEMPORARY_TABLES view.
Select:
SELECT engine, data_length, index_length FROM GLOBAL_TEMPORARY_TABLES;

How to reproduce issue:
1. In loop start selecting data from GLOBAL_TEMPORARY_TABLES
2. In second session - start creating lots of temporary tables

Simple script (sorry not mysql test suite):
1. In one session start such simple select (better reproducible if several sessions do the same):
watch -n 0.1 "mysql information_schema -e 'SELECT engine, data_length, index_length FROM GLOBAL_TEMPORARY_TABLES'"

2. In another session start creating lots of temporary tables (tested with 300, but usually mysql crashes faster, from time to time creation has to be repeated):
```
echo "Create temporary table a1(a varchar(256));Create temporary table a2(a varchar(256));Create temporary table a3(a varchar(256));Create temporary table a4(a varchar(256));Create temporary table a5(a varchar(256));Create temporary table a6(a varchar(256));Create temporary table a7(a varchar(256));Create temporary table a8(a varchar(256));Create temporary table a9(a varchar(256));Create temporary table a10(a varchar(256));Create temporary table a11(a varchar(256));Create temporary table a12(a varchar(256));Create temporary table a13(a varchar(256));Create temporary table a14(a varchar(256));Create temporary table a15(a varchar(256));Create temporary table a16(a varchar(256));Create temporary table a17(a varchar(256));Create temporary table a18(a varchar(256));Create temporary table a19(a varchar(256));Create temporary table a20(a varchar(256));Create temporary table a21(a varchar(256));Create temporary table a22(a varchar(256));Create temporary table a23(a varchar(256));Create temporary table a24(a varchar(256));Create temporary table a25(a varchar(256));Create temporary table a26(a varchar(256));...." | mysql test
```

Mysql error log:
```
11:29:47 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=11
max_threads=4098
thread_count=8
connection_count=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1646765 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fec12b09000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fec690dae48 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8ba78c]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x646369]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fec6b4a48d0]
/usr/sbin/mysqld(_Z21open_table_from_shareP3THDP11TABLE_SHAREPKcjjjP5TABLEb+0x3b2)[0x7464f2]
/usr/sbin/mysqld(_Z19open_table_uncachedP3THDPKcS2_S2_bb+0x188)[0x67ab18]
/usr/sbin/mysqld[0x7144b7]
/usr/sbin/mysqld(_Z26mysql_create_table_no_lockP3THDPKcS2_P24st_ha_create_informationP10Alter_infojPb+0xf8)[0x714ad8]
/usr/sbin/mysqld(_Z18mysql_create_tableP3THDP10TABLE_LISTP24st_ha_create_informationP10Alter_info+0xcc)[0x714c6c]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4f80)[0x6c8c50]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x5d8)[0x6c9dd8]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x109b)[0x6cb52b]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x6981a2]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x698240]
/usr/sbin/mysqld(pfs_spawn_thread+0x143)[0x8f5a83]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fec6b49d0a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fec6946f87d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7febd541d010): is an invalid pointer
Connection ID (thread ID): 430
Status: NOT_KILLED
```

We also have core dump, stack trace from core dump is in attached file.

We also have core dump from original crash, stack trace is actually different and is caused by null reference during `SELECT engine, data_length, index_length FROM GLOBAL_TEMPORARY_TABLES` execution

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

We're using percona 5.6.29-76.2 (latest 5.6 at the moment)

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Stack trace from original crash, not sure if it is actually related to my repeatable case or totally unrelated and caused by another issue.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Please post the stacktraces for all the threads from your core dump ("thread apply all bt" in gdb)

Changed in percona-server:
status: New → Incomplete
tags: added: i-s-temp-tables
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

full stack #2 crash must be due to I_S query catching Thread 26 at a wrong time

Thread 26 (Thread 0x7f09f47f5700 (LWP 115689)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f09f42c84ef in _L_lock_1081 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f09f42c8468 in __GI___pthread_mutex_lock (mutex=0x7ef1be3db0e0) at ../nptl/pthread_mutex_lock.c:134
#3 0x000000000067b077 in inline_mysql_mutex_lock (src_line=2194, src_file=0xb812a0 "/mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_base.cc", that=0x7ef1be3db0e0) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/include/mysql/psi/mysql_thread.h:688
#4 close_temporary_table (thd=thd@entry=0x7ef1be3db000, table=table@entry=0x7ef1bfceec00, free_share=free_share@entry=true, delete_table=delete_table@entry=true) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_base.cc:2194
#5 0x000000000067b12f in drop_temporary_table (thd=thd@entry=0x7ef1be3db000, table_list=table_list@entry=0x7ef1bffeb2a0, is_trans=is_trans@entry=0x7f09f47f1072) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_base.cc:2177
#6 0x000000000071af22 in mysql_rm_table_no_locks (thd=thd@entry=0x7ef1be3db000, tables=tables@entry=0x7ef1bffeb2a0, if_exists=true, drop_temporary=false, drop_view=drop_view@entry=false, dont_log_query=dont_log_query@entry=false) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_table.cc:2387
#7 0x000000000071c32e in mysql_rm_table (thd=thd@entry=0x7ef1be3db000, tables=tables@entry=0x7ef1bffeb2a0, if_exists=<optimized out>, drop_temporary=<optimized out>) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_table.cc:2224
#8 0x00000000006c64c8 in mysql_execute_command (thd=thd@entry=0x7ef1be3db000) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sql_parse.cc:4105
#9 0x00000000007e9d13 in sp_instr_stmt::exec_core (this=0x7ef1bffe3ad8, thd=0x7ef1be3db000, nextp=0x7f09f47f2384) at /mnt/workspace/percona-server-5.6-debian-binary/label_exp/debian-jessie-64bit/percona-server-5.6-5.6.29-76.2/sql/sp_instr.cc:953
...

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

In full stack #1, I_S query has already completed and left trashed state for the temp table-creating query.

This should be enough info to fix case #2. Case #1 will be seen if there's enough data.

Changed in percona-server:
status: Incomplete → Triaged
importance: Undecided → High
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

So it is actually two different crashes involving similar functionality?

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

At the moment that's how it seems.

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Btw, were you able to reproduce crash? Because test case #1 - is the one which I can quite constantly reproduce...

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Download full text (3.6 KiB)

MTR testcase:

delimiter |;

CREATE PROCEDURE query_temp_tables ()
  WHILE TRUE DO
    SELECT * FROM INFORMATION_SCHEMA.GLOBAL_TEMPORARY_TABLES;
  END WHILE|

delimiter ;|

connect (con1,localhost,root,,);
--let $con1_id= `SELECT connection_id()`
send CALL query_temp_tables();

connection default;

--let $i=100
while ($i)
{
  --eval CREATE TEMPORARY TABLE tmp_$i (a VARCHAR(256))
  --dec $i
}

--echo KILL QUERY \$con1_id
--disable_query_log
eval KILL QUERY $con1_id;
--enable_query_log

connection con1;
--error ER_QUERY_INTERRUPTED
reap;
disconnect con1;

connection default;

DROP PROCEDURE query_temp_tables;

Current 5.6 trunk with AddressSanitizer gives:

Version: '5.6.30-76.3-debug-log' socket: '/Users/laurynas/percona/obj-percona-5.6-asan/mysql-test/var/tmp/mysqld.1.sock' port: 13001 MySQL Community Server (GPL)
ASAN:DEADLYSIGNAL
=================================================================
==53577==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00010da1b521 bp 0x700000975790 sp 0x700000975640 T21)
    #0 0x10da1b520 in ha_myisam::open(char const*, int, unsigned int) ha_myisam.cc:750
    #1 0x10ce1dce2 in handler::ha_open(TABLE*, char const*, int, int) handler.cc:2744
    #2 0x10ce1d915 in handler::clone(char const*, st_mem_root*) handler.cc:2658
    #3 0x10da1b21d in ha_myisam::clone(char const*, st_mem_root*) ha_myisam.cc:654
    #4 0x10d2f53ee in store_temporary_table_record(THD*, TABLE*, TABLE*, char const*) sql_show.cc:4175
    #5 0x10d2f4b92 in fill_global_temporary_tables(THD*, TABLE_LIST*, Item*) sql_show.cc:4266
    #6 0x10d30ad65 in do_fill_table(THD*, TABLE_LIST*, st_join_table*) sql_show.cc:8013
    #7 0x10d30a91c in get_schema_tables_result(JOIN*, enum_schema_table_state) sql_show.cc:8115
    #8 0x10d2bb6cb in JOIN::prepare_result(List<Item>**) sql_select.cc:823
    #9 0x10d17b606 in JOIN::exec() sql_executor.cc:116
    #10 0x10d2bd343 in mysql_execute_select(THD*, st_select_lex*, bool) sql_select.cc:1101
    #11 0x10d2ba759 in mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, SQL_I_List<st_order>*, SQL_I_List<st_order>*, Item*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) sql_select.cc:1222
    #12 0x10d2ba1b1 in handle_select(THD*, select_result*, unsigned long) sql_select.cc:101
    #13 0x10d234926 in execute_sqlcom_select(THD*, TABLE_LIST*) sql_parse.cc:5683
    #14 0x10d2272f1 in mysql_execute_command(THD*) sql_parse.cc:3005
    #15 0x10d07f1a6 in sp_instr_stmt::exec_core(THD*, unsigned int*) sp_instr.cc:953
    #16 0x10d07baf6 in sp_lex_instr::reset_lex_and_exec_core(THD*, unsigned int*, bool) sp_instr.cc:400
    #17 0x10d07cf55 in sp_lex_instr::validate_lex_and_execute_core(THD*, unsigned int*, bool) sp_instr.cc:651
    #18 0x10d07d979 in sp_instr_stmt::execute(THD*, unsigned int*) sp_instr.cc:841
    #19 0x10d072ee4 in sp_head::execute(THD*, bool) sp_head.cc:648
    #20 0x10d0764f3 in sp_head::execute_procedure(THD*, List<Item>*) sp_head.cc:1321
    #21 0x10d22ef04 in mysql_execute_command(THD*) sql_parse.cc:5118
    #22 0x10d2236c5 in mysql_parse(THD*, char*, unsigned int, Parser_state*) sql_parse.cc:6968
    #23 0x10d21de16 in dispatch_command(e...

Read more...

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

5.5 appears to survive the testcase

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Artūras, what is the temp table engine in your tests? Currently I have managed to crash it only with MyISAM.

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Innodb in both cases:
```
default-storage-engine=innodb
```

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Can you please double-check your default-tmp-storage-engine setting too as that one overrides default-storage-engine for temp tables?

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

It is also innodb, but we do not set it explicitly in configuration (my.cnf), seems to inherits value from default_storage_engine.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Download full text (3.9 KiB)

Valgrind on the same testcase

==19862== Thread 21:
==19862== Invalid write of size 8
==19862== at 0xBAAB01: ha_myisam::open(char const*, int, unsigned int) (ha_myisam.cc:748)
==19862== by 0x658CB9: handler::ha_open(TABLE*, char const*, int, int) (handler.cc:2744)
==19862== by 0x658840: handler::clone(char const*, st_mem_root*) (handler.cc:2658)
==19862== by 0xBAA966: ha_myisam::clone(char const*, st_mem_root*) (ha_myisam.cc:655)
==19862== by 0x86ECE2: store_temporary_table_record(THD*, TABLE*, TABLE*, char const*) (sql_show.cc:4175)
==19862== by 0x86F286: fill_global_temporary_tables(THD*, TABLE_LIST*, Item*) (sql_show.cc:4266)
==19862== by 0x87F61F: do_fill_table(THD*, TABLE_LIST*, st_join_table*) (sql_show.cc:8013)
==19862== by 0x87FA35: get_schema_tables_result(JOIN*, enum_schema_table_state) (sql_show.cc:8115)
==19862== by 0x84DC11: JOIN::prepare_result(List<Item>**) (sql_select.cc:823)
==19862== by 0x7E335B: JOIN::exec() (sql_executor.cc:116)
==19862== by 0x84E879: mysql_execute_select(THD*, st_select_lex*, bool) (sql_select.cc:1101)
==19862== by 0x84EBB4: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, SQL_I_List<st_order>*, SQL_I_List<st_order>*, Item*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:1222)
==19862== by 0x84C884: handle_select(THD*, select_result*, unsigned long) (sql_select.cc:110)
==19862== by 0x8204D3: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5683)
==19862== by 0x81854F: mysql_execute_command(THD*) (sql_parse.cc:3005)
==19862== by 0x9F2397: sp_instr_stmt::exec_core(THD*, unsigned int*) (sp_instr.cc:953)
==19862== Address 0x15a40f58 is 1,064 bytes inside a block of size 1,168 free'd
==19862== at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==19862== by 0xB2FED2: my_free (my_malloc.c:140)
==19862== by 0xB26C79: free_root (my_alloc.c:373)
==19862== by 0x815FC1: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1899)
==19862== by 0x813710: do_command(THD*) (sql_parse.cc:1053)
==19862== by 0x7D6687: do_handle_one_connection(THD*) (sql_connect.cc:1541)
==19862== by 0x7D611F: handle_one_connection (sql_connect.cc:1444)
==19862== by 0xB989B2: pfs_spawn_thread (pfs.cc:1860)
==19862== by 0x5F656F9: start_thread (pthread_create.c:333)
==19862== by 0x6B22B5C: clone (clone.S:109)
==19862== Block was alloc'd at
==19862== at 0x4C2DB8F: malloc (vg_replace_malloc.c:299)
==19862== by 0xB2FA42: my_malloc (my_malloc.c:38)
==19862== by 0xB26357: alloc_root (my_alloc.c:173)
==19862== by 0x644476: Sql_alloc::operator new(unsigned long, st_mem_root*) (sql_alloc.h:40)
==19862== by 0xBA847B: myisam_create_handler(handlerton*, TABLE_SHARE*, st_mem_root*) (ha_myisam.cc:137)
==19862== by 0x653C56: get_new_handler(TABLE_SHARE*, st_mem_root*, handlerton*) (handler.cc:476)
==19862== by 0x65876E: handler::clone(char const*, st_mem_root*) (handler.cc:2636)
==19862== by 0xBAA966: ha_myisam::clone(char const*, st_mem_root*) (ha_myisam.cc:655)
==19862== by 0x86ECE2: store_temporary_table_record(THD*, TABLE*, TABLE*, char const*) (sql_show.cc:4175)
==1...

Read more...

summary: - Reproducible Percona Server 5.6 by selecting data from
- GLOBAL_TEMPORARY_TABLES when Temporary Tables are created
+ Querying GLOBAL_TEMPORARY_TABLES crashes if temp-table owning threads
+ execute new queries
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Just wondering is both crashes are fixed by your fix?

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Never in my reproduction attempts I saw stacktraces resembling your case #1. I have added the equivalent of your reproduction script as a stress testcase to our codebase, and it hasn't crashed not once yet after fixing case #2.

Thus there's chance your case #1 is not fixed yet and we will iterate accordingly.

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Ok let see how it goes. Thanks!

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-982

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.