handle_fatal_signal (sig=11) in handler::clone at sql/handler.cc:2370 | crash with concurrent I_S.TEMPORARY_TABLES query and online ALTER

Bug #1193308 reported by Roel Van de Paar on 2013-06-21
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Triaged
High
Laurynas Biveinis
5.1
Won't Fix
Low
Laurynas Biveinis
5.5
Invalid
Low
Laurynas Biveinis
5.6
Triaged
High
Laurynas Biveinis

Bug Description

Thread 11 (LWP 11898):
+bt
#0 0x0000003fde20c69c in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000006e711d in handle_fatal_signal (sig=11) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/signal_handler.cc:251
#2 <signal handler called>
#3 0x0000000000640ddc in handler::clone (this=0x1a5a6820, name=0x1a7b89e0 "./query011880/#sql-2d65_65", mem_root=0x18d818a8) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/handler.cc:2370
#4 0x00000000009e4fae in ha_innobase::clone (this=0x1a5a6820, name=0x1a7b89e0 "./query011880/#sql-2d65_65", mem_root=0x18d818a8) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/handler/ha_innodb.cc:5352
#5 0x00000000007a0040 in store_temporary_table_record (thd=0x18d7e2a0, table=0x21739aa0, tmp_table=0x1a7b7de0, db=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_show.cc:3950
#6 0x00000000007a0391 in fill_global_temporary_tables (thd=0x194bdf90, tables=0x18bff128, cond=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_show.cc:4035
#7 0x00000000007a04df in fill_temporary_tables (thd=0x194bdf90, tables=0x18bff128, cond=0x0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_show.cc:4069
#8 0x000000000079af4d in do_fill_table (thd=0x194bdf90, table_list=0x18bff128, join_table=0x1a7265c0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_show.cc:7684
#9 0x000000000079b236 in get_schema_tables_result (join=0x18c00528, executed_place=PROCESSED_BY_JOIN_EXEC) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_show.cc:7785
#10 0x0000000000793135 in JOIN::prepare_result (this=0x18c00528, columns_list=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_select.cc:823
#11 0x000000000074b6aa in JOIN::exec (this=0x18c00528) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_executor.cc:116
#12 0x0000000000793548 in mysql_execute_select (thd=0x194bdf90, select_lex=0x194c0a58, free_join=true) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_select.cc:1101
#13 0x0000000000793ca4 in mysql_select (thd=0x194bdf90, tables=0x18bff128, wild_num=0, fields=..., conds=<optimized out>, order=0x194c0c20, group=0x194c0b58, having=0x0, select_options=2684652288, result=0x18c00500, unit=0x194c0410, select_lex=0x194c0a58) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_select.cc:1222
#14 0x0000000000793e78 in handle_select (thd=0x194bdf90, result=0x18c00500, setup_tables_done_option=0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_select.cc:111
#15 0x0000000000768dad in execute_sqlcom_select (thd=0x194bdf90, all_tables=0x18bff128) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:5280
#16 0x000000000076b307 in mysql_execute_command (thd=0x194be078) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:2749
#17 0x0000000000770408 in mysql_parse (thd=0x194bdf90, rawbuf=0x194bdf98 "\240\244?\001", length=<optimized out>, parser_state=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:6451
#18 0x0000000000770ede in dispatch_command (command=<optimized out>, thd=0x194bdf90, packet=<optimized out>, packet_length=415231440) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:1376
#19 0x0000000000771b60 in do_command (thd=0x194bdf90) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:1051
#20 0x000000000073c48f in do_handle_one_connection (thd_arg=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_connect.cc:1615
#21 0x000000000073c550 in handle_one_connection (arg=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_connect.cc:1526
#22 0x00000000009c8e59 in pfs_spawn_thread (arg=0x20649f80) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/perfschema/pfs.cc:1853
#23 0x0000003fde207851 in start_thread () from /lib64/libpthread.so.0
#24 0x0000003fddee890d in clone () from /lib64/libc.so.6

Related branches

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

Observe Valgrind in error log.

Roel Van de Paar (roel11) wrote :

Looks like handler is seriously broken. Adding yet another one (+ see bug 1193264):

#4 0x0000003fdde34085 in abort () from /lib64/libc.so.6
#5 0x0000003fdde2ba1e in __assert_fail_base () from /lib64/libc.so.6
#6 0x0000003fdde2bae0 in __assert_fail () from /lib64/libc.so.6
#7 0x000000000063994e in handler::ha_thd (this=0x210367c0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/handler.cc:2414

Adding traces and error log.

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
tags: added: i-s-temp-tables
Download full text (4.0 KiB)

The handler is OK, it is INFORMATION_SCHEMA.TEMPORARY_TABLES queries which are not OK.

In thread 11 it must be table->s NULL dereference in handler::clone(). According to the temp name there, it is thread 4 that uses that table in parallel. Its stacktrace should be enough for a DEBUG_SYNC testcase and fix.

Thread 4 (LWP 11914):
+bt
#0 0x0000003fde20b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000a4aaa3 in os_cond_wait (cond=<optimized out>, fast_mutex=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/os/os0sync.cc:214
#2 0x0000000000a4ab6b in os_event_wait_low (event=0x11da4af0, reset_sig_count=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/os/os0sync.cc:610
#3 0x0000000000ae40d5 in sync_array_wait_event (arr=0x8c26dc0, index=2) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/sync/sync0arr.cc:424
#4 0x0000000000ae5dd1 in rw_lock_x_lock_func (lock=0x1588360 <dict_operation_lock>, pass=0, file_name=0xd86b20 "/bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/handler/handler0alter.cc", line=5359) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/sync/sync0rw.cc:710
#5 0x0000000000aa23a0 in pfs_rw_lock_x_lock_func (lock=0x1588360 <dict_operation_lock>, pass=0, file_name=0xd86b20 "/bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/handler/handler0alter.cc", line=5359) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/include/sync0rw.ic:594
#6 0x0000000000aa246d in row_mysql_lock_data_dictionary_func (trx=0x1994b2d0, file=<optimized out>, line=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/row/row0mysql.cc:2096
#7 0x00000000009f3b03 in ha_innobase::commit_inplace_alter_table (this=0x20d109e0, altered_table=0x1a7b7de0, ha_alter_info=0x21e5acd0, commit=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/handler/handler0alter.cc:5359
#8 0x00000000007bd133 in mysql_inplace_alter_table (thd=0x18d7e2a0, table_list=0x20abfe88, table=0x20cb1e30, altered_table=0x1a7b7de0, ha_alter_info=0x21e5acd0, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x21e5ab20, alter_ctx=0x21e59500) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_table.cc:6527
#9 0x00000000007c13fc in mysql_alter_table (thd=0x18d7e2a0, new_db=0x21e5ab20 "@\253\345!", new_name=0x21e5acd0 "\220\263\345!", create_info=0x21e5b390, table_list=0x20abfe88, alter_info=0x21e5b470, order_num=0, order=0x0, ignore=false) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_table.cc:8369
#10 0x00000000008b0b9c in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x18d7e2a0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_alter.cc:317
#11 0x000000000076ae9f in mysql_execute_command (thd=0x18d7e388) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:5132
#12 0x0000000000770408 in mysql_parse (thd=0x18d7e2a0, rawbuf=0x18d7e2a8 "\240\244?\001", length=<optimized out>, parser_state=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/sql_parse.cc:6451
#13 0x0000000000770ede in dispatch...

Read more...

tags: added: xtradb
summary: - handle_fatal_signal (sig=11) in handler::clone at sql/handler.cc:2370
+ handle_fatal_signal (sig=11) in handler::clone at sql/handler.cc:2370 |
+ crash with concurrent I_S.TEMPORARY_TABLES query and online ALTER

Comments #5-#8 are unrelated, they look like an upstream bug, and have been moved to bug 1193338.

Roel Van de Paar (roel11) wrote :

Adding another occurence

#1 0x00000000006ecc9d in handle_fatal_signal (sig=11) at /bzr/5.6_val/Percona-Server-5.6.12-rc60.4/sql/signal_handler.cc:251
#2 <signal handler called>
#3 0x000000000064598c in handler::clone (this=0x1a2c37f0, name=0x20718278 "./test/#sql-61a7_e", mem_root=0x1a0efed8) at /bzr/5.6_val/Percona-Server-5.6.12-rc60.4/sql/handler.cc:2375
#4 0x0000000000a014fe in ha_innobase::clone (this=0x1a2c37f0, name=0x20718278 "./test/#sql-61a7_e", mem_root=0x1a0efed8) at /bzr/5.6_val/Percona-Server-5.6.12-rc60.4/storage/innobase/handler/ha_innodb.cc:5344
#5 0x00000000007ac5e0 in store_temporary_table_record (thd=0x1a0ec8b0, table=0x8b15270, tmp_table=0x20717680, db=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.12-rc60.4/sql/sql_show.cc:3958

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

From error log of #11, #12, #13

==24999== Thread 28:
==24999== Invalid read of size 8
==24999== at 0x64598C: handler::clone(char const*, st_mem_root*) (handler.cc:2375)
==24999== by 0xA014FD: ha_innobase::clone(char const*, st_mem_root*) (ha_innodb.cc:5344)
==24999== by 0x7AC5DF: store_temporary_table_record(THD*, TABLE*, TABLE*, char const*) (sql_show.cc:3958)
==24999== by 0x7AC930: fill_global_temporary_tables(THD*, TABLE_LIST*, Item*) (sql_show.cc:4043)
==24999== by 0x7A74CC: do_fill_table(THD*, TABLE_LIST*, st_join_table*) (sql_show.cc:7714)
==24999== by 0x7A77B5: get_schema_tables_result(JOIN*, enum_schema_table_state) (sql_show.cc:7815)
==24999== by 0x79F254: JOIN::prepare_result(List<Item>**) (sql_select.cc:823)
==24999== by 0x756489: JOIN::exec() (sql_executor.cc:116)
==24999== by 0x79F667: mysql_execute_select(THD*, st_select_lex*, bool) (sql_select.cc:1101)
==24999== by 0x79FDC3: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, SQL_I_List<st_order>*, SQL_I_List<st_order>*, Item*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:1222)
==24999== by 0x79FF77: handle_select(THD*, select_result*, unsigned long) (sql_select.cc:111)
==24999== by 0x77449C: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5305)
==24999== by 0x777B6D: mysql_execute_command(THD*) (sql_parse.cc:2762)
==24999== by 0x77BC87: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6477)
==24999== by 0x77C81E: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1374)
==24999== by 0x77D4F4: do_command(THD*) (sql_parse.cc:1049)
==24999== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==24999==
14:35:16 UTC - mysqld got signal 11 ;
[...]
Query (1a25bce0): 2013-07-19T00:35:16 [25836] Query: SELECT @updpk:=`pk` FROM (SELECT `pk` FROM table42_int_autoinc WHERE `pk` <= 250 ORDER BY `col_int_not_null` ASC LIMIT 7) AS tbl

Roel Van de Paar (roel11) wrote :

[roel@qaserver 209184]$ cat cmd580
ps -ef | grep 'cmdrun_580' | grep -v grep | awk '{print $2}' | xargs sudo kill -9
rm -Rf /ssd/209184/cmdrun_580
mkdir /ssd/209184/cmdrun_580
cd /ssd/randgen
bash -c "set -o pipefail; perl /ssd/randgen/runall.pl --queries=100000000 --seed=2046 --duration=300 --querytimeout=60 --reporter=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm --mysqld=--log-output=none --mysqld=--sql_mode=ONLY_FULL_GROUP_BY --mysqld=--slow_query_log --mysqld=--userstat --threads=25 --basedir=
/ssd/Percona-Server-5.6.12-rc60.4-405-debug-valgrind.Linux.x86_64 --valgrind --reporter=ValgrindErrors --validator=MarkErrorLog --grammar=/tmp/976291/97.yy --gendata=/ssd/randgen/conf/percona/query_cache_strip_comments.zz --mtr-build-thread=772 --mask=49364 --vardir1=/ssd/209184/cmdrun_580 > /ssd/209184/cmdrun580.log 2>&1"

Roel Van de Paar (roel11) wrote :

yy file used

Roel Van de Paar (roel11) wrote :

From another failure;

Query (7f1030004c50): SELECT * FROM (INFORMATION_SCHEMA . `INNODB_CHANGED_PAGES` AS table1 LEFT JOIN INFORMATION_SCHEMA . `GLOBAL_TEMPORARY_TABLES` AS table2 ON ( table2 . `NAME` = table1 . `space_id` ) ) WHERE table2 . `TABLE_ROWS` < 't' LIMIT 1

tags: added: qablock
Roel Van de Paar (roel11) wrote :

Marked as critical as discussed with Laurynas - this crash is seen in many RQG runs

Roel Van de Paar (roel11) wrote :

MTR testcase

# Test for bug 1193308 (handle_fatal_signal (sig=11) in handler::clone at sql/handler.cc:2370 | crash with concurrent I_S.TEMPORARY_TABLES query and online ALTER)

--source include/have_innodb.inc
--source include/have_debug_sync.inc

--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

CREATE TABLE t1(a INT PRIMARY KEY, b INT, c INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1, 1, 1);

SET DEBUG_SYNC= 'innodb_alter_commit_after_lock_table SIGNAL alter_ready WAIT_FOR i_s_completed';
send ALTER TABLE t1 DROP COLUMN c, ALGORITHM=INPLACE;

connect (conn2,localhost,root,,);
connection conn2;

SET DEBUG_SYNC= 'now WAIT_FOR alter_ready';
SELECT COUNT(*) FROM INFORMATION_SCHEMA.GLOBAL_TEMPORARY_TABLES;
SET DEBUG_SYNC= 'now SIGNAL i_s_completed';

disconnect conn2;
connection default;
reap;

DROP TABLE t1;

The cause is handler->table == NULL, making handler::clone() fail. That's because the temp table for ALTER is created without opening it in the engine (open_table_from_share() arg db_stats = 0, because open_table_uncached() arg open_in_engine = false from mysql_alter_table()). And engine opening is what sets the handler->table pointer.

The fix is to detect this->table == NULL in clone() and return NULL, and then fixing bug 1206486.

This is 5.6-specific code. In 5.5 and 5.1 table opens without engine open can happen in some other code paths, which warrant the same fix there, but perhaps without the testcases.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers