Killing a query inside InnoDB causes it to eventually crash with an assertion

Bug #1134757 reported by Ammon Sutherland on 2013-02-27
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
High
Unassigned
5.1
Fix Released
High
Unassigned
5.5
Fix Released
High
Unassigned
5.6
Fix Released
High
Unassigned

Bug Description

RHEL 5.7, fully patched as of 2/27/2013.
Percona Server rhel5 rpm's for 5.1.67-rel14.3.506
Server is a slave, DB size = 2.8TB over 9600 tables 95% innodb, 5% myisam.

The following has happened twice today after upgrading from percona's 5.1.66-rel14.1.495 to 5.1.67-rel14.3.506 .

Stack trace and error message in in error log:

130227 13:16:27 InnoDB: Assertion failure in thread 1385572672 in file btr/btr0pcur.c line 245
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
19:16:27 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=4294967296
read_buffer_size=262144
max_used_connections=281
max_threads=3000
thread_count=150
connection_count=150
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 11139585 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2abf00830eb0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 529620f8 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8ab8f5]
/usr/sbin/mysqld(handle_fatal_signal+0x40b)[0x6a77cb]
/lib64/libpthread.so.0[0x31fdc0eb10]
/lib64/libc.so.6(gsignal+0x35)[0x31fd830265]
/lib64/libc.so.6(abort+0x110)[0x31fd831d10]
/usr/sbin/mysqld[0x7f4349]
/usr/sbin/mysqld[0x7a40e8]
/usr/sbin/mysqld[0x7a4912]
/usr/sbin/mysqld[0x73daae]
/usr/sbin/mysqld(_ZN7handler21read_multi_range_nextEPP18st_key_multi_range+0x2d)[0x69912d]
/usr/sbin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x41)[0x6802e1]
/usr/sbin/mysqld(_ZN26QUICK_ROR_INTERSECT_SELECT8get_nextEv+0xb9)[0x680ee9]
/usr/sbin/mysqld[0x694ec9]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x6c)[0x61f8bc]
/usr/sbin/mysqld[0x62477d]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0xc2d)[0x63624d]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x12e)[0x63216e]
/usr/sbin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x1bd)[0x6379ed]
/usr/sbin/mysqld[0x5cd08f]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x441)[0x5cebb1]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjPPKc+0x4fb)[0x5d41cb]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xde5)[0x5d53f5]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x126)[0x5d5e06]
/usr/sbin/mysqld(handle_one_connection+0x348)[0x5c9048]
/lib64/libpthread.so.0[0x31fdc0673d]
/lib64/libc.so.6(clone+0x6d)[0x31fd8d3f6d]

InnoDB Variables:

innodb_file_format=barracuda
innodb_buffer_pool_size = 72G
innodb_additional_mem_pool_size = 1G
innodb_log_buffer_size = 128M
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 16
innodb_flush_method = ALL_O_DIRECT
innodb_file_per_table = 1
innodb_sync_spin_loops=200
innodb_stats_on_metadata=0
innodb_stats_method=nulls_unequal
innodb_stats_sample_pages=32
innodb_io_capacity=5000
innodb_open_files=8000
innodb_flush_neighbor_pages=1
innodb_adaptive_flushing=1
innodb_read_io_threads=10
innodb_write_io_threads=10
innodb_autoinc_lock_mode=2
innodb_stats_update_need_lock=0
expand_fast_index_creation=1
innodb_old_blocks_time=1000
innodb_concurrency_tickets=1000
innodb_fast_checksum=1
innodb_log_block_size=4096
transaction-isolation = READ-COMMITTED

Thank you for the problem report. Please, attach part of the error log since last successful server startup and up to restart after this crash. I wonder if we will see the exact SELECT that crashed and some more details about the problem before the assertion.

Assertion happened in the btr_pcur_restore_position_func():

        if (UNIV_UNLIKELY(cursor->old_stored != BTR_PCUR_OLD_STORED)
            || UNIV_UNLIKELY(cursor->pos_state != BTR_PCUR_WAS_POSITIONED
                             && cursor->pos_state != BTR_PCUR_IS_POSITIONED)) {
                ut_print_buf(stderr, cursor, sizeof(btr_pcur_t));
                putc('\n', stderr);
                if (cursor->trx_if_known) {
                        trx_print(stderr, cursor->trx_if_known, 0);
                }

                ut_error;
        }

Each time there is a select in the logs prior to the crash, sorry to not include it before as it looked like the timestamp was wrong but that was for another line.

130227 11:30:29 [ERROR] /usr/sbin/mysqld: Sort aborted
 len 224; hex 38b473d4be2a000063c0986bb62a00006012d3efab2a0000446174654164646564203d206e6f77280300000009095748030000000000000001000000000000000000000000000000000000
000000000000000000000000002f636f72706f726174652f69332f7368617264696e672f63616c6c6465746169010000000000000083445107000000008ee6b2d9be2a000002000000000000000100000000
000000405d36b3ab2a0000030000000000000060e1117700000000020000000000000048bb4dddbe2a00002d63726f6e31303188e6b2d9be2a00001000000000000000; asc 8 s * c k * ` *
DateAdded = now( WH /corp/i/sharding/calldetai DQ * @]6 * ` w
    H M * -cron01 * ;
TRANSACTION 235B5EB68, ACTIVE 16 sec, process no 6462, OS thread id 1385572672 fetching rows, thread declared inside InnoDB 968
mysql tables in use 3, locked 0
MySQL thread id 18862, query id 13801933 10.24.160.250 rv-marketing Sending data
Select
        C.`CallDir`,
        Count(Distinct C.ContactID) as Contacts
From
        Corp.`Call` C
Where
        C.`CallDate` = '2013-02-26'
        and C.`CallDir`='Inbound'
Group by
        C.`CallDir`
 LIMIT 0, 1000

Sorry for the delay with followup. Looks like I missed your reply.

If you are still affected by this assertion failure, please, send the output of:

explain Select
        C.`CallDir`,
        Count(Distinct C.ContactID) as Contacts
From
        Corp.`Call` C
Where
        C.`CallDate` = '2013-02-26'
        and C.`CallDir`='Inbound'
Group by
        C.`CallDir`
 LIMIT 0, 1000;

use Corp
show create table Call\G
show table status like 'Call'\G

Download full text (4.7 KiB)

An additional note: This does not cause a crash in 5.1.66-rel14.1, and it does not always seem to cause a crash in 5.1.67.

EXPLAIN OUTPUT:

+----+-------------+-------+-------------+------------------------+------------------------+---------+------+-------+----------+------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------------+------------------------+------------------------+---------+------+-------+----------+------------------------------------------------------+
| 1 | SIMPLE | C | index_merge | CallDir,CallDate | CallDate,CallDir | 3,2 | NULL | 25022 | 100.00 | Using intersect(CallDate,CallDir); Using where |
+----+-------------+-------+-------------+------------------------+------------------------+---------+------+-------+----------+------------------------------------------------------+

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | select `Corp`.`C`.`CallDir` AS `CallDir`,count(distinct `Corp`.`C`.`ContactID`) AS `Contacts` from `Corp`.`Call` `C` where ((`Corp`.`C`.`CallDir` = 'Inbound') and (`Corp`.`C`.`CallDate` = '2013-02-26')) group by `Corp`.`C`.`CallDir` limit 0,1000 |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

CREATE TABLE `Call` (
  `ID` bigint(64) unsigned NOT NULL AUTO_INCREMENT,
  `CallID` varchar(18) DEFAULT NULL,
  `ConfCallID` varchar(18) DEFAULT NULL,
  `CallDir` enum('Inbound','Outbound') DEFAULT NULL,
  `CallDateTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `CallDate` date NOT NULL,
  `CallLength` int(11) NOT NULL DEFAULT '0',
  `DNIS` varchar(32) NOT NULL DEFAULT '',
  `MarketingCodeID` int(10) unsigned DEFAULT NULL,
  `ContactID` int(10) unsigned DEFAULT NULL,
  `CallerID_Number` varchar(128) NOT NULL DEFAULT '',
  `CallerID_Name` varchar(128) NOT NULL DEFAULT '...

Read more...

OK, if this query leads to crash every time, can you, please, check if it still crashes when index merge is NOT used:

set session optimizer_switch='index_merge_intersection=off';
set session optimizer_switch='index_merge=off';
<run your query>;

Ovais Tariq (ovais-tariq) wrote :
Download full text (10.4 KiB)

This bug is caused by the incorrect fix for bug 14704286. Hence this bug is a duplicate of the upstream bug http://bugs.mysql.com/bug.php?id=68051 The bug is reproducible on upstream 5.1.67 as well, reason being that the fix for bug 14704286 was silently ported to 5.1 as well and never mentioned in the release notes by Oracle http://dev.mysql.com/doc/relnotes/mysql/5.1/en/news-5-1-67.html

See the following revision info:
[root@ovaistariq-test 5.1.67]# bzr log --show-diff -r 3837
------------------------------------------------------------
revno: 3837
committer: Krunal Bauskar <email address hidden>
branch nick: mysql-5.1
timestamp: Mon 2012-10-15 09:24:33 +0530
message:

  bug#14704286
  SECONDARY INDEX UPDATES MAKE CONSISTENT READS DO O(N^2) UNDO PAGE
  LOOKUPS (honoring kill query while accessing sec_index)

  If secondary index is being used for select query evaluation and this
  query is operating with consistent read snapshot it might take good time for
  secondary index to return back control to mysql as MVCC would kick in.

  If user issues "kill query <id>" while query is actively accessing
  secondary index it will not be honored as there is no hook to check
  for this condition. Added hook for this check.

  -----
  Parallely secondary index taking too long to evaluate for consistent
  read snapshot case is being examined for performance improvement. WL#6540.
diff:
=== added file 'mysql-test/suite/innodb/r/innodb_bug14704286.result'
--- mysql-test/suite/innodb/r/innodb_bug14704286.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/r/innodb_bug14704286.result 2012-10-15 03:54:33 +0000
@@ -0,0 +1,43 @@
+use test;
+drop table if exists t1;
+Warnings:
+Note 1051 Unknown table 't1'
+create table t1 (id int primary key, value int, value2 int,
+value3 int, index(value,value2)) engine=innodb;
+insert into t1 values
+(10,10,10,10),(11,11,11,11),(12,12,12,12),(13,13,13,13),(14,14,14,14),
+(15,15,15,15),(16,16,16,16),(17,17,17,17),(18,18,18,18),(19,19,19,19),
+(20,20,20,20);
+use test;
+start transaction with consistent snapshot;
+use test;
+CREATE PROCEDURE update_t1()
+BEGIN
+DECLARE i INT DEFAULT 1;
+while (i <= 5000) DO
+update test.t1 set value2=value2+1, value3=value3+1 where id=12;
+SET i = i + 1;
+END WHILE;
+END|
+CALL update_t1();
+select * from t1;
+id value value2 value3
+10 10 10 10
+11 11 11 11
+12 12 5012 5012
+13 13 13 13
+14 14 14 14
+15 15 15 15
+16 16 16 16
+17 17 17 17
+18 18 18 18
+19 19 19 19
+20 20 20 20
+select * from t1 force index(value) where value=12;
+kill query @id;
+ERROR 70100: Query execution was interrupted
+select * from t1 where value = 12;
+id value value2 value3
+12 12 12 12
+drop procedure if exists update_t1;
+drop table if exists t1;

=== added file 'mysql-test/suite/innodb/t/innodb_bug14704286.test'
--- mysql-test/suite/innodb/t/innodb_bug14704286.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_bug14704286.test 2012-10-15 03:54:33 +0000
@@ -0,0 +1,89 @@
+--source include/have_innodb.inc
+
+#
+# create test-bed to run test
+#
+use test;
+
+drop table if exists t1;
+create table t1 (id int primary key, value int, value2 int,
+value3 int, index(value,value2)) engi...

Alexey Kopytov (akopytov) wrote :

The upstream bug is fixed in 5.1.69, 5.5.31 and 5.6.11.

summary: - MySQL crash
+ Killing a query inside InnoDB causes it to eventually crash with an
+ assertion

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-640

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.