Percona Server with XtraDB

InnoDB: Failing assertion: trx_sys->descr_n_used <= (trx_sys->rw_trx_list).count in file trx0trx.cc line 1492

Reported by Roel Van de Paar on 2013-07-12
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server
Status tracked in 5.6
5.1
Undecided
Unassigned
5.5
Undecided
Unassigned
5.6
High
Alexey Kopytov

Bug Description

2013-07-12 17:13:52 26543 [Note] /ssd/Percona-Server-5.6.12-rc60.4-405-debug.Linux.x86_64/bin/mysqld: ready for connections.
Version: '5.6.12-rc60.4-debug-log' socket: '/ssd/290300/current1_4/tmp/master.sock' port: 13060 Percona Server with XtraDB (GPL), Release rc60.4, Revision 405-debug
2013-07-12 17:14:00 7fa70b8b6700 InnoDB: Assertion failure in thread 140355429951232 in file trx0trx.cc line 1492
InnoDB: Failing assertion: trx_sys->descr_n_used <= (trx_sys->rw_trx_list).count
InnoDB: We intentionally generate a memory trap.

Thread 1 (LWP 26671):
+bt
#0 0x000000340cc0c69c in pthread_kill () from /lib64/libpthread.so.0
#1 0x0000000000686bde in handle_fatal_signal (sig=6) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/signal_handler.cc:251
#2 <signal handler called>
#3 0x000000340c4328a5 in raise () from /lib64/libc.so.6
#4 0x000000340c434085 in abort () from /lib64/libc.so.6
#5 0x0000000000b2ca60 in trx_commit_in_memory (trx=<optimized out>, lsn=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/trx/trx0trx.cc:1492
#6 0x0000000000b2e278 in trx_commit_low (trx=0x7fa6c805af78, mtr=0x0) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/trx/trx0trx.cc:1567
#7 0x0000000000b2e616 in trx_commit (trx=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/trx/trx0trx.cc:1588
#8 0x0000000000b2e6ab in trx_commit_for_mysql (trx=0x7fa6c805af78) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/trx/trx0trx.cc:1816
#9 0x0000000000bb6f1a in dict_stats_fetch_from_ps (table=0x7fa6c801a128) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/dict/dict0stats.cc:2842
#10 dict_stats_update (table=0x7fa6fc051268, stats_upd_option=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/dict/dict0stats.cc:3041
#11 0x0000000000a0c5b5 in dict_stats_init (table=0x7fa6fc051268) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/include/dict0stats.ic:174
#12 ha_innobase::open (this=0x7fa6c800f520, name=0x7fa6c80167a0 "./test/M", mode=<optimized out>, test_if_locked=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/innobase/handler/ha_innodb.cc:5124
#13 0x00000000005b76fa in handler::ha_open (this=0x7fa6c800f520, table_arg=<optimized out>, name=0x7fa6c80167a0 "./test/M", mode=2, test_if_locked=2) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/handler.cc:2478
#14 0x00000000007c24cc in open_table_from_share (thd=0x2638720, share=<optimized out>, alias=<optimized out>, db_stat=39, prgflag=<optimized out>, ha_open_flags=0, outparam=0x7fa6c8018eb0, is_create_table=false) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/table.cc:2318
#15 0x00000000006c6fb4 in open_table (thd=0x2638720, table_list=0x7fa6c8005900, ot_ctx=0x7fa70b8b41e0) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_base.cc:3082
#16 0x00000000006ca9ec in open_and_process_table (ot_ctx=0x7fa70b8b41e0, has_prelocking_list=false, prelocking_strategy=0x7fa70b8b4300, flags=0, counter=0x263aba0, tables=0x7fa6c8005900, lex=0x263aae8, thd=0x2638720) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_base.cc:4567
#17 open_tables (thd=0x2638720, start=0x7fa70b8b42c8, counter=0x263aba0, flags=0, prelocking_strategy=0x7fa70b8b4300) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_base.cc:5069
#18 0x00000000006cb6f9 in open_normal_and_derived_tables (thd=<optimized out>, tables=0x7fa6c8005340, flags=0) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_base.cc:5768
#19 0x000000000071eb4d in execute_sqlcom_select (thd=0x2638720, all_tables=0x7fa6c8005340) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_parse.cc:5278
#20 0x00000000007217b7 in mysql_execute_command (thd=0x2638720) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_parse.cc:2762
#21 0x0000000000727818 in mysql_parse (thd=0x2638720, rawbuf=0x263aae8 "\360R7\001", length=<optimized out>, parser_state=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_parse.cc:6477
#22 0x000000000072871f in dispatch_command (command=COM_QUERY, thd=0x2638720, packet=0x263c6b1 "/* GenTest::Transform::LimitDecrease */ SELECT table1 . `col_varchar_10_key` AS field1 FROM I AS table1 LEFT JOIN M AS table2 ON table1 . `pk` = table2 . `col_int_key` WHERE table1 . `col_int_key` > 4 AND table1 . `col_int_key` <= ( 4 + 6 ) AND table1 . `col_int_key` > 4 AND table1 . `col_int_key` <= ( 2 + 2 ) OR table2 . `col_int_key` <> 199 OR table2 . `col_int_key` > 4 AND table2 . `col_int_key` < ( 9 + 4 ) OR table2 . `col_int_key` IS NULL OR table2 . `col_int_key` >= 0 AND table2 . `col_int_key` <= ( 4 + 5 ) LIMIT 1 /* TRANSFORM_OUTCOME_SINGLE_ROW */", packet_length=573) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_parse.cc:1374
#23 0x0000000000729a47 in do_command (thd=0x2638720) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_parse.cc:1049
#24 0x00000000006ed63f in do_handle_one_connection (thd_arg=<optimized out>) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_connect.cc:1615
#25 0x00000000006ed727 in handle_one_connection (arg=0x2638720) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/sql/sql_connect.cc:1526
#26 0x00000000009dd6c9 in pfs_spawn_thread (arg=0x25f5a90) at /bzr/5.6_dbg/Percona-Server-5.6.12-rc60.4/storage/perfschema/pfs.cc:1855
#27 0x000000340cc07851 in start_thread () from /lib64/libpthread.so.0
#28 0x000000340c4e890d in clone () from /lib64/libc.so.6

Tags: qa Edit Tag help

Related branches

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

[roel@qaserver 290300]$ cat cmd240
ps -ef | grep 'cmdrun_240' | grep -v grep | awk '{print $2}' | xargs sudo kill -9
rm -Rf /ssd/290300/cmdrun_240
mkdir /ssd/290300/cmdrun_240
cd /ssd/randgen
bash -c "set -o pipefail; perl /ssd/randgen/runall.pl --queries=100000000 --seed=190 --duration=300 --querytimeout=60 --reporter=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm --mysqld=--log-output=none --mysqld=--sql_mode=ONLY_FULL_GROUP_BY --mysqld=--slow_query_log --mysqld=--userstat --views --validator=Transformer --threads=25 --mysqld=--innodb_file_per_table=1 --basedir=/ssd/Percona-Server-5.6.12-rc60.4-405-debug.Linux.x86_64 --gendata=/ssd/randgen/conf/drizzle/outer_join_drizzle.zz --grammar=/tmp/416647/92.yy --mtr-build-thread=708 --mask=4769 --vardir1=/ssd/290300/cmdrun_240 > /ssd/290300/cmdrun240.log 2>&1"

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

[roel@qaserver randgen]$ bzr version-info
revision-id: <email address hidden>
date: 2013-07-01 15:03:36 -0700
build-date: 2013-07-13 08:44:40 +1000
revno: 867
branch-nick: randgen
[roel@qaserver randgen]$ cd /bzr/5.6
[roel@qaserver 5.6]$ bzr version-info
revision-id: <email address hidden>
date: 2013-07-12 09:09:16 +0000
build-date: 2013-07-13 08:44:52 +1000
revno: 405
branch-nick: 5.6

Alexey Kopytov (akopytov) wrote :

I tried to find the root cause of this bug by code analysis, but to no avail. All possible code paths involving trx descriptors management seem to comply with the invariant enforced by the failing assertion.

I also tried reproducing it locally using the randgen info posted here, but randgen is failing with STATUS_PERL_FAILURE after a few minutes with lots of obscure messages about a syntax error.

How reproducible is this bug? Can I get a core file to debug?

tags: added: 56qual
Roel Van de Paar (roel11) wrote :

Alexey, thanks! Further work;

1. I tried reproducing this bug on recent binaries, without success
2. I verified recent logs for "descr_n_used" text, none found
3. Bug seems to be low frequency occurrence to start with

Removing 56qual tag, and marking as incomplete ftm, until seen again (if at all).

One small improvement is still possible; we could have try to reproduce the original environment as closely as possible, then gradually change one step at the time to see if/where bug is resoved, but this seems like unnecessary resource use given other variables.

tags: removed: 56qual

FWIW I hit this at least once in local MTR testing, on rpl_parallel testcase. The run was started as --parallel=7 on a 4-core machine. I'll see if it saved the core dump. Unfortunately this was on a feature branch, not trunk.

Unfortunately I lost the original binary. This is one is rebuilt from identical tree with identical options, and it appears to load the core dump OK. But of course the bets are still off.

Download full text (3.3 KiB)

The feature branch sources were copied to lp:~laurynas-biveinis/percona-server/bug-1200788-repro if needed.

#6 0x0000000000d599d0 in trx_commit_in_memory (trx=0x7f5c6002e7a8, lsn=1954621) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/trx/trx0trx.cc:1496
1496 ut_ad(trx_sys->descr_n_used <= UT_LIST_GET_LEN(trx_sys->rw_trx_list));
(gdb) print trx_sys->descr_n_used
$1 = 1
(gdb) print trx_sys->rw_trx_list
$2 = {count = 1, start = 0x7f5c700196f8, end = 0x7f5c700196f8}

(gdb) bt
#0 0x00007f5cb2056f2c in __pthread_kill (threadid=<optimized out>, signo=6) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:62
#1 0x0000000000adb1f5 in my_write_core (sig=6) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/mysys/stacktrace.c:422
#2 0x0000000000736961 in handle_fatal_signal (sig=6) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/sql/signal_handler.cc:251
#3 <signal handler called>
#4 0x00007f5cb149b037 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007f5cb149e698 in __GI_abort () at abort.c:90
#6 0x0000000000d599d0 in trx_commit_in_memory (trx=0x7f5c6002e7a8, lsn=1954621) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/trx/trx0trx.cc:1496
#7 0x0000000000d59e73 in trx_commit_low (trx=0x7f5c6002e7a8, mtr=0x7f5ca1b4a630) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/trx/trx0trx.cc:1566
#8 0x0000000000d59eff in trx_commit (trx=0x7f5c6002e7a8) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/trx/trx0trx.cc:1589
#9 0x0000000000d5a7eb in trx_commit_for_mysql (trx=0x7f5c6002e7a8) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/trx/trx0trx.cc:1815
#10 0x0000000000e0890f in dict_stats_exec_sql (pinfo=0x7f5c6002d328, sql=0x1130000 "ize,\n:stat_description\n);\nEND;")
    at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/dict/dict0stats.cc:298
#11 0x0000000000e0c1b8 in dict_stats_save_index_stat (index=0x7f5c6002da80, last_update=1380173996, stat_name=0x112eb07 "->magic_n == 375767", stat_value=1, sample_size=0x0,
    stat_description=0x11303e0 "le statistics for table %s: %s\n") at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/dict/dict0stats.cc:2169
#12 0x0000000000e0c65c in dict_stats_save (table_orig=0x7f5c70081708) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/dict/dict0stats.cc:2314
#13 0x0000000000e0dc3d in dict_stats_update (table=0x7f5c70081708, stats_upd_option=DICT_STATS_RECALC_PERSISTENT)
    at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/dict/dict0stats.cc:2949
#14 0x0000000000e10d02 in dict_stats_process_entry_from_recalc_pool () at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innobase/dict/dict0stats_bg.cc:302
#15 0x0000000000e10de4 in dict_stats_thread (arg=0x0) at /home/laurynas/percona/src/xtradb-thread-priority-flag/Percona-Server/storage/innob...

Read more...

Roel Van de Paar (roel11) wrote :

Seen again in 5.6.13-rc60.5-450-debug in 5.6-QA branch (custom randgen)

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
tags: added: 56qual
Roel Van de Paar (roel11) wrote :

The check

 ut_ad(trx_sys->descr_n_used <= UT_LIST_GET_LEN(trx_sys->rw_trx_list));

in trx_commit_in_memory() happens outside trx_sys->mutex protection.

tags: removed: 56qual
Alexey Kopytov (akopytov) wrote :

Indeed, it's rather a merge regression from 5.5. The invariant is supposed to be enforced whenever trx_sys->rw_trx_list is modified, i.e. when trx_sys->mutex is locked. Which is done earlier in the same function. The same assertion later (with trx_sys->mutex released) is therefore redundant. I will remove it.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers