InnoDB: Failing assertion: bitmap_files->files[0].seq_num == first_file_seq_num in file line 1423 | abort in log_online_setup_bitmap_file_range

Bug #1191580 reported by Roel Van de Paar on 2013-06-16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to
Fix Released
Laurynas Biveinis
Fix Released
Laurynas Biveinis
Fix Released
Laurynas Biveinis
Fix Released
Laurynas Biveinis

Bug Description

Thread 22 (LWP 19785):
#0 0x0000003fde20c69c in pthread_kill () from /lib64/
#1 0x00000000006e711d in handle_fatal_signal (sig=6) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#2 <signal handler called>
#3 0x0000003fdde328a5 in raise () from /lib64/
#4 0x0000003fdde34085 in abort () from /lib64/
#5 0x0000000000a33032 in log_online_setup_bitmap_file_range (bitmap_files=0x1a185da8, range_start=0, range_end=18446744073709551615) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/log/
#6 0x0000000000a338de in log_online_bitmap_iterator_init (i=0x1a185da0, min_lsn=<optimized out>, max_lsn=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/log/
#7 0x00000000009ff958 in i_s_innodb_changed_pages_fill (thd=0x1ff69950, tables=<optimized out>, cond=0x0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/innobase/handler/
#8 0x000000000079af4d in do_fill_table (thd=0x1ff69950, table_list=0x1944a080, join_table=0x1944b8d8) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#9 0x000000000079b236 in get_schema_tables_result (join=0x1944ac20, executed_place=PROCESSED_BY_JOIN_EXEC) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#10 0x0000000000793135 in JOIN::prepare_result (this=0x1944ac20, columns_list=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#11 0x000000000074b6aa in JOIN::exec (this=0x1944ac20) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#12 0x0000000000793548 in mysql_execute_select (thd=0x1ff69950, select_lex=0x1ff6c418, free_join=true) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#13 0x0000000000793ca4 in mysql_select (thd=0x1ff69950, tables=0x1944a080, wild_num=0, fields=..., conds=<optimized out>, order=0x1ff6c5e0, group=0x1ff6c518, having=0x0, select_options=2684619520, result=0x1944abf8, unit=0x1ff6bdd0, select_lex=0x1ff6c418) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#14 0x0000000000793e78 in handle_select (thd=0x1ff69950, result=0x1944abf8, setup_tables_done_option=0) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#15 0x0000000000768dad in execute_sqlcom_select (thd=0x1ff69950, all_tables=0x1944a080) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#16 0x000000000076b307 in mysql_execute_command (thd=0x1ff69a38) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#17 0x0000000000770408 in mysql_parse (thd=0x1ff69950, rawbuf=0x1ff69958 "\240\244?\001", length=<optimized out>, parser_state=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#18 0x0000000000770ede in dispatch_command (command=<optimized out>, thd=0x1ff69950, packet=<optimized out>, packet_length=423927116) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#19 0x0000000000771b60 in do_command (thd=0x1ff69950) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#20 0x000000000073c48f in do_handle_one_connection (thd_arg=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#21 0x000000000073c550 in handle_one_connection (arg=<optimized out>) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/sql/
#22 0x00000000009c8e59 in pfs_spawn_thread (arg=0x1fd35270) at /bzr/5.6_val/Percona-Server-5.6.11-alpha60.3/storage/perfschema/
#23 0x0000003fde207851 in start_thread () from /lib64/
#24 0x0000003fddee890d in clone () from /lib64/

Related branches

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

Ignore the Valgrind error, this is dealt with in bug 1191582

Will probably push a diagnostic branch for this one.

tags: added: xtradb
Roel Van de Paar (roel11) wrote :

Ack. Alexey; scripts to try and reproduce from original trials which produces this:

[roel@qaserver 980213]$ cat cmd50
ps -ef | grep 'cmdrun_50' | grep -v grep | awk '{print $2}' | xargs sudo kill -9
rm -Rf /ssd/980213/cmdrun_50
mkdir /ssd/980213/cmdrun_50
cd /ssd/randgen
bash -c "set -o pipefail; perl --queries=100000000 --seed=13641 --duration=120 --querytimeout=60 --short_column_names --reporter=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm --mysqld=--log-output=none --mysqld=--sql_mode=ONLY_FULL_GROUP_BY --grammar=conf/percona_qa/5.6/5.6.yy --gendata=conf/percona_qa/5.6/5.6.zz2 --basedir=/ssd/Percona-Server-5.6.11-alpha60.3-372-debug-valgrind.Linux.x86_64 --valgrind --reporter=ValgrindErrors --validator=MarkErrorLog --threads=25 --no-mask --views --notnull --validator=Transformer --mysqld=--innodb_track_changed_pages=1 --mysqld=--innodb_max_bitmap_file_size=20480 --mysqld=--innodb_changed_pages=ON --mysqld=--innodb_max_changed_pages=0 --mysqld=--innodb_log_file_size=1048576 --mysqld=--innodb_log_files_in_group=10 --mysqld=--innodb_log_buffer_size=10485761 --mysqld=--innodb_file_per_table=1 --mysqld=--innodb_file_format=barracuda --mtr-build-thread=714 --mask=23476 --vardir1=/ssd/980213/cmdrun_50 > /ssd/980213/cmdrun50.log 2>&1"
[roel@qaserver 980213]$ cat cmd501
ps -ef | grep 'cmdrun_501' | grep -v grep | awk '{print $2}' | xargs sudo kill -9
rm -Rf /ssd/980213/cmdrun_501
mkdir /ssd/980213/cmdrun_501
cd /ssd/randgen
bash -c "set -o pipefail; perl --queries=100000000 --seed=9892 --duration=120 --querytimeout=60 --short_column_names --reporter=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm --mysqld=--log-output=none --mysqld=--sql_mode=ONLY_FULL_GROUP_BY --grammar=conf/percona_qa/5.6/5.6.yy --gendata=conf/percona_qa/5.6/5.6.zz2 --basedir=/ssd/Percona-Server-5.6.11-alpha60.3-372-debug-valgrind.Linux.x86_64 --valgrind --reporter=ValgrindErrors --validator=MarkErrorLog --threads=25 --no-mask --views --mysqld=--innodb_track_changed_pages=1 --mysqld=--innodb_max_bitmap_file_size=20480 --mysqld=--innodb_changed_pages=ON --mysqld=--innodb_max_changed_pages=0 --mysqld=--innodb_log_file_size=1048576 --mysqld=--innodb_log_files_in_group=10 --mysqld=--innodb_log_buffer_size=10485761 --mysqld=--innodb_flush_method=O_DSYNC --mtr-build-thread=792 --mask=16326 --vardir1=/ssd/980213/cmdrun_501 > /ssd/980213/cmdrun501.log 2>&1"

tags: added: 56qual
Roel Van de Paar (roel11) wrote :

Seen in 5.6-QA tree @ 18/9/13 also.

Roel Van de Paar (roel11) wrote :

It's a blocker. 20%+ of trials fails.

tags: added: qablock
Roel Van de Paar (roel11) wrote :

PS @ Percona-Server-5.6.13-rc60.5-440-debug.Linux.x86_64 (5.6-QA)
RQG @ 911 <email address hidden>

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

This is caused by a race condition between the two bitmap file enumeration passes in log_online_setup_bitmap_file_range(). If the 1st bitmap file in the range is deleted at that point, then the debug build will crash as reported on this bug. If the 2nd or subsequent bitmap file is deleted, the debug build will crash at ut_ad(bitmap_files->files[i].seq_num > bitmap_files->files[i - 1].seq_num) a few lines below.

For release builds this is not an issue as missing bitmap files will be detected and diagnosed on their open attempts, as fixed by bug 1179974. The current bug shows that the fix for the latter was incomplete.

This code is shared with XtraBackup thus at least theoretically (haven't checked) the issue might appear there as well.

Err, in the above s/(haven't checked)/(haven't checked whether the debug build actually crashes)

I had completely forgotten about bug 1204075, which is exact same issue and was fixed in XtraBackup first. Funnily when I checked XB sources five days ago, I forgot to update them first too.

Thus now there are two different fixes for this issue in the source shared between XB and PS, one merged to XB in 1204075, and one proposed for PS here. I am not sure how to best handle this situation. The current fix seems to be better and suitable for XB eventually as well. Thus if nobody objects I will remove XB from this bug and PS from bug 1204075.

no longer affects: percona-xtrabackup
Roel Van de Paar (roel11) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to:

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers