Percona Server with XtraDB

Failing assertion: bpage->in_flush_list in file buf0lru.c line 459 and in file buf0flu.c line 522 | abort in buf_flush_remove

Reported by Roel Van de Paar on 2013-01-30
This bug affects 1 person
Affects          Importance   Assigned to
Percona Server   High         Laurynas Biveinis
5.1              Undecided    Unassigned
5.5              High         Laurynas Biveinis

Bug Description

Split from bug 1086680:

130127 0:29:33 InnoDB: Assertion failure in thread 139981911537408 in file buf0lru.c line 459
InnoDB: Failing assertion: bpage->in_flush_list

Seen, sporadically, in QA-16274-5.5 tree at revid Percona-Server-5.5.28-rel29.3-416-debug.Linux.x86_64, this time aborting in buf_flush_try_yield. Attaching full bt set + error log.

Roel Van de Paar (roel11) wrote :

Roel -

I have pushed a commit with extra diagnostics for this bug to the QA branch. Please update with the new stacktraces and the error log.

Roel Van de Paar (roel11) wrote :

<laurynas> Roel: also, for one bug I pushed a diagnostic commit to the QA branch
<Roel> laurynas: overhead? (if low, I can just leave it in for the main run)
<laurynas> Roel: no overhead
<laurynas> Roel: can you start something with the current branch that reproduces https://bugs.launchpad.net/percona-server/+bug/1110102 ?
<laurynas> Roel: it would be easier before respin, then I don't have to maintain that commit
<Roel> laurynas: I see what I can do, but it needs many trials before it hits it, so it may make more sense to do it as part of the qa run2 itself which is many trials
<laurynas> Roel: I see. No problem, I will bring it over
<Roel> thanks, that helps

New occurrence in QA-16724-5.5-2 @ Percona-Server-5.5.28-rel29.3-422-debug.Linux.x86_64

130202 12:29:11 InnoDB: Assertion failure in thread 140434501179136 in file buf0flu.c line 522
InnoDB: Failing assertion: bpage->in_flush_list

Thread 1 (LWP 21553):
+bt
#0 0x0000003da180c60c in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000007dea78 in my_write_core (sig=6) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/mysys/stacktrace.c:433
#2 0x00000000006b3144 in handle_fatal_signal (sig=6) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/sql/signal_handler.cc:249
#3 <signal handler called>
#4 0x0000003da1435935 in raise () from /lib64/libc.so.6
#5 0x0000003da14370e8 in abort () from /lib64/libc.so.6
#6 0x00000000008ebad5 in buf_flush_remove (bpage=0x7fb977bddbe0) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/storage/innobase/buf/buf0flu.c:522
#7 0x00000000008ec4e8 in buf_flush_write_complete (bpage=bpage@entry=0x7fb977bddbe0) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/storage/innobase/buf/buf0flu.c:659
#8 0x00000000008e3958 in buf_page_io_complete (bpage=0x7fb977bddbe0) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/storage/innobase/buf/buf0buf.c:4186
#9 0x00000000009391d6 in fil_aio_wait (segment=segment@entry=6) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/storage/innobase/fil/fil0fil.c:5611
#10 0x0000000000878ada in io_handler_thread (arg=<optimized out>) at /ssd/QA-16724-5.5-2/Percona-Server-5.5.28-rel29.3/storage/innobase/srv/srv0start.c:485
#11 0x0000003da1807d14 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003da14f168d in clone () from /lib64/libc.so.6

summary: - Failing assertion: bpage->in_flush_list in file buf0lru.c line 459
+ Failing assertion: bpage->in_flush_list in file buf0lru.c line 459 |
+ abort in buf_flush_remove
Roel Van de Paar (roel11) wrote :

Run details:

[Roel@qaserver 861716]$ cat cmd602
ps -ef | grep 'cmdrun_602' | grep -v grep | awk '{print $2}' | xargs sudo kill -9
rm -Rf /ssd/861716/cmdrun_602
mkdir /ssd/861716/cmdrun_602
cd /ssd/randgen
bash -c "set -o pipefail; perl runall.pl --queries=100000000 --seed=18957 --duration=200 --querytimeout=60 --short_column_names --reporter=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm --mysqld=--log-output=none --mysqld=--sql_mode=ONLY_FULL_GROUP_BY --grammar=conf/percona_qa/percona_qa.yy --gendata=conf/percona_qa/percona_qa.zz --basedir=/ssd/Percona-Server-5.5.28-rel29.3-422-debug.Linux.x86_64 --threads=25 --views --notnull --mysqld=--innodb_track_changed_pages=1 --mysqld=--innodb_changed_pages=FORCE --mysqld=--innodb_max_changed_pages=0 --mysqld=--innodb_file_per_table=1 --mtr-build-thread=706 --mask=50689 --vardir1=/ssd/861716/cmdrun_602 > /ssd/861716/cmdrun602.log 2>&1"

Note to self: check thread 18 and thread 1 interaction in the last stacktrace. Previous diagnostic commit was for the thread 18 code path, see if the same needs to be added for buf page flush I/O completion.

Roel Van de Paar (roel11) wrote :

Seen in different location

130202 12:19:14 InnoDB: Assertion failure in thread 140328580802304 in file buf0flu.c line 522
InnoDB: Failing assertion: bpage->in_flush_list

Roel Van de Paar (roel11) wrote :
summary: - Failing assertion: bpage->in_flush_list in file buf0lru.c line 459 |
- abort in buf_flush_remove
+ Failing assertion: bpage->in_flush_list in file buf0lru.c line 459 and
+ in file buf0flu.c line 522 | abort in buf_flush_remove
Roel Van de Paar (roel11) wrote :

Roel -

The 2012-02-04 and 2012-02-05 occurrences are not this bug; they are bug 934377 instead. Please update accordingly. Everything involving buf0flu.c:522 is bug 934377 for now.

For this bug, the diagnostics commit in QA-2 still applies; I will be waiting for logs and stacktraces.
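
A small helper along these lines could sort error-log occurrences by assertion site, following the triage rule above (buf0flu.c:522 goes to bug 934377 for now, buf0lru.c:459 stays with this bug). This is only a sketch: the function names and the BUG_FOR_SITE mapping are made up for illustration, and the sample lines are the three assertion messages quoted in this report.

```python
import re
from collections import Counter

# Matches the InnoDB assertion line format quoted in this report.
ASSERT_RE = re.compile(
    r"InnoDB: Assertion failure in thread (\d+) in file (\S+) line (\d+)"
)

# Hypothetical mapping that mirrors the triage rule stated in this thread.
BUG_FOR_SITE = {
    "buf0lru.c:459": "1110102",
    "buf0flu.c:522": "934377",
}


def assertion_sites(log_text):
    """Count assertion failures per 'file:line' site in an error log."""
    return Counter(
        f"{m.group(2)}:{m.group(3)}" for m in ASSERT_RE.finditer(log_text)
    )


def triage(log_text):
    """Map each site seen in the log to a bug number, or None if unknown."""
    return {site: BUG_FOR_SITE.get(site) for site in assertion_sites(log_text)}


SAMPLE = """\
130127 0:29:33 InnoDB: Assertion failure in thread 139981911537408 in file buf0lru.c line 459
130202 12:29:11 InnoDB: Assertion failure in thread 140434501179136 in file buf0flu.c line 522
130202 12:19:14 InnoDB: Assertion failure in thread 140328580802304 in file buf0flu.c line 522
"""

if __name__ == "__main__":
    print(dict(assertion_sites(SAMPLE)))
    print(triage(SAMPLE))
```

The mapping would need updating if the buf0flu.c:522 occurrences are later reassigned.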

tags: added: xtradb

Similar cause to 934377, another regression from the upstream drop table changes.

buf_flush_try_yield() reads the page's I/O fix state with a dirty (unprotected) read. As a result, the tablespace-dropping thread may choose to yield on the currently selected page; another thread then flushes that page, and it is no longer on the flush list when the first thread continues.
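
The interleaving described above can be modelled deterministically, without real threads. The sketch below is not InnoDB code: Page, BufferPool, drop_tablespace_pages and run are hypothetical names, and the "revalidate" option is just one possible mitigation shown for contrast, not the fix that was actually committed.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare pages by identity, like bpage pointers
class Page:
    page_id: int
    in_flush_list: bool = False


@dataclass
class BufferPool:
    flush_list: list = field(default_factory=list)

    def add_dirty(self, page):
        page.in_flush_list = True
        self.flush_list.append(page)

    def flush(self, page):
        """Another thread writes the page and removes it from the flush list."""
        self.flush_list.remove(page)
        page.in_flush_list = False


def drop_tablespace_pages(pool, other_thread_step, revalidate):
    """Model of the dropping thread: yield on each page, then remove it.

    Returns the number of pages this thread removed itself.
    """
    removed = 0
    for page in list(pool.flush_list):  # snapshot = saved "current page"
        other_thread_step(page)         # the yield: another thread runs here
        if revalidate and not page.in_flush_list:
            continue                    # page was flushed meanwhile, skip it
        if not page.in_flush_list:      # stands in for the debug assertion
            raise AssertionError("bpage->in_flush_list")
        pool.flush_list.remove(page)
        page.in_flush_list = False
        removed += 1
    return removed


def run(revalidate):
    pool = BufferPool()
    for i in range(3):
        pool.add_dirty(Page(i))

    def concurrent_flusher(current):
        # The other thread happens to flush exactly the page we yielded on.
        if current.page_id == 1 and current.in_flush_list:
            pool.flush(current)

    try:
        return drop_tablespace_pages(pool, concurrent_flusher, revalidate)
    except AssertionError as exc:
        return f"assertion failed: {exc}"


if __name__ == "__main__":
    print(run(revalidate=False))
    print(run(revalidate=True))
```

Without revalidation the model hits the same assertion as the report; with it, the dropping thread removes only the two pages that were not flushed behind its back.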

Also related: bug 1122462.
