Race condition between buffer pool page optimistic access and eviction | InnoDB: Failing assertion: buf_page_in_file(bpage) in buf_page_make_young

Bug #1664280 reported by Laurynas Biveinis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Invalid
Undecided
Unassigned
5.6
Invalid
Undecided
Unassigned
5.7
Fix Released
High
Laurynas Biveinis

Bug Description

Buffer pool mutex split patch introduced bpage::buf_fix_count-related race condition between accessing and flushing. This shows up as a crash:

[18/12/2016, 16:42:53] Alexey Stroganov: 2016-12-18 01:03:47 0x7f95a0617700 InnoDB: Assertion failure in thread 140280617596672 in file buf0buf.cc line 3514
InnoDB: Failing assertion: buf_page_in_file(bpage)
[18/12/2016, 16:43:19] Alexey Stroganov: /data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z18ut_print_timestampP8_IO_FILE+0x0)[0x7c212a]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z19buf_page_make_youngP10buf_page_t+0x95)[0x1177045]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0x975)[0x117c1f5]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld[0x100a9c7]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z29ibuf_merge_or_delete_for_pageP11buf_block_tRK9page_id_tPK11page_size_tm+0x84d)[0x1013e6d]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z20buf_page_io_completeP10buf_page_tb+0x6ac)[0x11791ec]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld[0x119c1ca]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z13buf_read_pageRK9page_id_tRK11page_size_tP5trx_t+0x45)[0x119d7b5]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0xf20)[0x117c7a0]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z27btr_cur_search_to_nth_levelP12dict_index_tmPK8dtuple_t15page_cur_mode_tmP9btr_cur_tmPKcmP5mtr_t+0x840)[0x115d3e0]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x2b9e)[0x10c965e]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x330)[0xfd3610]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN7handler18index_read_idx_mapEPhjPKhm16ha_rkey_function+0x78)[0x829048]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN7handler21ha_index_read_idx_mapEPhjPKhm16ha_rkey_function+0x18d)[0x82b9dd]

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Download full text (11.4 KiB)

With debug code:

Thread 73 (Thread 0x7f5b4f575700 (LWP 24068)):
#0 0x00007f5b8d157b0e in determine_info (symbolp=0x0, mapp=0x7f5b4f570e68, info=0x7f5b4f570e90, match=0x7f5b8ed68168, addr=7923769) at dl-addr.c:63
#1 __GI__dl_addr (address=0x78e839 <handle_fatal_signal(int)+1161>, info=info@entry=0x7f5b4f570e90, mapp=mapp@entry=0x7f5b4f570e68, symbolp=symbolp@entry=0x0) at dl-addr.c:137
#2 0x00007f5b8d12a87e in __GI___backtrace_symbols_fd (array=array@entry=0x7f5b4f570f80, size=size@entry=28, fd=2) at backtracesymsfd.c:49
#3 0x0000000000ea07de in my_print_stacktrace (stack_bottom=0x7f5b4f574e30 "", thread_stack=262144) at /data/opt/alexey.s/git/ps-lb.debug/mysys/stacktrace.c:238
#4 0x000000000078e839 in handle_fatal_signal (sig=6) at /data/opt/alexey.s/git/ps-lb.debug/sql/signal_handler.cc:151
#5 <signal handler called>
#6 0x00007f5b8d04a418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#7 0x00007f5b8d04c01a in __GI_abort () at abort.c:89
#8 0x0000000000762ae4 in ut_dbg_assertion_failed (expr=expr@entry=0x15e2003 "count != UINT32_MAX", file=file@entry=0x15dc328 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic", line=line@entry=1024) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/ut/ut0dbg.cc:67
#9 0x000000000110d450 in buf_block_unfix (bpage=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1024
#10 buf_block_unfix (block=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1036
#11 buf_block_buf_fix_dec (block=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1047
#12 buf_page_optimistic_get (rw_latch=1, block=block@entry=0x7f5b6547a280, modify_clock=modify_clock@entry=3104, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=mtr@entry=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/buf/buf0buf.cc:4856
#13 0x00000000010e846c in btr_cur_optimistic_latch_leaves (block=0x7f5b6547a280, modify_clock=3104, latch_mode=latch_mode@entry=0x7f5b4f571ce8, cursor=cursor@entry=0x7f5b32af3288, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/btr/btr0cur.cc:445
#14 0x00000000010f6595 in btr_pcur_restore_position_func (latch_mode=latch_mode@entry=1, cursor=cursor@entry=0x7f5b32af3288, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=mtr@entry=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/btr/btr0pcur.cc:290
#15 0x0000000000761561 in sel_restore_position_for_mysql (same_user_rec=same_user_rec@entry=0x7f5b4f571e30, pcur=pcur@entry=0x7f5b32af3288, moves_up=moves_up@entry=1, mtr=mtr@entry=0x7f5b4f572c50, latch_mode=1) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc:3675
#16 0x0000000001039963 in row_search_mvcc (buf=buf@entry=0x7f5b32abc430 "", mode=PAGE_CUR_GE, mode@entry=PAGE_CUR_UNSUPP, prebuilt=0x7f5b32af3088, match_mode=match_...

tags: added: bp-split
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Triggering execution history:

Thread 1 calls buf_page_optimistic_get with some block pointer
Thread 2 (flushes and) evicts the same block, places it on the free list
Thread 3 calls buf_page_read, buf_page_read_low, buf_page_init_for_read, buf_page_init
Thread 3 sets block->state to BUF_BLOCK_FILE_PAGE
Thread 1 locks block mutex
Thread 1 checks block->state == BUF_BLOCK_FILE_PAGE
Thread 1 increments block.page->buf_fix_count
Thread 3 calls buf_page_init_low
Thread 3 resets buf_fix_count to zero, causing one of the possible assert failures.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

It is likely the same bug as "23067038" in MySQL 8.0.0, fixed by the following commit there:

commit cbc524750c486e9fde8266439979476964917bbb
Author: Shaohua Wang <email address hidden>
Date: Wed Jun 8 10:29:10 2016 +0800

    BUG#23067038 ASSERTION FAILURE: BUF0BUF.CC:2861:BUF_PAGE_IN_FILE(BPAGE)
                 LEADS TO CORRUPT DATA

    It's a regression of wl#8423 InnoDB: Split the buffer pool mutex.

    There is a race: Thread 1, we set buf_fix_count to 0 in buf_page_init().
    Thread 2, we decrease buf_fix_count to 0xffffffff in buf_page_optimistic_get().
    Thread 2, we increase buf_fix_count to 0 in buf_page_get_gen(). Thread 3,
    we evict the page from LRU list. Thread 2, assert fails: buf_page_in_file().

    The root cause is missing block mutex protection for buf_page_init().

    Reviewed-by: Debarun Banerjee <email address hidden>
    RB: 12456

summary: - Race condition between accessing and flushing buffer pool page |
+ Race condition between buffer pool page optimistic access and eviction |
InnoDB: Failing assertion: buf_page_in_file(bpage) in
buf_page_make_young
tags: added: regression
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1060

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.