Race condition between buffer pool page optimistic access and eviction | InnoDB: Failing assertion: buf_page_in_file(bpage) in buf_page_make_young

Bug #1664280 reported by Laurynas Biveinis on 2017-02-13
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Invalid
Undecided
Unassigned
5.6
Invalid
Undecided
Unassigned
5.7
Fix Released
High
Laurynas Biveinis

Bug Description

Buffer pool mutex split patch introduced bpage::buf_fix_count-related race condition between accessing and flushing. This shows up as a crash:

[18/12/2016, 16:42:53] Alexey Stroganov: 2016-12-18 01:03:47 0x7f95a0617700 InnoDB: Assertion failure in thread 140280617596672 in file buf0buf.cc line 3514
InnoDB: Failing assertion: buf_page_in_file(bpage)
[18/12/2016, 16:43:19] Alexey Stroganov: /data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z18ut_print_timestampP8_IO_FILE+0x0)[0x7c212a]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z19buf_page_make_youngP10buf_page_t+0x95)[0x1177045]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0x975)[0x117c1f5]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld[0x100a9c7]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z29ibuf_merge_or_delete_for_pageP11buf_block_tRK9page_id_tPK11page_size_tm+0x84d)[0x1013e6d]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z20buf_page_io_completeP10buf_page_tb+0x6ac)[0x11791ec]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld[0x119c1ca]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z13buf_read_pageRK9page_id_tRK11page_size_tP5trx_t+0x45)[0x119d7b5]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0xf20)[0x117c7a0]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z27btr_cur_search_to_nth_levelP12dict_index_tmPK8dtuple_t15page_cur_mode_tmP9btr_cur_tmPKcmP5mtr_t+0x840)[0x115d3e0]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x2b9e)[0x10c965e]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x330)[0xfd3610]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN7handler18index_read_idx_mapEPhjPKhm16ha_rkey_function+0x78)[0x829048]
/data/opt/alexey.s/bin64_57.nick.base/bin/mysqld(_ZN7handler21ha_index_read_idx_mapEPhjPKhm16ha_rkey_function+0x18d)[0x82b9dd]

Download full text (11.4 KiB)

With debug code:

Thread 73 (Thread 0x7f5b4f575700 (LWP 24068)):
#0 0x00007f5b8d157b0e in determine_info (symbolp=0x0, mapp=0x7f5b4f570e68, info=0x7f5b4f570e90, match=0x7f5b8ed68168, addr=7923769) at dl-addr.c:63
#1 __GI__dl_addr (address=0x78e839 <handle_fatal_signal(int)+1161>, info=info@entry=0x7f5b4f570e90, mapp=mapp@entry=0x7f5b4f570e68, symbolp=symbolp@entry=0x0) at dl-addr.c:137
#2 0x00007f5b8d12a87e in __GI___backtrace_symbols_fd (array=array@entry=0x7f5b4f570f80, size=size@entry=28, fd=2) at backtracesymsfd.c:49
#3 0x0000000000ea07de in my_print_stacktrace (stack_bottom=0x7f5b4f574e30 "", thread_stack=262144) at /data/opt/alexey.s/git/ps-lb.debug/mysys/stacktrace.c:238
#4 0x000000000078e839 in handle_fatal_signal (sig=6) at /data/opt/alexey.s/git/ps-lb.debug/sql/signal_handler.cc:151
#5 <signal handler called>
#6 0x00007f5b8d04a418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#7 0x00007f5b8d04c01a in __GI_abort () at abort.c:89
#8 0x0000000000762ae4 in ut_dbg_assertion_failed (expr=expr@entry=0x15e2003 "count != UINT32_MAX", file=file@entry=0x15dc328 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic", line=line@entry=1024) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/ut/ut0dbg.cc:67
#9 0x000000000110d450 in buf_block_unfix (bpage=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1024
#10 buf_block_unfix (block=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1036
#11 buf_block_buf_fix_dec (block=0x7f5b6547a280) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/include/buf0buf.ic:1047
#12 buf_page_optimistic_get (rw_latch=1, block=block@entry=0x7f5b6547a280, modify_clock=modify_clock@entry=3104, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=mtr@entry=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/buf/buf0buf.cc:4856
#13 0x00000000010e846c in btr_cur_optimistic_latch_leaves (block=0x7f5b6547a280, modify_clock=3104, latch_mode=latch_mode@entry=0x7f5b4f571ce8, cursor=cursor@entry=0x7f5b32af3288, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/btr/btr0cur.cc:445
#14 0x00000000010f6595 in btr_pcur_restore_position_func (latch_mode=latch_mode@entry=1, cursor=cursor@entry=0x7f5b32af3288, file=file@entry=0x15ebc50 "/data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc", line=line@entry=3675, mtr=mtr@entry=0x7f5b4f572c50) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/btr/btr0pcur.cc:290
#15 0x0000000000761561 in sel_restore_position_for_mysql (same_user_rec=same_user_rec@entry=0x7f5b4f571e30, pcur=pcur@entry=0x7f5b32af3288, moves_up=moves_up@entry=1, mtr=mtr@entry=0x7f5b4f572c50, latch_mode=1) at /data/opt/alexey.s/git/ps-lb.debug/storage/innobase/row/row0sel.cc:3675
#16 0x0000000001039963 in row_search_mvcc (buf=buf@entry=0x7f5b32abc430 "", mode=PAGE_CUR_GE, mode@entry=PAGE_CUR_UNSUPP, prebuilt=0x7f5b32af3088, match_mode=match_...

tags: added: bp-split

Triggering execution history:

Thread 1 calls buf_page_optimistic_get with some block pointer
Thread 2 (flushes and) evicts the same block, places it on the free list
Thread 3 calls buf_page_read, buf_page_read_low, buf_page_init_for_read, buf_page_init
Thread 3 sets block->state to BUF_BLOCK_FILE_PAGE
Thread 1 locks block mutex
Thread 1 checks block->state == BUF_BLOCK_FILE_PAGE
Thread 1 increments block.page->buf_fix_count
Thread 3 calls buf_page_init_low
Thread 3 resets buf_fix_count to zero, causing one of the possible assert failures.

It is likely the same bug as "23067038" in MySQL 8.0.0, fixed by the following commit there:

commit cbc524750c486e9fde8266439979476964917bbb
Author: Shaohua Wang <email address hidden>
Date: Wed Jun 8 10:29:10 2016 +0800

    BUG#23067038 ASSERTION FAILURE: BUF0BUF.CC:2861:BUF_PAGE_IN_FILE(BPAGE)
                 LEADS TO CORRUPT DATA

    It's a regression of wl#8423 InnoDB: Split the buffer pool mutex.

    There is a race: Thread 1, we set buf_fix_count to 0 in buf_page_init().
    Thread 2, we decrease buf_fix_count to 0xffffffff in buf_page_optimistic_get().
    Thread 2, we increase buf_fix_count to 0 in buf_page_get_gen(). Thread 3,
    we evict the page from LRU list. Thread 2, assert fails: buf_page_in_file().

    The root cause is missing block mutex protection for buf_page_init().

    Reviewed-by: Debarun Banerjee <email address hidden>
    RB: 12456

summary: - Race condition between accessing and flushing buffer pool page |
+ Race condition between buffer pool page optimistic access and eviction |
InnoDB: Failing assertion: buf_page_in_file(bpage) in
buf_page_make_young
tags: added: regression

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1060

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers