InnoDB: Assertion failure in file buf0flu.cc line 546 | crashes if RW workload and InnoDB compression are combined

Bug #1305364 reported by Sergei Turchanov
This bug affects 2 people
Affects                                                       Status        Importance  Assigned to        Milestone
Percona Server moved to https://jira.percona.com/projects/PS  Fix Released  High        Laurynas Biveinis
5.1                                                           Invalid       Undecided   Laurynas Biveinis
5.5                                                           Fix Released  High        Laurynas Biveinis
5.6                                                           Fix Released  High        Laurynas Biveinis

Bug Description

The crash happens regularly (3 times so far) in Percona Server 56-5.6.16-rel64.0.el6.x86_64

2014-03-24 21:50:57 7f3322ffd700 InnoDB: Assertion failure in thread 139857607251712 in file buf0flu.cc line 546
InnoDB: Failing assertion: buf_page_in_file(bpage)
InnoDB: We intentionally generate a memory trap.
...
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8ca635]
/usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x645024]
/lib64/libpthread.so.0(+0xf500)[0x7f3381acf500]
/lib64/libc.so.6(gsignal+0x35)[0x7f33801848e5]
/lib64/libc.so.6(abort+0x175)[0x7f33801860c5]
/usr/sbin/mysqld[0xa4db52]
/usr/sbin/mysqld[0xa5075c]
/usr/sbin/mysqld[0xa515fc]
/usr/sbin/mysqld[0xa52b2b]
/usr/sbin/mysqld[0xa53819]
/lib64/libpthread.so.0(+0x7851)[0x7f3381ac7851]
/lib64/libc.so.6(clone+0x6d)[0x7f338023a94d]

Decoding the trace gives:
0xa4db52 buf_flush_ready_for_flush(buf_page_t*, buf_flush_t) + 82
0xa5075c buf_flush_page_and_try_neighbors(buf_page_t*, buf_flush_t, unsigned long, unsigned long*) + 124
0xa515fc buf_flush_batch(buf_pool_t*, buf_flush_t, unsigned long, unsigned long, bool, flush_counters_t*) + 1228
0xa52b2b buf_flush_list(unsigned long, unsigned long, unsigned long*) + 859
0xa53819 buf_flush_page_cleaner_thread + 2361

The stack traces are identical for all 3 crashes.

Database uses compressed (Barracuda) InnoDB tables.

my.cnf options relevant to InnoDB:
innodb_empty_free_list_algorithm = legacy
innodb_checksum_algorithm = INNODB
innodb_buffer_pool_size = 1024M
innodb_log_buffer_size = 262144
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table
innodb_data_file_path=ibdata1:10M:autoextend
innodb_log_files_in_group=2
innodb_file_format = 'Barracuda'

tags: added: xtradb
Laurynas Biveinis (laurynas-biveinis) wrote :

This is the same underlying issue as bug 1227581 and bug 1269352.

Alexey Kopytov (akopytov) wrote :

But both bugs should be fixed in 5.6.16-rel64.0. How can the same underlying issue (or even the same instance, as commented in the other bugs) still exist in that release?

Laurynas Biveinis (laurynas-biveinis) wrote :

This is the same underlying issue, which was not completely fixed in the other two bugs. There are several places where the buf_page_in_file check needed to be replaced with a buf_page_in_file || BUF_BLOCK_REMOVE_HASH check, and apparently at least one of them was missed in the previous two bug fixes.
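
To illustrate the kind of check being described, here is a minimal, self-contained sketch. It is not the Percona Server source: PageState, page_in_file and page_in_file_or_being_removed are hypothetical stand-ins for the InnoDB page state enum, buf_page_in_file and the relaxed check, and the set of states is a simplified assumption.

```cpp
#include <cassert>

// Simplified stand-in for the InnoDB page state enum (assumption, not the
// real definition). RemoveHash models BUF_BLOCK_REMOVE_HASH.
enum class PageState {
    PoolWatch, ZipPage, ZipDirty, NotUsed, ReadyForUse, FilePage, Memory, RemoveHash
};

// Models the strict buf_page_in_file() check: true only for states that
// correspond to a page backed by a data file.
bool page_in_file(PageState s) {
    return s == PageState::ZipPage
        || s == PageState::ZipDirty
        || s == PageState::FilePage;
}

// Models the relaxed check: a page that is concurrently being removed from
// the page hash is tolerated as well.
bool page_in_file_or_being_removed(PageState s) {
    return page_in_file(s) || s == PageState::RemoveHash;
}
```

In this model, asserting page_in_file on a page that another thread has just moved to the RemoveHash state would fail, which corresponds to the reported assertion failure; the relaxed predicate accepts that state.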

Laurynas Biveinis (laurynas-biveinis) wrote :

Note to self on what needs fixing (both potentially serious issues and merely better asserts) in 5.5:
    - buf_flush_or_remove_pages: add mutex_own LRU list mutex
    - buf_flush_page_and_try_neighbors: BUF_BLOCK_REMOVE_HASH => LRU
      mutex held
    - buf_LRU_free_from_common_LRU_list must have an LRU mutex, check
      its caller too
    - buf_page_get_zip must check for BUF_BLOCK_REMOVE_HASH before
      calling buf_LRU_free_block
    - buf_pool_watch_is_sentinel: assert that either page mutex is
      locked, or that page hash is S or X latched.
    - buf_read_ahead_random must latch page hash.
    - buf_page_get_block: assert page hash latched.

tags: added: bp-split
summary: - InnoDB: Assertion failure in file buf0flu.cc line 546
+ InnoDB: Assertion failure in file buf0flu.cc line 546 | crashes if RW
+ workload and InnoDB compression are combined
Laurynas Biveinis (laurynas-biveinis) wrote :

Likewise for 5.6:
    - btr_search_validate_one_table: pull block mutex lock above
      buf_block_get_state check
    - buf_flush_page_and_try_neighbors: BUF_BLOCK_REMOVE_HASH => LRU
      mutex !held
    - buf_flush_or_remove_pages: add mutex_own LRU list mutex
    - (the reported crash) buf_flush_ready_for_flush: allow
      BUF_BLOCK_REMOVE_HASH pages if BUF_FLUSH_LIST
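
One possible reading of the last item, as a sketch only: the precondition in buf_flush_ready_for_flush accepts a BUF_BLOCK_REMOVE_HASH page when the flush type is BUF_FLUSH_LIST. The types and functions below (PageState, FlushType, Page, flush_state_is_valid, ready_for_flush) are hypothetical simplifications, and the choice to report a page under removal as not ready is an assumption, not something stated in this report.

```cpp
#include <cassert>

// Simplified stand-ins (assumptions) for the InnoDB page state and flush type.
enum class PageState { ZipPage, ZipDirty, FilePage, NotUsed, RemoveHash };
enum class FlushType { Lru, List, SinglePage };

// Models the strict buf_page_in_file() check.
bool page_in_file(PageState s) {
    return s == PageState::ZipPage
        || s == PageState::ZipDirty
        || s == PageState::FilePage;
}

// Relaxed precondition: a page being removed from the page hash is
// tolerated only for flush list batches.
bool flush_state_is_valid(PageState s, FlushType t) {
    return page_in_file(s)
        || (s == PageState::RemoveHash && t == FlushType::List);
}

struct Page {
    PageState state;
    unsigned long oldest_modification;  // 0 means the page is clean
    bool io_fixed;                      // true while I/O is pending
};

bool ready_for_flush(const Page& p, FlushType t) {
    assert(flush_state_is_valid(p.state, t));
    if (p.state == PageState::RemoveHash) {
        return false;  // assumption: a page under eviction is skipped
    }
    return p.oldest_modification != 0 && !p.io_fixed;
}
```

With the strict precondition, the page cleaner thread in this model would abort on a RemoveHash page during a flush list batch; with the relaxed one it skips the page and continues.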

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-781
