Buffer pool may fail to remove dirty pages for a particular tablespace from the flush list

Bug #1552673 reported by Laurynas Biveinis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
High
Laurynas Biveinis
5.7
Fix Released
High
Laurynas Biveinis

Bug Description

There are buffer pool flush list scans to remove dirty pages belonging to a particular tablespace. With the buffer pool mutex split, those scans might leave some dirty pages behind. This would cause a debug build crash [1], and probably something bad on release build too.

[1]

2016-03-02 17:12:38 0x7f1cb6047700 InnoDB: Assertion failure in thread 139761289557760 in file buf0lru.cc line 756
InnoDB: Failing assertion: err == DB_INTERRUPTED || buf_pool_get_dirty_pages_count(buf_pool, id, observer) == 0

Thread 1 (Thread 0x7f1cb6047700 (LWP 2502)):
#0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1 0x000000000195dad1 in my_write_core (sig=6) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/mysys/stacktrace.c:247
#2 0x0000000000ef7276 in handle_fatal_signal (sig=6) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/signal_handler.cc:223
#3 <signal handler called>
#4 0x00007f1cde2e9267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#5 0x00007f1cde2eaeca in __GI_abort () at abort.c:89
#6 0x0000000001c7a6dd in ut_dbg_assertion_failed (expr=0x23cedc8 "err == DB_INTERRUPTED || buf_pool_get_dirty_pages_count(buf_pool, id, observer) == 0", file=0x23cec30 "/mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/buf/buf0lru.cc", line=756) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/ut/ut0dbg.cc:67
#7 0x0000000001d076b3 in buf_flush_dirty_pages (buf_pool=0x3355ab8, id=12702, observer=0x0, flush=false, trx=0x0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/buf/buf0lru.cc:755
#8 0x0000000001d07bfe in buf_LRU_remove_pages (buf_pool=0x3355ab8, id=12702, buf_remove=BUF_REMOVE_FLUSH_NO_WRITE, trx=0x0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/buf/buf0lru.cc:925
#9 0x0000000001d07cb4 in buf_LRU_flush_or_remove_pages (id=12702, buf_remove=BUF_REMOVE_FLUSH_NO_WRITE, trx=0x0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/buf/buf0lru.cc:983
#10 0x0000000001d6d2a0 in fil_delete_tablespace (id=12702, buf_remove=BUF_REMOVE_FLUSH_NO_WRITE) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/fil/fil0fil.cc:2783
#11 0x0000000001b95208 in row_drop_single_table_tablespace (space_id=12702, tablename=0x7f1c5c1087e0 "test/t1#P#p0", filepath=0x7f1c5c278e38 "./test/t1#P#p0.ibd", is_temp=false, is_encrypted=false, trx=0x7f1cd6c44e40) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/row/row0mysql.cc:4199
#12 0x0000000001b965bf in row_drop_table_for_mysql (name=0x7f1c5c4e66a8 "test/t1#P#p0", trx=0x7f1cd6c44e40, drop_db=true, nonatomic=true, handler=0x0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/row/row0mysql.cc:4731
#13 0x0000000001b975df in row_drop_database_for_mysql (name=0x7f1cb6043a30 "test/t1#", trx=0x7f1cd6c44e40, found=0x7f1cb6043840) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/row/row0mysql.cc:5090
#14 0x0000000001a295ba in ha_innobase::delete_table (this=0x7f1c5c075de0, name=0x7f1cb6045020 "./test/t1") at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/innobase/handler/ha_innodb.cc:12780
#15 0x0000000000f9602c in handler::ha_delete_table (this=0x7f1c5c075de0, name=0x7f1cb6045020 "./test/t1") at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/handler.cc:4970
#16 0x0000000000f8f43f in ha_delete_table (thd=0x7f1c5c000950, table_type=0x316a860, path=0x7f1cb6045020 "./test/t1", db=0x7f1c5c075708 "test", alias=0x7f1c5c075148 "t1", generate_warning=true) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/handler.cc:2626
#17 0x000000000169cf80 in mysql_rm_table_no_locks (thd=0x7f1c5c000950, tables=0x7f1c5c075180, if_exists=false, drop_temporary=false, drop_view=false, dont_log_query=false) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_table.cc:2557
#18 0x000000000169c0cd in mysql_rm_table (thd=0x7f1c5c000950, tables=0x7f1c5c075180, if_exists=0 '\000', drop_temporary=0 '\000') at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_table.cc:2206
#19 0x0000000001609cc0 in mysql_execute_command (thd=0x7f1c5c000950, first_level=true) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_parse.cc:3707
#20 0x000000000160ff8b in mysql_parse (thd=0x7f1c5c000950, parser_state=0x7f1cb60464f0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_parse.cc:5796
#21 0x0000000001604321 in dispatch_command (thd=0x7f1c5c000950, com_data=0x7f1cb6046de0, command=COM_QUERY) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_parse.cc:1448
#22 0x00000000016031a8 in do_command (thd=0x7f1c5c000950) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/sql_parse.cc:1011
#23 0x000000000174b52b in handle_connection (arg=0x359f0b0) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/sql/conn_handler/connection_handler_per_thread.cc:313
#24 0x00000000019827af in pfs_spawn_thread (arg=0x3915610) at /mnt/workspace/mysql-5.7-param/BUILD_TYPE/debug/Host/ubuntu-wily-64bit/storage/perfschema/pfs.cc:2192
#25 0x00007f1cdef266aa in start_thread (arg=0x7f1cb6047700) at pthread_create.c:333
#26 0x00007f1cde3bae9d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

buf_flush_or_remove_pages has at least the issue, that if either buf_flush_or_remove_page or buf_flush_try_yield returns must_restart == true, the prev pointer is still followed and its target page is attempted to be processed, before exiting the loop.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

This appears to be a regression in the initial 5.6 port.

tags: added: bp-split xtradb
tags: added: regression
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Tentative 5.7 fix

diff --git a/storage/innobase/buf/buf0lru.cc b/storage/innobase/buf/buf0lru.cc
index b76607d..db63ccb 100644
--- a/storage/innobase/buf/buf0lru.cc
+++ b/storage/innobase/buf/buf0lru.cc
@@ -608,6 +608,7 @@ rescan:
       bpage != NULL;
       bpage = prev) {

+ ut_ad(!must_restart);
   ut_a(buf_page_in_file(bpage));

   /* Save the previous link because once we free the
@@ -628,9 +629,6 @@ rescan:

    /* Remove was unsuccessful, we have to try again
    by scanning the entire list from the end.
- This also means that we never released the
- flust list mutex. Therefore we can trust the prev
- pointer.
    buf_flush_or_remove_page() released the
    flush list mutex but not the LRU list mutex.
    Therefore it is possible that a new page was
@@ -647,6 +645,12 @@ rescan:
    iteration. */

    all_freed = false;
+
+ if (UNIV_UNLIKELY(must_restart)) {
+
+ /* cannot trust the prev pointer */
+ break;
+ }
   } else if (flush) {

    /* The processing was successful. And during the
@@ -654,10 +658,6 @@ rescan:
    when calling buf_page_flush(). We cannot trust
    prev pointer. */
    goto rescan;
- } else if (UNIV_UNLIKELY(must_restart)) {
-
- ut_ad(!all_freed);
- break;
   }

   ++processed;
@@ -670,6 +670,11 @@ rescan:
    /* Reset the batch size counter if we had to yield. */

    processed = 0;
+ } else if (UNIV_UNLIKELY(must_restart)) {
+
+ /* Cannot trust prev pointer now */
+ all_freed = false;
+ break;
   }

 #ifdef DBUG_OFF

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-969

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.