Valgrind: free in buf_page_get_gen (Invalid read in buf_flush_batch / buf_flush_list) + free in buf_page_get_gen (Invalid read in buf_flush_page_and_try_neighbors)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
| Percona Server moved to https://jira.percona.com/projects/PS |
Invalid
|
High
|
Laurynas Biveinis | |
| 5.1 |
Fix Released
|
High
|
Laurynas Biveinis | |
| 5.5 |
Fix Released
|
High
|
Laurynas Biveinis | |
| 5.6 |
Invalid
|
High
|
Laurynas Biveinis |
Bug Description
121125 7:05:01 [Note] /data/ssd/
Version: '5.5.28-
==11657== Thread 16:
==11657== Invalid read of size 8
==11657== at 0x87E0C6: buf_flush_batch (buf0flu.c:1770)
==11657== by 0x87E3FF: buf_flush_list (buf0flu.c:2047)
==11657== by 0x823CC1: srv_master_thread (srv0srv.c:3692)
==11657== by 0x38E260677C: start_thread (pthread_
==11657== by 0x38E16D3C1C: clone (in /lib64/libc-2.5.so)
==11657== Address 0x152e01a8 is 56 bytes inside a block of size 152 free'd
==11657== at 0x4A072BA: free (vg_replace_
==11657== by 0x87446A: buf_page_get_gen (buf0buf.ic:886)
==11657== by 0x85ABD1: btr_free_
==11657== by 0x863723: btr_cur_
==11657== by 0x926276: row_undo_ins (row0uins.c:117)
==11657== by 0x91EC13: row_undo_step (row0undo.c:320)
==11657== by 0x913049: que_run_threads (que0que.c:1257)
==11657== by 0x839CB8: trx_general_
==11657== by 0x8077F0: row_mysql_
==11657== by 0x80892B: row_insert_
==11657== by 0x7EFA89: ha_innobase:
==11657== by 0x69B7E1: handler:
==11657== by 0x5673D0: write_record(THD*, TABLE*, st_copy_info*) (sql_insert.
==11657== by 0x56E463: mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (sql_insert.cc:956)
==11657== by 0x57EE0B: mysql_execute_
==11657== by 0x583112: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==11657==
==11657== Invalid read of size 1
==11657== at 0x87D8FB: buf_flush_
==11657== by 0x87E008: buf_flush_batch (buf0flu.c:1738)
==11657== by 0x87E3FF: buf_flush_list (buf0flu.c:2047)
==11657== by 0x823CC1: srv_master_thread (srv0srv.c:3692)
==11657== by 0x38E260677C: start_thread (pthread_
==11657== by 0x38E16D3C1C: clone (in /lib64/libc-2.5.so)
==11657== Address 0x152e017b is 11 bytes inside a block of size 152 free'd
==11657== at 0x4A072BA: free (vg_replace_
==11657== by 0x87446A: buf_page_get_gen (buf0buf.ic:886)
==11657== by 0x85ABD1: btr_free_
==11657== by 0x863723: btr_cur_
==11657== by 0x926276: row_undo_ins (row0uins.c:117)
==11657== by 0x91EC13: row_undo_step (row0undo.c:320)
==11657== by 0x913049: que_run_threads (que0que.c:1257)
==11657== by 0x839CB8: trx_general_
==11657== by 0x8077F0: row_mysql_
==11657== by 0x80892B: row_insert_
==11657== by 0x7EFA89: ha_innobase:
==11657== by 0x69B7E1: handler:
==11657== by 0x5673D0: write_record(THD*, TABLE*, st_copy_info*) (sql_insert.
==11657== by 0x56E463: mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (sql_insert.cc:956)
==11657== by 0x57EE0B: mysql_execute_
==11657== by 0x583112: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==11657==
==11657== Invalid read of size 1
[...]
[........ numerous occurrences of Invalid read of size 1 or 4 or 8 in buf_flush_
[...]
==11657==
==11657== HEAP SUMMARY:
==11657== in use at exit: 1,560 bytes in 5 blocks
==11657== total heap usage: 3,162,238 allocs, 3,162,233 frees, 32,397,098,285 bytes allocated
==11657==
==11657== LEAK SUMMARY:
==11657== definitely lost: 0 bytes in 0 blocks
==11657== indirectly lost: 0 bytes in 0 blocks
==11657== possibly lost: 0 bytes in 0 blocks
==11657== still reachable: 0 bytes in 0 blocks
==11657== suppressed: 1,560 bytes in 5 blocks
==11657==
==11657== For counts of detected and suppressed errors, rerun with: -v
==11657== ERROR SUMMARY: 12 errors from 12 contexts (suppressed: 4 from 4)
Related branches
- Alexey Kopytov (community): Resubmit on 2013-08-15
- Laurynas Biveinis: Pending requested 2013-01-23
-
Diff: 237 lines (+63/-6)5 files modifiedPercona-Server/storage/innodb_plugin/buf/buf0buf.c (+29/-3)
Percona-Server/storage/innodb_plugin/buf/buf0flu.c (+29/-3)
Percona-Server/storage/innodb_plugin/include/buf0buf.h (+3/-0)
Percona-Server/storage/innodb_plugin/include/sync0sync.h (+1/-0)
Percona-Server/storage/innodb_plugin/sync/sync0sync.c (+1/-0)
- Alexey Kopytov (community): Resubmit on 2013-08-15
- Laurynas Biveinis: Pending requested 2013-01-23
-
Diff: 297 lines (+75/-11)6 files modifiedPercona-Server/storage/innobase/buf/buf0buf.c (+30/-3)
Percona-Server/storage/innobase/buf/buf0flu.c (+36/-8)
Percona-Server/storage/innobase/handler/ha_innodb.cc (+2/-0)
Percona-Server/storage/innobase/include/buf0buf.h (+4/-0)
Percona-Server/storage/innobase/include/sync0sync.h (+2/-0)
Percona-Server/storage/innobase/sync/sync0sync.c (+1/-0)
- Laurynas Biveinis (community): Approve on 2013-08-16
- Alexey Kopytov (community): Approve on 2013-08-15
- George Ormond Lorch III: Approve (g2) on 2013-06-06
-
Diff: 515 lines (+102/-79)6 files modifiedPercona-Server/storage/innodb_plugin/buf/buf0buf.c (+24/-18)
Percona-Server/storage/innodb_plugin/buf/buf0flu.c (+56/-47)
Percona-Server/storage/innodb_plugin/buf/buf0lru.c (+11/-7)
Percona-Server/storage/innodb_plugin/include/buf0buf.h (+4/-4)
Percona-Server/storage/innodb_plugin/include/buf0flu.h (+4/-1)
Percona-Server/storage/innodb_plugin/include/buf0lru.h (+3/-2)
- Laurynas Biveinis (community): Approve on 2013-08-16
- Alexey Kopytov (community): Approve on 2013-08-15
- George Ormond Lorch III: Approve (g2) on 2013-06-06
-
Diff: 200 lines (+45/-41)2 files modifiedPercona-Server/storage/innobase/buf/buf0buf.c (+10/-5)
Percona-Server/storage/innobase/buf/buf0flu.c (+35/-36)
Roel Van de Paar (roel11) wrote : | #1 |
Roel Van de Paar (roel11) wrote : | #2 |
In Debug + Valgrind, the sever asserts with:
130101 11:04:32 InnoDB: Assertion failure in thread 395134720 in file buf0flu.c line 482
InnoDB: Failing assertion: bpage->
Just after giving errors like these:
==19340== Invalid read of size 8
==19340== at 0x8E9F6B: buf_flush_
==19340== by 0x8EC87E: buf_flush_batch (buf0flu.c:1740)
==19340== by 0x8ED41D: buf_flush_list (buf0flu.c:2047)
==19340== by 0x8734D6: srv_master_thread (srv0srv.c:3512)
==19340== by 0x3DA1807D13: start_thread (in /usr/lib64/
==19340== by 0x3DA14F168C: clone (in /usr/lib64/
==19340== Address 0x4c5da00 is 128 bytes inside a block of size 192 free'd
==19340== at 0x4A06666: free (vg_replace_
==19340== by 0x8A9543: ut_free (ut0mem.c:221)
==19340== by 0x8E1E78: buf_page_get_gen (buf0buf.ic:886)
==19340== by 0x8E386B: buf_page_create (buf0buf.c:3760)
==19340== by 0x940759: fsp_page_create (fsp0fsp.c:1546)
==19340== by 0x941899: fseg_alloc_
==19340== by 0x9426E2: fseg_alloc_
==19340== by 0x8B05FF: btr_page_alloc (btr0btr.c:993)
==19340== by 0x8C5918: btr_store_
==19340== by 0x9CA4B3: row_ins_
==19340== by 0x9CDE87: row_ins_index_entry (row0ins.c:2297)
==19340== by 0x9CEBCD: row_ins_step (row0ins.c:2373)
==19340== by 0x84014F: row_insert_
==19340== by 0x81CDDB: ha_innobase:
==19340== by 0x6BA555: handler:
==19340== by 0x585B2E: write_record(THD*, TABLE*, st_copy_info*) (sql_insert.
Backtrace:
Thread 5 (LWP 19421):
+bt
#0 0x0000003da180c60c in pthread_kill () from /lib64/
#1 0x00000000007dbe18 in my_write_core (sig=6) at /ssd/BT-
#2 0x00000000006b1a68 in handle_fatal_signal (sig=6) at /ssd/BT-
#3 <signal handler called>
#4 0x0000003da1435935 in raise () from /lib64/libc.so.6
#5 0x0000003da14370e8 in abort () from /lib64/libc.so.6
#6 0x00000000008e9fab in buf_flush_
#7 buf_flush_
#8 0x00000000008ec87f in buf_flush_
#9 buf_flush_batch (buf_pool=
Roel Van de Paar (roel11) wrote : | #3 |
Server still asserts w/ --innodb_
Roel Van de Paar (roel11) wrote : | #4 |
Server *does not* assert w/ --innodb_
Roel Van de Paar (roel11) wrote : | #5 |
Was able to assert server w/ --innodb_
Roel Van de Paar (roel11) wrote : | #6 |
Discussed w/ Laurynas. This bug looks now unrelated to bug 1083058. Laurynas doing code analysis. Roel has testcase which semi-reliably reproduces the sporadic issue. Issue is single thread reproducible, but executing same sql in many threads helps w/ reproducibility.
Ovais Tariq (ovais-tariq) wrote : | #7 |
Roel,
I see that you only tried different flushing methods by changing the parameter --innodb_
Roel Van de Paar (roel11) wrote : | #8 |
Correction on #6:
--innodb_
When testing with: --innodb_
So, bug 1083058 may still be related to this bug.
Roel Van de Paar (roel11) wrote : | #9 |
Marking as duplicate of bug 1083058
<laurynas> Roel: can you please merge https:/
Roel -
Is 5.1 affected?
I have a theory based on instrumented runs.
We need two threads, one at buf_page_get_gen and another at
buf_flush_
compressed (i.e. state = BUF_BLOCK_
1) buf_flush_
page from the flush list, releases flush_list_mutex.
2) buf_page_get_gen attempts to read the same page, acquires
zip_mutex, gets the dirty compressed page descriptor, allocates
an uncompressed page, and replaces the descriptor with the page in
the flush_list by a buf_flush_
call acquires flush_list_mutex, sets in_flush_list = FALSE; but
leaves oldest_modification alone, releases the flush_list_mutex.
Caller then invalidates the descriptor memory, releases zip_mutex,
frees its memory.
3) buf_flush_
compressed page descriptor. Assuming that the memory was not
overwritten since deallocation, the code reads oldest_modification
!= 0 and then trips on !in_flush_list.
Roel Van de Paar (roel11) wrote : | #12 |
Laurynas, is this related to Bug 934377 - InnoDB: Failing assertion: (buf_pool-
It could be. The backtrace of that bug seems to be a bit removed from the underlying cause, so hard to say ATM.
Ovais Tariq (ovais-tariq) wrote : | #14 |
Laurynas,
What about the case when compressed tablespace is not being used. I suppose dirty read from the flush_list is still done nevertheless, is that correct?
Roel Van de Paar (roel11) wrote : | #15 |
5.1 Testing has revealed no assert, nor any related Valgrind issues.
Several rounds of patch>testing have resulted in a prototype patch being available which no longer asserts, no Valgrind's'
Ovais -
Yes, that's bug 1083058. For non-compressed tables the current bug does not trigger.
Roel Van de Paar (roel11) wrote : | #17 |
Assert in buf_page_get_gen in bug 1100178
Roel Van de Paar (roel11) wrote : | #18 |
Assert now also seen in 5.1
130117 9:24:09 InnoDB: Assertion failure in thread 386553600 in file buf/buf0flu.c line 407
InnoDB: Failing assertion: bpage->
Directly after a Valgrind warning
==7461== Thread 16:
==7461== Invalid read of size 1
==7461== at 0x97724D: buf_page_get_state (buf0buf.ic:176)
==7461== by 0x9775C8: buf_page_get_mutex (buf0buf.ic:319)
==7461== by 0x977638: buf_page_
==7461== by 0x984865: buf_flush_batch (buf0flu.c:1470)
==7461== by 0x9354CB: srv_master_thread (srv0srv.c:3108)
==7461== by 0x3DA1807D13: start_thread (in /usr/lib64/
==7461== by 0x3DA14F168C: clone (in /usr/lib64/
==7461== Address 0x19ff03f8 is 8 bytes inside a block of size 192 free'd
==7461== at 0x4A06666: free (vg_replace_
==7461== by 0x9577E8: ut_free (ut0mem.c:227)
==7461== by 0x9780AE: buf_page_
==7461== by 0x97C87F: buf_page_get_gen (buf0buf.c:2079)
==7461== by 0x97EA44: buf_page_create (buf0buf.c:2972)
==7461== by 0x9B898E: fsp_page_create (fsp0fsp.c:1608)
==7461== by 0x9BBD9E: fseg_alloc_
==7461== by 0x9BBF18: fseg_alloc_
==7461== by 0x95BF31: btr_page_alloc_low (btr0btr.c:970)
==7461== by 0x95BFA0: btr_page_alloc (btr0btr.c:1005)
==7461== by 0x96C505: btr_store_
==7461== by 0x911BA7: row_ins_
==7461== by 0x911CCA: row_ins_index_entry (row0ins.c:2297)
==7461== by 0x911FF9: row_ins_
==7461== by 0x9122C3: row_ins (row0ins.c:2514)
==7461== by 0x912594: row_ins_step (row0ins.c:2624)
==7461==
==7461== Invalid read of size 1
==7461== at 0x97724D: buf_page_get_state (buf0buf.ic:176)
==7461== by 0x9775C8: buf_page_get_mutex (buf0buf.ic:319)
==7461== by 0x981F8D: buf_flush_
==7461== by 0x984881: buf_flush_batch (buf0flu.c:1476)
==7461== by 0x9354CB: srv_master_thread (srv0srv.c:3108)
==7461== by 0x3DA1807D13: start_thread (in /usr/lib64/
==7461== by 0x3DA14F168C: clone (in /usr/lib64/
==7461== Address 0x19ff03f8 is 8 bytes inside a block of size 192 free'd
==7461== at 0x4A06666: free (vg_replace_
==7461== by 0x9577E8: ut_free (ut0mem.c:227)
==7461== by 0x9780AE: buf_page_
==7461== by 0x97C87F: buf_page_get_gen (buf0buf.c:2079)
==7461== by 0x97EA44: buf_page_create (buf0buf.c:2972)
==7461== by 0x9B898E: fsp_page_create (fsp0fsp.c:1608)
==7461== by 0x9BBD9E: fseg_alloc_
==7461== by 0x9BBF18: fseg_alloc_
==7461== by 0x95BF31: btr_page_alloc_low (btr0btr.c:970)
==7461== by 0x95BFA0: btr_page_alloc (btr0btr.c:1005)
==7461== by 0x96C505: btr_store_
==7461== by 0x911BA7: row_ins_
==7461== by 0x911CCA: row_ins_index_entry (row0ins.c:2297)
==7461== by 0x911FF9: row_ins_
==7461== ...
Roel Van de Paar (roel11) wrote : | #19 |
Roel Van de Paar (roel11) wrote : | #20 |
Assert seen again in QA-16274-5.5 tree at revid Percona-
Roel Van de Paar (roel11) wrote : | #21 |
Roel Van de Paar (roel11) wrote : | #22 |
Roel Van de Paar (roel11) wrote : | #23 |
If we should split this issue into a new bug for clarity etc, please let me know and I'll be happy to arrange.
Roel -
The last error is indeed unrelated and should be split to a new bug.
Roel Van de Paar (roel11) wrote : | #25 |
Split to bug 1110102 from comment 20 up to 24.
tags: | added: xtradb |
Related: bug 1122462.
Percona now uses JIRA for bug reports so this bug report is migrated to: https:/
This bug is a split from bug 1083058