Comment 22 for bug 1189430

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :

Reproduced this with ROLLBACK heavy pure DML load against single node.

From multimaster pure DML run, helgrind errors indicate that there might be possibility or race:

==7868== Possible data race during read of size 1 at 0x1483C0A8 by thread #32
==7868== Locks held: none
==7868== at 0x85132F: rec_get_deleted_flag (mach0data.ic:54)
==7868== by 0x85AF05: row_search_for_mysql (row0sel.c:4360)
==7868== by 0x81ED5C: ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int) (ha_innodb.cc:6812)
==7868== by 0x81EEAC: ha_innobase::rnd_next(unsigned char*) (ha_innodb.cc:7006)
==7868== by 0x7A2286: rr_sequential(READ_RECORD*) (records.cc:461)
==7868== by 0x5D0E0E: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:11801)
==7868== by 0x5D9BEC: do_select(JOIN*, List<Item>*, TABLE*, Procedure*) (sql_select.cc:11562)
==7868== by 0x5EDB08: JOIN::exec() (sql_select.cc:2385)
==7868== by 0x5E7FE6: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2604)
==7868== by 0x5EE219: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:297)
==7868== by 0x59AF7C: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5040)
==7868== by 0x5A36DE: mysql_execute_command(THD*) (sql_parse.cc:2475)
==7868== by 0x5AB83A: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6266)
==7868== by 0x5ABA90: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6099)
==7868== by 0x5AD0F8: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1214)
==7868== by 0x5AF918: do_command(THD*) (sql_parse.cc:869)
==7868== by 0x658320: do_handle_one_connection(THD*) (sql_connect.cc:887)
==7868== by 0x6584DC: handle_one_connection (sql_connect.cc:799)
==7868== by 0x4C2DC3D: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==7868== by 0x5899E99: start_thread (pthread_create.c:308)
==7868== by 0x67D7CCC: clone (clone.S:112)
==7868==
==7868== This conflicts with a previous write of size 1 by thread #29
==7868== Locks held: none
==7868== at 0x86160B: row_upd_rec_in_place (mach0data.ic:41)
==7868== by 0x8BF796: btr_cur_update_in_place (btr0cur.c:1919)
==7868== by 0x8BFDBA: btr_cur_optimistic_update (btr0cur.c:2023)
==7868== by 0x9C0D38: row_undo_mod_clust_low (row0umod.c:103)
==7868== by 0x9C23C9: row_undo_mod_clust (row0umod.c:247)
==7868== by 0x9C3694: row_undo_mod (row0umod.c:891)
==7868== by 0x9B8B6A: row_undo_step (row0undo.c:306)
==7868== by 0x9A6905: que_run_threads (que0que.c:1245)

Thread #29 stack trace was not complete, but there is a high possibility that it was still doing the same rollback as reported in previous error:

==7868== Possible data race during read of size 1 at 0x14118C98 by thread #29
==7868== Locks held: none
==7868== at 0x8DD996: buf_page_get_gen (buf0buf.ic:1000)
==7868== by 0x89698C: trx_sys_update_mysql_binlog_offset (trx0sys.ic:119)
==7868== by 0x89A7B2: trx_write_serialisation_history (trx0trx.c:893)
==7868== by 0x89C056: trx_commit_off_kernel (trx0trx.c:945)
==7868== by 0x891944: trx_finish_rollback_off_kernel (trx0roll.c:1286)
==7868== by 0x9A58AE: que_thr_dec_refer_count (que0que.c:869)
==7868== by 0x9A6A1B: que_run_threads (que0que.c:1317)
==7868== by 0x892D46: trx_general_rollback_for_mysql (trx0roll.c:100)
==7868== by 0x89332A: trx_rollback_for_mysql (trx0roll.c:149)
==7868== by 0x8180B9: innobase_rollback(handlerton*, THD*, bool) (ha_innodb.cc:3009)
==7868== by 0x6CDC15: ha_rollback_trans(THD*, bool) (handler.cc:1518)
==7868== by 0x6CE337: ha_commit_trans(THD*, bool) (handler.cc:1353)
==7868== by 0x66663B: trans_commit(THD*) (transaction.cc:199)
==7868== by 0x5A7593: mysql_execute_command(THD*) (sql_parse.cc:4143)
==7868== by 0x5AB83A: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6266)
==7868== by 0x5ABA90: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6099)
==7868== by 0x5AD0F8: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1214)
==7868== by 0x5AF918: do_command(THD*) (sql_parse.cc:869)
==7868== by 0x658320: do_handle_one_connection(THD*) (sql_connect.cc:887)
==7868== by 0x6584DC: handle_one_connection (sql_connect.cc:799)
==7868== by 0x4C2DC3D: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==7868== by 0x5899E99: start_thread (pthread_create.c:308)
==7868== by 0x67D7CCC: clone (clone.S:112)

As both threads #29 and #32 were running in local mode, it is quite likely that locally executed rollback is the culprit here.