This is the answer to the question of why I can't repeat the bug on 5.7.20 and why it can be repeated in the customer's environment with 5.7.20. For this analysis I use the following code, which contains my additional diagnostics and some variants of the test:

https://github.com/vlad-lesin/percona-server/tree/lp-1735555-xa-transaction-lock-5.7.19-logging-and-stable-test
https://github.com/vlad-lesin/percona-server/tree/lp-1735555-xa-transaction-lock-5.7.20-logging-and-stable-test

1) Let's look at the backtrace where the S-lock is set on the 5.7.19 slave:

==================bt 1.1==============================
#1 0x0000000001ae40bb in lock_rec_lock (impl=false, mode=2, block=0x7fffea66ff60, heap_no=1, index=0x7fff9c017610, thr=0x7fff9c02b300) at ./storage/innobase/lock/lock0lock.cc:2085
#2 0x0000000001aee27d in lock_clust_rec_read_check_and_lock (flags=0, block=0x7fffea66ff60, rec=0x7fffeac58070 "supremumu\210", index=0x7fff9c017610, offsets=0x7ffff10fcd30, mode=LOCK_S, gap_mode=0, thr=0x7fff9c02b300) at ./storage/innobase/lock/lock0lock.cc:6321
#3 0x0000000001b99e22 in row_ins_set_shared_rec_lock (type=0, block=0x7fffea66ff60, rec=0x7fffeac58070 "supremumu\210", index=0x7fff9c017610, offsets=0x7ffff10fcd30, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:1498
#4 0x0000000001b9a55d in row_ins_check_foreign_constraint (check_ref=1, foreign=0x7fff9c034e00, table=0x7fff9c02da90, entry=0x7fff9c02bba8, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:1725
#5 0x0000000001b9ac49 in row_ins_check_foreign_constraints (table=0x7fff9c02da90, index=0x7fff9c0350c0, entry=0x7fff9c02bba8, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:1964
#6 0x0000000001b9e7ac in row_ins_sec_index_entry (index=0x7fff9c0350c0, entry=0x7fff9c02bba8, thr=0x7fff9c02b300, dup_chk_only=false) at ./storage/innobase/row/row0ins.cc:3400
#7 0x0000000001b9ea7b in row_ins_index_entry (index=0x7fff9c0350c0, entry=0x7fff9c02bba8, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:3477
#8 0x0000000001b9efd5 in row_ins_index_entry_step (node=0x7fff9c02b088, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:3625
#9 0x0000000001b9f379 in row_ins (node=0x7fff9c02b088, thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:3767
#10 0x0000000001b9f98d in row_ins_step (thr=0x7fff9c02b300) at ./storage/innobase/row/row0ins.cc:3952
#11 0x0000000001bc0895 in row_insert_for_mysql_using_ins_graph (mysql_rec=0x7fff9c0334b0 "\375\001", prebuilt=0x7fff9c02aae0) at ./storage/innobase/row/row0mysql.cc:2278
#12 0x0000000001bc0e8c in row_insert_for_mysql (mysql_rec=0x7fff9c0334b0 "\375\001", prebuilt=0x7fff9c02aae0) at ./storage/innobase/row/row0mysql.cc:2402
#13 0x0000000001a546da in ha_innobase::write_row (this=0x7fff9c02ee30, record=0x7fff9c0334b0 "\375\001") at ./storage/innobase/handler/ha_innodb.cc:8278
#14 0x0000000000fd7b48 in handler::ha_write_row (this=0x7fff9c02ee30, buf=0x7fff9c0334b0 "\375\001") at ./sql/handler.cc:8434
#15 0x000000000188a1af in write_record (thd=0x7fff9c000950, table=0x7fff9c02e470, info=0x7ffff10fdfb0, update=0x7ffff10fe030) at ./sql/sql_insert.cc:1875
#16 0x0000000001886ff9 in Sql_cmd_insert::mysql_insert (this=0x7fff9c0069f8, thd=0x7fff9c000950, table_list=0x7fff9c006468) at ./sql/sql_insert.cc:769
#17 0x000000000188e139 in Sql_cmd_insert::execute (this=0x7fff9c0069f8, thd=0x7fff9c000950) at ./sql/sql_insert.cc:3117
#18 0x000000000164b4e0 in mysql_execute_command (thd=0x7fff9c000950, first_level=true) at ./sql/sql_parse.cc:3748
#19 0x0000000001651ba6 in mysql_parse (thd=0x7fff9c000950,
    parser_state=0x7ffff10ff510) at ./sql/sql_parse.cc:5891
#20 0x00000000018d3c7b in Query_log_event::do_apply_event (this=0x7fff9c00eed0, rli=0x3917d20, query_arg=0x7fff9c017219 "INSERT INTO t2 VALUES (1, 100000)", q_len_arg=33) at ./sql/log_event.cc:4718
#21 0x00000000018d2aa3 in Query_log_event::do_apply_event (this=0x7fff9c00eed0, rli=0x3917d20) at ./sql/log_event.cc:4437
#22 0x00000000018cf81a in Log_event::apply_event (this=0x7fff9c00eed0, rli=0x3917d20) at ./sql/log_event.cc:3447
#23 0x000000000194b7e1 in apply_event_and_update_pos (ptr_ev=0x7ffff10ff8a0, thd=0x7fff9c000950, rli=0x3917d20) at ./sql/rpl_slave.cc:4762
#24 0x000000000194cfa9 in exec_relay_log_event (thd=0x7fff9c000950, rli=0x3917d20) at ./sql/rpl_slave.cc:5277
#25 0x0000000001954395 in handle_slave_sql (arg=0x38b8120) at ./sql/rpl_slave.cc:7488
#26 0x0000000001e7cf07 in pfs_spawn_thread (arg=0x7fffa0103970) at ./storage/perfschema/pfs.cc:2188
#27 0x00007ffff6f5e6ba in start_thread (arg=0x7ffff1100700) at pthread_create.c:333
#28 0x00007ffff63f33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
========================================

The key function here is row_ins_check_foreign_constraint(). If we look into this function, we can see the criterion for gap locking:

------------code 1.1--------------------
dberr_t
row_ins_check_foreign_constraint(...)
{
	...
	bool	skip_gap_lock;

	skip_gap_lock = (trx->isolation_level <= TRX_ISO_READ_COMMITTED);
	...
}
----------------------------------------

So gap locking is skipped if the transaction isolation level is less than or equal to "READ COMMITTED". In other words, gap locking for the procedure of checking foreign keys is enabled only for the "REPEATABLE READ" and "SERIALIZABLE" isolation levels.
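To show where this criterion bites in bt 1.1, here is a condensed paraphrase of the branch of row_ins_check_foreign_constraint() that is taken when the scan of the referenced index reaches the page supremum. This is my own shortened rendering of the 5.7 row0ins.cc code, with the error handling elided; the variable names follow frame #3 of the backtrace, not a verbatim copy of the source:

------------code 1.2 (paraphrase)--------------------
	if (page_rec_is_supremum(rec)) {

		if (skip_gap_lock) {
			/* READ COMMITTED or lower: no supremum lock,
			so nothing to conflict with the other XA
			transaction. */
			continue;
		}

		/* REPEATABLE READ / SERIALIZABLE: the S next-key lock
		on the supremum seen in bt 1.1 (frame #3: type=0 ==
		LOCK_ORDINARY, rec == supremum). */
		err = row_ins_set_shared_rec_lock(
			LOCK_ORDINARY, block, rec, check_index,
			offsets, thr);
	}
------------------------------------------------------

With the applier transaction running at READ COMMITTED this branch becomes a no-op, so the conflicting supremum S-lock is never requested.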
2) Regardless of the global transaction isolation level, the default transaction isolation level for the slave thread is "REPEATABLE READ". It can be changed during relay log event execution.

3) The original binlog format is "mixed".

a) For 5.7.19 the statements "INSERT INTO t2 VALUES (1, 100000)" and "INSERT INTO t1 VALUES (85000, NULL)", executed inside different XA transactions, are logged in statement-based format. As the default slave thread transaction isolation level is "REPEATABLE READ", the execution of the query event for "INSERT INTO t2 VALUES (1, 100000)" takes our supremum S-lock (see (1)).

b) For 5.7.20 the situation is different. First, before the binary log events of our two XA transactions there is a Gtid_log_event sent by the master. If we look into Gtid_log_event::do_apply_event() we can see the following comment:

-----------code 3b.1------------------
int Gtid_log_event::do_apply_event(Relay_log_info const *rli)
{
  ...
  /*
    If the current transaction contains no changes logged with SBR
    we can assume this transaction as a pure row based replicated one.

    Based on this assumption, we can set current transaction tx_isolation
    to READ COMMITTED in order to avoid concurrent transactions to be
    blocked by InnoDB gap locks.

    The session tx_isolation will be restored:
    - When the transaction finishes with QUERY(COMMIT|ROLLBACK),
      as the MySQL server does for ordinary user sessions;
    - When applying a Xid_log_event, after committing the transaction;
    - When applying a XA_prepare_log_event, after preparing the transaction;
    - When the applier needs to abort a transaction execution.

    Notice that when a transaction is being "gtid skipped", its statements are
    not actually executed (see mysql_execute_command()). So, the call to the
    function that would restore the tx_isolation after finishing the
    transaction may not happen.
  */
  if (DBUG_EVALUATE_IF("force_trx_as_rbr_only", true,
                       !may_have_sbr_stmts &&
                       thd->tx_isolation > ISO_READ_COMMITTED &&
                       gtid_pre_statement_checks(thd) != GTID_STATEMENT_SKIP))
  {
    DBUG_ASSERT(thd->get_transaction()->is_empty(Transaction_ctx::STMT));
    DBUG_ASSERT(thd->get_transaction()->is_empty(Transaction_ctx::SESSION));
    DBUG_ASSERT(!thd->lock);
    DBUG_PRINT("info", ("setting tx_isolation to READ COMMITTED"));
    set_tx_isolation(thd, ISO_READ_COMMITTED, true/*one_shot*/);
  }
  ...
}
------------------------------------------

As we can see, the slave thread transaction isolation level is changed during the execution of this binlog event when a certain condition holds. The backtrace for this action is the following:

==========bt 3b.1===========
(gdb) bt
#0 set_tx_isolation (thd=0x7fff9c000950, tx_isolation=ISO_READ_COMMITTED, one_shot=true) at ./sql/handler.cc:9180
#1 0x00000000018f6c1d in Gtid_log_event::do_apply_event (this=0x7fff9c016d60, rli=0x3934ac0) at ./sql/log_event.cc:13760
#2 0x00000000018d5628 in Log_event::apply_event (this=0x7fff9c016dd8, rli=0x3934ac0) at ./sql/log_event.cc:3534
#3 0x0000000001954423 in apply_event_and_update_pos (ptr_ev=0x7ffff10ff8a0, thd=0x7fff9c000950, rli=0x3934ac0) at ./sql/rpl_slave.cc:4786
#4 0x0000000001955c89 in exec_relay_log_event (thd=0x7fff9c000950, rli=0x3934ac0) at ./sql/rpl_slave.cc:5310
#5 0x000000000195d1a7 in handle_slave_sql (arg=0x38d5690) at ./sql/rpl_slave.cc:7545
#6 0x0000000001e8ded5 in pfs_spawn_thread (arg=0x7fffa01b2d50) at ./storage/perfschema/pfs.cc:2190
#7 0x00007ffff6f5e6ba in start_thread (arg=0x7ffff1100700) at pthread_create.c:333
#8 0x00007ffff63f33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
========================================

The backtrace of restoring the slave thread isolation level to "REPEATABLE READ" is the following:

===================bt 3b.2=================
#0 trans_reset_one_shot_chistics (thd=0x7fff9c000950) at ./sql/transaction.cc:56
#1 0x000000000177b26d in applier_reset_xa_trans (thd=0x7fff9c000950) at ./sql/xa.cc:1327
#2 0x000000000177a104 in Sql_cmd_xa_prepare::execute (this=0x7fff9c0061e8, thd=0x7fff9c000950) at ./sql/xa.cc:835
#3 0x00000000018e2e8d in XA_prepare_log_event::do_commit (this=0x7fff9c02f690, thd=0x7fff9c000950) at ./sql/log_event.cc:7605
#4 0x00000000018e24a9 in Xid_apply_log_event::do_apply_event (this=0x7fff9c02f778, rli=0x3934ac0) at ./sql/log_event.cc:7428
#5 0x00000000018d5628 in Log_event::apply_event (this=0x7fff9c02f778, rli=0x3934ac0) at ./sql/log_event.cc:3534
#6 0x0000000001954423 in apply_event_and_update_pos (ptr_ev=0x7ffff10ff8a0, thd=0x7fff9c000950, rli=0x3934ac0) at ./sql/rpl_slave.cc:4786
#7 0x0000000001955c89 in exec_relay_log_event (thd=0x7fff9c000950, rli=0x3934ac0) at ./sql/rpl_slave.cc:5310
#8 0x000000000195d1a7 in handle_slave_sql (arg=0x38d5690) at ./sql/rpl_slave.cc:7545
#9 0x0000000001e8ded5 in pfs_spawn_thread (arg=0x7fffa01b2d50) at ./storage/perfschema/pfs.cc:2190
#10 0x00007ffff6f5e6ba in start_thread (arg=0x7ffff1100700) at pthread_create.c:333
#11 0x00007ffff63f33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
=====================================
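A side note on the "one_shot" argument visible in bt 3b.1: as far as I understand the 5.7 code, set_tx_isolation(thd, ISO_READ_COMMITTED, true) changes only the effective isolation level of the transaction that is about to be applied (thd->tx_isolation), while the session default (thd->variables.tx_isolation) is left untouched, and trans_reset_one_shot_chistics() simply copies the session default back. In paraphrased form (my summary, not verbatim source):

-----------code 3b.2 (paraphrase)------------
  /* Gtid_log_event::do_apply_event(): one-shot change for the next
     transaction only (see code 3b.1 and bt 3b.1). */
  set_tx_isolation(thd, ISO_READ_COMMITTED, true/*one_shot*/);

  /* trans_reset_one_shot_chistics(), reached from applier_reset_xa_trans()
     when XA_prepare_log_event is applied (see bt 3b.2): the effective level
     is copied back from the session default, i.e. REPEATABLE READ. */
  thd->tx_isolation = (enum_tx_isolation) thd->variables.tx_isolation;
---------------------------------------------

That is why every transaction in sequence 3b.1 below has to be switched to "READ COMMITTED" by its own Gtid_log_event.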
The second difference between 5.7.20 and 5.7.19 is that the binary log events for the "INSERT INTO t2 VALUES (1, 100000)" and "INSERT INTO t1 VALUES (85000, NULL)" statements are in row-based format for 5.7.20 and in statement-based format for 5.7.19.

So the sequence of binary log events executed on the slave for 5.7.20 is the following:

-----------sequence 3b.1 (for 5.7.20)------------
Gtid_log_event - changes the slave thread isolation level to "READ COMMITTED";
Query_log_event - for XA START 1;
Table_map_log_event - row based;
Write_rows_log_event - row based, for "INSERT INTO t2 VALUES (1, 100000)";
Query_log_event - for XA END 1;
XA_prepare_log_event - for XA PREPARE 1, restores the slave thread isolation level to "REPEATABLE READ" (see bt 3b.2);
Gtid_log_event - changes the slave thread isolation level to "READ COMMITTED";
Query_log_event - for XA START 2;
Table_map_log_event - row based;
Write_rows_log_event - row based, for "INSERT INTO t1 VALUES (85000, NULL)";
Query_log_event - for XA END 2;
XA_prepare_log_event - for XA PREPARE 2, restores the slave thread isolation level to "REPEATABLE READ" (see bt 3b.2);
------------------------------

The sequence of binary log events executed for 5.7.19 is the following:

--------------sequence 3b.2 (for 5.7.19)-------------
Gtid_log_event - does not change the slave thread isolation level because the condition "!may_have_sbr_stmts" is not true (see code 3b.1);
Query_log_event - for XA START 1;
Query_log_event - statement based, for "INSERT INTO t2 VALUES (1, 100000)";
Query_log_event - for XA END 1;
XA_prepare_log_event - for XA PREPARE 1;
Gtid_log_event - does not change the slave thread isolation level because the condition "!may_have_sbr_stmts" is not true (see code 3b.1);
Query_log_event - for XA START 2;
Query_log_event - statement based, for "INSERT INTO t1 VALUES (85000, NULL)";
Query_log_event - for XA END 2;
XA_prepare_log_event - for XA PREPARE 2;
--------------------------------

Summing up:

The above explains why the test fails for 5.7.19 and passes for 5.7.20. For 5.7.19 the binary log events for the statements inside the XA blocks are in statement-based format, while for 5.7.20 they are in row-based format. The Gtid_log_event generated by a 5.7.20 master before the binlog events of the XA blocks has the "rbr_only" flag set to "true", which causes the slave thread transaction isolation level to be changed to "READ COMMITTED", while a 5.7.19 master does not set that flag and the slave thread transaction isolation level stays "REPEATABLE READ". As a result, there is no supremum S-lock during the foreign key check for 5.7.20, while for 5.7.19 that lock is taken because of the corresponding transaction isolation level.

The general question was why the bug can be repeated in the customer's environment with a 5.7.20 slave. The reason is that the master's version is still 5.7.19, so the binary log events are generated in the 5.7.19 sequence (see sequence 3b.2) and the slave thread keeps the "REPEATABLE READ" isolation level.
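Finally, the whole chain can be condensed into a small self-contained illustration. This is my own summary, not server code; only the ordering of the isolation levels mirrors the InnoDB TRX_ISO_* constants used in code 1.1:

-----------code 4.1 (illustrative summary, not server source)------------
#include <cstdio>

/* Same ordering as the InnoDB isolation level constants used in code 1.1. */
enum trx_iso_t {
	TRX_ISO_READ_UNCOMMITTED = 0,
	TRX_ISO_READ_COMMITTED   = 1,
	TRX_ISO_REPEATABLE_READ  = 2,
	TRX_ISO_SERIALIZABLE     = 3
};

/* The FK check takes the supremum S next-key lock (bt 1.1) only when gap
   locking is not skipped, i.e. above READ COMMITTED (code 1.1). */
static bool fk_check_takes_supremum_lock(trx_iso_t isolation_level)
{
	const bool skip_gap_lock =
		(isolation_level <= TRX_ISO_READ_COMMITTED);
	return !skip_gap_lock;
}

int main()
{
	/* 5.7.19-style events (also the customer's 5.7.19 master with a
	   5.7.20 slave): the Gtid_log_event leaves REPEATABLE READ in
	   place, so the lock is taken and the second XA transaction waits. */
	std::printf("5.7.19-style events: lock taken = %d\n",
		    fk_check_takes_supremum_lock(TRX_ISO_REPEATABLE_READ));

	/* 5.7.20 master and slave: the rbr_only Gtid_log_event switches the
	   applied transaction to READ COMMITTED (code 3b.1), no lock. */
	std::printf("5.7.20-style events: lock taken = %d\n",
		    fk_check_takes_supremum_lock(TRX_ISO_READ_COMMITTED));
	return 0;
}
--------------------------------------------------------------------------

The first case is what the 5.7.19 test and the customer's mixed-version setup hit; the second is what a pure 5.7.20 setup hits, which is why the bug does not reproduce there.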