Log analysis reveals the following:
100103 0:22:25 [Note] WSREP: BF kill, with seqno: 312545
100103 0:22:25 [Note] WSREP: kill trx QUERY_COMMITTING for 31279080
1. The slave applier with seqno 312545 needs to abort a local trx that is in the committing phase.
100103 0:22:25 [Note] [Debug] WSREP: mm_galera.c:mm_galera_pre_commit():1945: interrupted in commit queue for 312547
2. The victim has already replicated and is in the commit queue; the victim's seqno is 312547.
100103 0:22:25 [Note] [Debug] WSREP: mm_galera.c:mm_galera_replay_trx():2332: replaying applier in to grab: 1, seqno: 27609851 312547
3. The victim starts replaying and enters the commit queue again.
100103 0:22:25 [Note] [Debug] WSREP: mm_galera.c:process_query_write_set():976: remote trx seqno: 27609852 312548 last_seen_trx: 27609851 27609852, cert: 0 job_queue.c:job_queue_start_job():162: job: 0 starting
100103 0:22:25 [Note] [Debug] WSREP: galera_
4. A new WS arrives with seqno 312548. Certification is OK and the slave applier starts applying it. Note that the replaying victim has already passed the cert queue, so this later remote WS can proceed. Note also that this remote trx has seen our victim trx and therefore does not conflict with it.
100103 0:22:25 [Note] [Debug] WSREP: mm_galera.c:ws_conflict_check():183: conflict check for: job1: 312547 type: 1 job2: 312548 type 0 job_queue.c:job_queue_start_job():162: job: 1 starting
100103 0:22:25 [Note] [Debug] WSREP: galera_
5. 312546 must have left the commit TO (although this is not visible here), so the replaying victim can grab the commit TO. The job queue now has our victim and slave applier 312548 running.
100103 0:22:25 [ERROR] Slave SQL: Could not execute Update_rows event on table test.sbtest; Can't find record in 'sbtest', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
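6. ...and then this error occurs. It means that 312548 has deleted the row that 312547 tries to update. This can happen because 312548 was executed on its originating node after it had seen the effects of 312547, so it passes certification; here, however, its delete is applied before the replayed 312547 gets to update the row.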
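
The sequence above can be reproduced with a small toy model. This is only a sketch to illustrate the reasoning, not Galera code: the `WriteSet` class, the `certify` rule, and the `apply_ws` and `run` helpers are all hypothetical, and the `last_seen` values are assumptions chosen to match the analysis (312548 has seen 312547). It shows that the remote write set legitimately passes certification, and that the failure comes only from applying it ahead of the replayed victim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    seqno: int      # global seqno assigned at replication
    last_seen: int  # highest seqno the originating node had seen (assumed)
    op: str         # "update" or "delete"
    key: int        # primary key of the affected row


def certify(ws, history):
    """Pass unless an unseen write set in (last_seen, seqno) touched the same key."""
    return not any(
        ws.last_seen < other.seqno < ws.seqno and other.key == ws.key
        for other in history
    )


class KeyNotFound(Exception):
    """Stands in for HA_ERR_KEY_NOT_FOUND in this toy model."""


def apply_ws(ws, table):
    if ws.key not in table:
        raise KeyNotFound(f"seqno {ws.seqno}: can't find record {ws.key}")
    if ws.op == "delete":
        del table[ws.key]
    else:
        table[ws.key] += 1


def run(order, table):
    """Apply write sets in the given order; return the error text, or None."""
    try:
        for ws in order:
            apply_ws(ws, table)
    except KeyNotFound as exc:
        return str(exc)
    return None


# The replaying victim and the later remote write set from the log.
victim = WriteSet(seqno=312547, last_seen=312546, op="update", key=1)
remote = WriteSet(seqno=312548, last_seen=312547, op="delete", key=1)

# The remote trx has seen the victim, so certification finds no conflict.
remote_certified = certify(remote, [victim])

# Applying in seqno order works; letting 312548 run first breaks the replay.
in_order = run([victim, remote], {1: 0})
out_of_order = run([remote, victim], {1: 0})

print(remote_certified)
print(in_order)
print(out_of_order)
```

In this model the certification result is correct in both cases; what differs is only the apply order, which suggests that the replaying victim has to be ordered ahead of later appliers that depend on its changes.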