Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC

Bug #1522385
Comment #3

Krunal Bauskar (krunal-bauskar) wrote on 2015-12-17:

Some more investigation notes:
The workload (delete from t where i = 2) is executed on the async master.
Given that the same workload has already been executed independently on the async slave, re-execution of it by the slave thread on the async slave will not have any effect.
(Ideally, it should cause the SLAVE thread to stop, as it has detected an inconsistency.)
The user has configured the async slave to ignore the error from this action, allowing the trx to proceed (instead of causing it to fail and roll back).
Here is the relevant function in the SLAVE thread that handles masking of the returned error:
Rows_log_event::handle_idempotent_and_ignored_errors() ..... sql/log_event.cc
So even if the trx fails, it is allowed to proceed as usual, which in turn causes the MySQL GTID:seqno to be incremented.
Since this is an out-of-course action (error-correction path), it is not replicated by Galera.
(The action at the storage engine (SE) level has failed, so nothing was written to the bin-log for Galera to replicate, and the corrective action, which is slave-specific,
is then applied in SLAVE space after the SE confirms the error.)
Another interesting aspect: if I change the master workload from a point query to "delete from t where i >= 2" and keep everything else the same,
the async slave fails on the first matching row (i = 2), so the rest of the condition is not executed. Follow-up rows, say (3) and (4), could be deleted but aren't,
causing further inconsistency. [Now MASTER has only rows with i < 2, but SLAVE still has rows with i > 2 as well.]
(Trying i <= 2 has a different effect: the SLAVE hits the error condition last, so all rows with i <= 2 end up deleted.)
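The range-delete behaviour described above can be illustrated with a small toy model. This is only a sketch, not MySQL code: the function name `apply_delete_event`, the starting rows (1), (2), (3), (4), and the assumption that rows are applied in ascending order of i are all hypothetical and chosen for illustration.

```python
# Toy model of a slave applying the rows of one replicated DELETE.
# Assumption: rows are applied in ascending order, and on the first
# "row not found" error the error is masked but the remaining rows
# of the event are not applied.
def apply_delete_event(slave_rows, rows_to_delete):
    remaining = set(slave_rows)
    for row in rows_to_delete:
        if row not in remaining:
            # Error is ignored, but the rest of the event is abandoned.
            break
        remaining.remove(row)
    return remaining


initial = {1, 2, 3, 4}
slave = initial - {2}  # i = 2 was already deleted independently on the slave

# Master runs: delete from t where i >= 2
master_ge = {i for i in initial if not i >= 2}
slave_ge = apply_delete_event(slave, sorted(i for i in initial if i >= 2))

# Master runs: delete from t where i <= 2
master_le = {i for i in initial if not i <= 2}
slave_le = apply_delete_event(slave, sorted(i for i in initial if i <= 2))

print("i >= 2:", sorted(master_ge), sorted(slave_ge))
print("i <= 2:", sorted(master_le), sorted(slave_le))
```

In this model the i >= 2 case leaves the slave with rows (3) and (4) that the master no longer has, while the i <= 2 case happens to converge because the missing row is reached last.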
In short, a user who chooses to skip such errors should be aware of the consequences and of the inconsistency it may cause.