We tried to mitigate this behavior by setting innodb_lock_wait_timeout=900 and slave_transaction_retries=100000, and that worked until today.
Otherwise, any really large query causes a system lock on the replication thread, and then this can happen.
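For reference, a sketch of how the mitigation would look in the server configuration. The file and the [mysqld] section placement are assumptions; only the two option names and values come from the settings described above.

```ini
# Sketch only: the [mysqld] placement is an assumption, not taken from the report.
[mysqld]
# Seconds an InnoDB transaction waits for a row lock before giving up.
innodb_lock_wait_timeout  = 900
# Number of times the replication SQL thread retries a failed transaction
# (for example after a lock wait timeout or deadlock) before stopping.
slave_transaction_retries = 100000
```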
Found something similar at https://www.kickstarter.com/backing-and-hacking/the-day-the-replication-died
and https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1214465, but we have wsrep_forced_binlog_format=NONE, and pt-table-checksum runs twice a day to check the Galera cluster and the replication slaves, and the results are OK.
This is a repeatable error.