Comment 1 for bug 1421282

Thomas Roog (thomas.roog) wrote :

We tried to mitigate this behavior by adding the following settings:
innodb_lock_wait_timeout=900
slave_transaction_retries=100000
This worked until today.
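For reference, a minimal sketch of how these two settings might look in the MySQL configuration file. The [mysqld] section placement is an assumption; the comment above does not say where or how the values were set.

```ini
# Assumed placement: the [mysqld] section of my.cnf (not stated in the report)
[mysqld]
# Seconds an InnoDB transaction waits for a row lock before giving up
innodb_lock_wait_timeout = 900
# Number of times the replication SQL thread retries a failed transaction
slave_transaction_retries = 100000
```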

Without these settings, any really large query causes a system lock on the replication thread, and then this can happen.

We found something similar at
https://www.kickstarter.com/backing-and-hacking/the-day-the-replication-died

and at https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1214465, but we have
wsrep_forced_binlog_format=NONE, and pt-table-checksum runs twice a day to check the Galera cluster and the replication slaves, and the results are OK.

This is a repeatable error.