SQL_THREAD left in stopped state if safe-slave-backup-timeout is reached

Reported by Jervin R on 2012-08-16
This bug affects 1 person
Affects             Status  Importance  Assigned to     Milestone
Percona XtraBackup          Medium      Alexey Kopytov
2.0                         Medium      Alexey Kopytov
2.1                         Medium      Alexey Kopytov

Bug Description

When using safe-slave-backup, if safe-slave-backup-timeout is reached, SQL_THREAD is left in the stopped state, causing the slave to lag behind the master.

Related to https://bugs.launchpad.net/percona-xtrabackup/+bug/1016714, but that one has a few other things discussed, so I figured this one should be separate.

Here is a quick patch for 2.0.1 that restarts the slave SQL thread before the script dies:

2587a2588,2589
> mysql_send 'START SLAVE SQL_THREAD;';
>

tags: added: contribution
Jervin R (revin) wrote :

Can we get this up on next release? Thanks!

See internal i25625

tags: added: i25625
tags: removed: i25625
Vadim Tkachenko (vadim-tk) wrote :

Well,
the fix is not complete: we need to check whether SQL_THREAD was running at the start of the backup, and send "START SLAVE SQL_THREAD" only in that case.

tags: added: i25625
Alexey Kopytov (akopytov) wrote :

It's safe to send "START SLAVE SQL_THREAD" unconditionally when the timeout is reached, as innobackupex runs START/STOP SLAVE SQL_THREAD in a loop in wait_for_safe_slave before the timeout anyway.

Alexey Kopytov (akopytov) wrote :

On second thought, we do need to check, because the slave SQL thread might indeed already be stopped, with a non-zero Slave_open_temp_tables, before the backup is taken.
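
A minimal sketch of the check discussed above, written in Python rather than innobackupex's Perl, so it is not the actual fix. It assumes two hypothetical helpers supplied by the caller: `query(sql)`, which returns the first result row as a dict, and `execute(sql)`, which runs a statement. The function name mirrors wait_for_safe_slave, but the body is only an illustration of "remember the initial state, restore it on timeout".

```python
import time


class SafeSlaveTimeout(Exception):
    """Raised when replicated temporary tables stay open past the timeout."""


def wait_for_safe_slave(query, execute, timeout=300, interval=3, sleep=time.sleep):
    """Stop the SQL thread at a point where no slave temp tables are open.

    query and execute are hypothetical helpers (see the text above).
    Returns True if the SQL thread was running on entry, so the caller
    knows whether to restart it once the backup is done.
    """
    # Remember the state before touching anything.
    status = query("SHOW SLAVE STATUS") or {}
    was_running = status.get("Slave_SQL_Running") == "Yes"

    def open_temp_tables():
        row = query("SHOW STATUS LIKE 'Slave_open_temp_tables'")
        return int(row["Value"])

    execute("STOP SLAVE SQL_THREAD")
    waited = 0
    while open_temp_tables() > 0:
        if waited >= timeout:
            # The thread is stopped at this point. Restart it only if it
            # was running before the backup started.
            if was_running:
                execute("START SLAVE SQL_THREAD")
            raise SafeSlaveTimeout(
                "slave still has open temporary tables after %d seconds" % timeout
            )
        # Let the slave apply more events, then check again.
        execute("START SLAVE SQL_THREAD")
        sleep(interval)
        waited += interval
        execute("STOP SLAVE SQL_THREAD")
    return was_running
```

On timeout the slave ends up in whatever state it had before the backup: running if it was running, stopped if it was already stopped.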

markus_albe (markus-albe) wrote :

Will the check and conditional START SLAVE SQL_THREAD be implemented? I got this behavior with 2.1.5... If not, the documentation at http://www.percona.com/doc/percona-xtrabackup/2.1/innobackupex/innobackupex_option_reference.html#cmdoption-innobackupex--safe-slave-backup should make it explicit that it is possible for a failed backup to leave the SQL thread stopped.

Alexey Kopytov (akopytov) wrote :

Markus,

This bug was about leaving the SQL thread stopped when safe-slave-backup-timeout is reached. If innobackupex fails between the SQL thread stop and restart, the thread will still be left in the stopped state, even in 2.1.5.

Reported separately as bug #1248835. Please let me know if I should prioritize it in case it is affecting a customer.
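
One way to picture the failure mode described above is a guard that restarts the SQL thread even when the backup step raises an error. The sketch below is in Python, not innobackupex's Perl, and is not how bug #1248835 was or will be addressed. `query` and `execute` are hypothetical helpers: the first returns the first result row as a dict, the second runs a statement.

```python
from contextlib import contextmanager


@contextmanager
def sql_thread_paused(query, execute):
    """Pause the slave SQL thread for the duration of a with-block.

    query and execute are hypothetical helpers (see the text above).
    The thread is restarted on exit, including when the block raises,
    but only if it was running on entry.
    """
    status = query("SHOW SLAVE STATUS") or {}
    was_running = status.get("Slave_SQL_Running") == "Yes"
    if was_running:
        execute("STOP SLAVE SQL_THREAD")
    try:
        yield was_running
    finally:
        # Runs on normal exit and on exceptions alike.
        if was_running:
            execute("START SLAVE SQL_THREAD")
```

A guard like this covers errors raised inside the process. It cannot help if the process is killed outright (for example with SIGKILL), in which case the SQL thread stays stopped until someone restarts it manually.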
