Provide alternative to --no-lock that stops slave but does not lock tables

Bug #792407 reported by Henrik Ingo
This bug affects 1 person
Affects: Percona XtraBackup (moved to https://jira.percona.com/projects/PXB)
Status: Fix Released
Importance: Medium
Assigned to: Rodrigo Gadea

Bug Description

Hi

I've seen some backups spuriously fail with an error message:
innobackupex: Error: Connection to mysql child process (pid=15332) timedout. (Time limit of 900 seconds exceeded. You may adjust time limit by editing the value of parameter "$mysql_response_timeout" in this script.) while waiting for reply to MySQL request: 'FLUSH TABLES WITH READ LOCK;' at /usr/bin/innobackupex line 336.

This happens in the last phase, where xtrabackup has copied all InnoDB tablespaces and is suspended so that innobackupex can lock all tables and back up MyISAM tables and other non-InnoDB tables, .frm files and such. It seems that taking the global lock fails simply because updates and inserts are happening during this time.

There's the --no-lock option, which allows innobackupex to proceed without trying to lock. This is safe if you are not writing to any MyISAM (or other non-InnoDB) tables and not doing any DDL (.frm files). These assumptions hold for us, and indeed taking a lock on all tables is a bit heavy-handed when 99% of the backup is already done in an online fashion by xtrabackup.

However, the problem is that we use replication, so the replication state will be inconsistent. In fact, with --no-lock the slave_info and binlog_info files are not written out at all (those are written inside mysql_lockall()).

It would be useful for us to have a new option (like --no-lock-but-stop-slave) that would not lock tables, but would still do STOP SLAVE SQL_THREAD where it currently does mysql_lockall(). And it should then of course proceed to write slave_info and binlog_info and other files as usual.

The relevant code in innobackupex is at lines 370-... and 1180-1210.

Would this work and be safe?
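
For illustration, the sequence proposed above (stop the slave SQL thread instead of locking, record the replication position, finish the backup, resume) might be sketched like this. This is not innobackupex code, which is a Perl script. The function name, the `run_sql` callable, and the `finish_backup` callback are hypothetical, and the CHANGE MASTER line is only an assumed format for the slave_info contents.

```python
from typing import Callable, Dict, List

# Hypothetical interface: runs one SQL statement and returns the rows as dicts.
RunSql = Callable[[str], List[Dict[str, object]]]


def backup_with_stopped_sql_thread(run_sql: RunSql,
                                   finish_backup: Callable[[], None]) -> str:
    """Stop the slave SQL thread, record its position, finish the backup.

    Returns the text that would go into a slave_info file (assumed format).
    The SQL thread is restarted even if a step fails.
    """
    run_sql("STOP SLAVE SQL_THREAD")
    try:
        rows = run_sql("SHOW SLAVE STATUS")
        if not rows:
            raise RuntimeError("server is not configured as a replication slave")
        status = rows[0]
        # These two columns identify the last master event the SQL thread applied.
        slave_info = "CHANGE MASTER TO MASTER_LOG_FILE='{}', MASTER_LOG_POS={}".format(
            status["Relay_Master_Log_File"], status["Exec_Master_Log_Pos"]
        )
        # Copy the remaining files while the applied position cannot move.
        finish_backup()
        return slave_info
    finally:
        run_sql("START SLAVE SQL_THREAD")
```

Note that no table lock is taken here, so the sketch relies on the same assumptions as --no-lock: no writes to non-InnoDB tables and no DDL while the backup finishes.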

Revision history for this message
Henrik Ingo (hingo) wrote :

Alternatively, the current --no-lock could be changed to behave as I suggest. I don't know if there are people that need to keep the slave applying.

Revision history for this message
Henrik Ingo (hingo) wrote :

Answering my own question: STOP SLAVE doesn't help with producing the binlog_info file, because the binlog position on the server being backed up will of course keep moving if tables are not locked. But it does help with producing the slave_info file, since the relay log position stands still while the slave SQL thread is stopped.

Revision history for this message
Henrik Ingo (hingo) wrote :

Reading more of the code, I realize now that --safe-slave-backup actually does what I want. I can then use it with or without --no-lock as I prefer.

The original reason for creating --safe-slave-backup is something else, so I didn't realize that it actually does a stop slave sql_thread, exactly as I want.

I will close this bug then.
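
As a small illustration of the two variants mentioned above, the following sketch builds the innobackupex command line with --safe-slave-backup, with or without --no-lock. Only those two flags come from this thread. The helper name and the backup directory are hypothetical, and any connection options a real invocation needs are left out.

```python
import shlex
from typing import List


def innobackupex_command(backup_dir: str, no_lock: bool = False) -> List[str]:
    """Build an innobackupex argument list that stops the slave SQL thread.

    backup_dir is a placeholder target directory; no_lock adds --no-lock,
    which is only safe under the assumptions discussed in the bug description.
    """
    args = ["innobackupex", "--safe-slave-backup"]
    if no_lock:
        args.append("--no-lock")
    args.append(backup_dir)
    return args


# Print both variants as shell-quoted command lines.
print(shlex.join(innobackupex_command("/data/backups")))
print(shlex.join(innobackupex_command("/data/backups", no_lock=True)))
```

The argument list can be passed to `subprocess.run` on a host where innobackupex is installed.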

Changed in percona-xtrabackup:
status: New → Invalid
Revision history for this message
Henrik Ingo (hingo) wrote :

(Apparently I can't select Won't fix status.)

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

This should actually be properly documented, both the option and the use case.
I am assigning this to Rodrigo to make sure it is.

Changed in percona-xtrabackup:
assignee: nobody → Rodrigo Gadea (rodrigo-gadea-percona)
importance: Undecided → Medium
milestone: none → 1.7.0
status: Invalid → Triaged
Revision history for this message
Henrik Ingo (hingo) wrote :

Thanks Vadim

If you're looking into it, then I think this is a good moment to ask: Why isn't --safe-slave-backup on by default?

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote : Re: [Bug 792407] Re: Provide alternative to --no-lock that stops slave but does not lock tables

Henrik,

To be honest, we did not put much thought into this option,
it was implemented by customer request, but we did not think
to make it default.

On Mon, Jun 6, 2011 at 11:48 PM, Henrik Ingo <email address hidden> wrote:
> Thanks Vadim
>
> If you're looking into it, then I think this is a good moment to ask:
> Why isn't --safe-slave-backup on by default?

--
Vadim Tkachenko, CTO, Percona Inc.

Changed in percona-xtrabackup:
status: Triaged → In Progress
Changed in percona-xtrabackup:
status: In Progress → Fix Committed
Revision history for this message
Henrik Ingo (hingo) wrote :

Hi Rodrigo

Can you link your patch to this bug so I can review what was done?

Revision history for this message
Rodrigo Gadea (rodrigo-gadea-percona) wrote :
Changed in percona-xtrabackup:
status: Fix Committed → Fix Released
Revision history for this message
Henrik Ingo (hingo) wrote :

http://www.percona.com/doc/percona-xtrabackup/innobackupex/replication_ibk.html

"Using this option is always recommended when taking backups from a master server."

Maybe I forgot something now, but why? Shouldn't matter either way on a master?

http://www.percona.com/doc/percona-xtrabackup/innobackupex/innobackupex_option_reference.html#cmdoption-innobackupex--no-lock

I think here you should add a sentence that hints to using --safe-slave-backup:

"If you are considering using --no-lock because your backups are failing to acquire the lock, this could be because incoming replication events are preventing the lock from succeeding. Please try using --safe-slave-backup to momentarily stop the replication slave thread; this may help the backup succeed, so that you don't need to resort to --no-lock."

Revision history for this message
Alexey Kopytov (akopytov) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-573
