innobackupex does not detect lock errors

Bug #1022874 reported by pservit
26
This bug affects 5 people
Affects Status Importance Assigned to Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB
Expired
Undecided
Unassigned
2.0
Expired
Undecided
Unassigned
2.1
Expired
Undecided
Unassigned

Bug Description

OS: Linux db 2.6.32-220.13.1.el6.x86_64 #1 SMP Tue Apr 17 23:56:34 BST 2012 x86_64 x86_64 x86_64 GNU/Linux

Percona-XtraDB-Cluster-server-5.5.24-23.6.340.rhel6.x86_64
percona-xtrabackup-2.0.1-446.rhel6.x86_64

after lock all tables error backup enters an endless loop:

>> log scanned up to (13444798386645)

120709 09:36:49 innobackupex: Continuing after ibbackup has suspended
120709 09:36:49 innobackupex: Starting mysql with options: --defaults-file='/etc/mysql/my.cnf' --password=xxxxxxxx --user='galera' --unbuffered --
120709 09:36:49 innobackupex: Connected to database with mysql child process (pid=13930)
120709 09:36:51 innobackupex: Starting to lock all tables...
>> log scanned up to (13444798539790)
innobackupex: Error: mysql child process has died: ERROR 1205 (HY000) at line 7: Lock wait timeout exceeded; try restarting transaction
 while waiting for reply to MySQL request: 'FLUSH TABLES WITH READ LOCK;' at /usr//bin/innobackupex line 374.
>> log scanned up to (13444798601871)
>> log scanned up to (13444798633206)
>> log scanned up to (13444798650530)
>> log scanned up to (13444798663011)
>> log scanned up to (13444798681473)
>> log scanned up to (13444798694323)
>> log scanned up to (13444798708209)
>> log scanned up to (13444798735076)
>> log scanned up to (13444798826121)
>> log scanned up to (13444798854510)
>> log scanned up to (13444798869919)
>> log scanned up to (13444798893306)
>> log scanned up to (13444798912747)
>> log scanned up to (13444799007107)

Revision history for this message
Steve Fatula (rs26) wrote :

Happens to me as well:

120709 14:00:54 innobackupex: Continuing after ibbackup has suspended
120709 14:00:54 innobackupex: Starting mysql with options: --defaults-file='/etc/my.cnf' --password=xxxxxxxx --user='bkpuser' --unbuffered --
120709 14:00:54 innobackupex: Connected to database with mysql child process (pid=23529)
>> log scanned up to (689 3445237648)
120709 14:00:56 innobackupex: Starting to lock all tables...
>> log scanned up to (689 3445237648)
innobackupex: Error: mysql child process has died: ERROR 1160 (08S01) at line 7: Got an error writing communication packets
 while waiting for reply to MySQL request: 'FLUSH TABLES WITH READ LOCK;' at /usr/bin/innobackupex line 374.

The user certainly has permissions to do the flush, and, I did so using that user within mysql client, so, I thought it might have been that but it is not.

This is with 100% innodb.

Revision history for this message
tenma (sima-yi) wrote :

This affects me also. Multiple servers.
CentOS 5.7 X86_64.
350MB database.
Also 100% innodb with full permissions.

I have tried leaving it running for over 24 hours and it keeps looping:
...
...
[01] Compressing and streaming ./dbname/REIGHT.ibd
[01] ...done
[01] Compressing and streaming ./dbname/RNINE.ibd
[01] ...done
[01] Compressing and streaming ./dbname/RTEN.ibd
[01] ...done
>> log scanned up to (13198639658)
>> log scanned up to (13198639658)
>> log scanned up to (13198639658)
>> log scanned up to (13198639658)
>> log scanned up to (13198639658)
-Infinite loop starts here.
The actual backup file (xbstream) was not updated since 2 minutes after I fired off the backup.

We will have to revert all servers back to mysqldump until this is sorted.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

tenma,

did you have any timeout-related messages in the log? If not, it's probably a separate issue.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

In case there are no errors messages in the log, the problem is likely the same as in bug #1007446 (i.e. stale xtrabackup_suspended file in $tmpdir).

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Could not reproduce against XtraBackup 2.0.5. Steps taken:
edit innobackupex and change my $mysql_response_timeout = 900; to my $mysql_response_timeout = 10; around line 57.
start mysql session and run select count(*)+sleep(1200) from sometable;
run innobackupex --defaults-file=... .

innobackupex goes through its motions and gets to:
innobackupex: Starting to lock all tables...
>> log scanned up to (8599132)
>> log scanned up to (8599132)
>> log scanned up to (8599132)
innobackupex: Error: Connection to mysql child process (pid=13822) timedout. (Time limit of 10 seconds exceeded. You may adjust time limit by editing the value of parameter "$mysql_response_timeout" in this script.) while waiting for reply to MySQL request: 'FLUSH TABLES WITH READ LOCK;' at ./innobackupex line 381.

It times out after the expected 10 seconds and innobackupex terminates.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona XtraBackup 2.1 because there has been no activity for 60 days.]

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona XtraBackup 2.0 because there has been no activity for 60 days.]

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-3

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.