Percona XtraBackup moved to https://jira.percona.com/projects/PXB

--apply-log stops at 99%

Bug #354817 reported by Brad Fino on 2009-04-03

Affects		Status	Importance	Assigned to	Milestone
	Percona XtraBackup moved to https://jira.percona.com/projects/PXB	Fix Released	Undecided	Unassigned

Bug Description

On xtrabackup 0.4 rpm for RHEL5, when running innobackupex --apply-log --use-memory=16000, it gets to 99% and then just hangs. I left it there for a good 12 hours and when I came back it hadn't budged ... xtrabackup is sitting in the processlist pegged at 100% CPU. Here's an strace starting from the last output of apply-log:

[pid 6765] write(2, "InnoDB: Doing recovery: scanned "..., 79InnoDB: Doing recovery: scanned up to log sequence number 80 1103001814 (99 %)
) = 79
[pid 6765] pread(8, "( \337=\0\326\0002\0\6TM\300\366\24\0\2\0\2\200\4\200\10\21\6\2\235\16\0\201v$"..., 512, 520458240) = 512
[pid 6765] futex(0x2082cd2c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x2082ccf0, 0 <unfinished ...>
[pid 6768] <... futex resumed> ) = 0
[pid 6768] futex(0x2082ccf0, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 6765] <... futex resumed> ) = 1
[pid 6765] futex(0x2082ccf0, FUTEX_WAKE, 1 <unfinished ...>
[pid 6768] <... futex resumed> ) = 0
[pid 6768] futex(0x2082ccf0, FUTEX_WAKE, 1) = 0
[pid 6768] pwrite(8, "\0\0\0\0\0\6S\247\0\0\0P\"\270\361>\0\0\t>\1\0\0\0\377\377\377\377\377\377\377\377"..., 512, 1536) = 512
[pid 6768] fsync(8 <unfinished ...>
[pid 6765] <... futex resumed> ) = 1
[pid 6765] sched_yield() = 0
[pid 6765] futex(0x3dd0192c, FUTEX_WAIT, 1, NULL <unfinished ...>
[pid 6768] <... fsync resumed> ) = 0
[pid 6768] futex(0x3dd0192c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x3dd018f0, 0) = 1
[pid 6768] futex(0x2082cd2c, FUTEX_WAIT, 3, NULL <unfinished ...>
[pid 6765] <... futex resumed> ) = 0
[pid 6765] futex(0x3dd018f0, FUTEX_WAKE, 1) = 0
[pid 6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf64373000
[pid 6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf666b3000
[pid 6765] mmap(NULL, 147841024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf689f3000
[pid 6765] open("/tmp/ibI2XSSt", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
[pid 6765] unlink("/tmp/ibI2XSSt") = 0
[pid 6765] dup(9) = 10
[pid 6765] close(9) = 0
[pid 6765] fcntl(10, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 6765] fstat(10, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
[pid 6765] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf716f1000
[pid 6765] lseek(10, 0, SEEK_CUR) = 0
[pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 114688) = 16384
[pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 16384) = 16384
[pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384
[pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 0) = 16384
[pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 65536) = 16384

At this point it just sits there. I've taken a 2nd backup from a 2nd DB server and have the same results. Backups were taken via tar stream over ssh.

Revision history for this message

Vadim Tkachenko (vadim-tk) wrote on 2009-04-03: Re: [Bug 354817] [NEW] --apply-log stops at 99%

Download full text (3.7 KiB)

Brad,

It should be fixed in recent Launchpad changesets.

I will prepare new release in two days if you can't try bzr tree by some
reason.

Brad Fino wrote:
> Public bug reported:
>
> On xtrabackup 0.4 rpm for RHEL5, when running innobackupex --apply-log
> --use-memory=16000, it gets to 99% and then just hangs. I left it there
> for a good 12 hours and when I came back it hadn't budged ...
> xtrabackup is sitting in the processlist pegged at 100% CPU. Here's an
> strace starting from the last output of apply-log:
>
> [pid 6765] write(2, "InnoDB: Doing recovery: scanned "..., 79InnoDB: Doing recovery: scanned up to log sequence number 80 1103001814 (99 %)
> ) = 79
> [pid 6765] pread(8, "( \337=\0\326\0002\0\6TM\300\366\24\0\2\0\2\200\4\200\10\21\6\2\235\16\0\201v$"..., 512, 520458240) = 512
> [pid 6765] futex(0x2082cd2c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x2082ccf0, 0 <unfinished ...>
> [pid 6768] <... futex resumed> ) = 0
> [pid 6768] futex(0x2082ccf0, FUTEX_WAIT, 2, NULL <unfinished ...>
> [pid 6765] <... futex resumed> ) = 1
> [pid 6765] futex(0x2082ccf0, FUTEX_WAKE, 1 <unfinished ...>
> [pid 6768] <... futex resumed> ) = 0
> [pid 6768] futex(0x2082ccf0, FUTEX_WAKE, 1) = 0
> [pid 6768] pwrite(8, "\0\0\0\0\0\6S\247\0\0\0P\"\270\361>\0\0\t>\1\0\0\0\377\377\377\377\377\377\377\377"..., 512, 1536) = 512
> [pid 6768] fsync(8 <unfinished ...>
> [pid 6765] <... futex resumed> ) = 1
> [pid 6765] sched_yield() = 0
> [pid 6765] futex(0x3dd0192c, FUTEX_WAIT, 1, NULL <unfinished ...>
> [pid 6768] <... fsync resumed> ) = 0
> [pid 6768] futex(0x3dd0192c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x3dd018f0, 0) = 1
> [pid 6768] futex(0x2082cd2c, FUTEX_WAIT, 3, NULL <unfinished ...>
> [pid 6765] <... futex resumed> ) = 0
> [pid 6765] futex(0x3dd018f0, FUTEX_WAKE, 1) = 0
> [pid 6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf64373000
> [pid 6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf666b3000
> [pid 6765] mmap(NULL, 147841024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf689f3000
> [pid 6765] open("/tmp/ibI2XSSt", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
> [pid 6765] unlink("/tmp/ibI2XSSt") = 0
> [pid 6765] dup(9) = 10
> [pid 6765] close(9) = 0
> [pid 6765] fcntl(10, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
> [pid 6765] fstat(10, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> [pid 6765] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf716f1000
> [pid 6765] lseek(10, 0, SEEK_CUR) = 0
> [pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 114688) = 16384
> [pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 16384) = 16384
> [pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384
> [pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 0) = 16384
> [pid 6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\...

Brad,

It should be fixed in recent Launchpad changesets.

I will prepare new release in two days if you can't try bzr tree by some
reason.

Brad Fino wrote:
> Public bug reported:
> 
> On xtrabackup 0.4 rpm for RHEL5, when running innobackupex --apply-log
> --use-memory=16000, it gets to 99% and then just hangs.  I left it there
> for a good 12 hours and when I came back it hadn't budged ...
> xtrabackup is sitting in the processlist pegged at 100% CPU.  Here's an
> strace starting from the last output of apply-log:
> 
> [pid  6765] write(2, "InnoDB: Doing recovery: scanned "..., 79InnoDB: Doing recovery: scanned up to log sequence number 80 1103001814 (99 %)
> ) = 79
> [pid  6765] pread(8, "( \337=\0\326\0002\0\6TM\300\366\24\0\2\0\2\200\4\200\10\21\6\2\235\16\0\201v$"..., 512, 520458240) = 512
> [pid  6765] futex(0x2082cd2c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x2082ccf0, 0 <unfinished ...>
> [pid  6768] <... futex resumed> )       = 0
> [pid  6768] futex(0x2082ccf0, FUTEX_WAIT, 2, NULL <unfinished ...>
> [pid  6765] <... futex resumed> )       = 1
> [pid  6765] futex(0x2082ccf0, FUTEX_WAKE, 1 <unfinished ...>
> [pid  6768] <... futex resumed> )       = 0
> [pid  6768] futex(0x2082ccf0, FUTEX_WAKE, 1) = 0
> [pid  6768] pwrite(8, "\0\0\0\0\0\6S\247\0\0\0P\"\270\361>\0\0\t>\1\0\0\0\377\377\377\377\377\377\377\377"..., 512, 1536) = 512
> [pid  6768] fsync(8 <unfinished ...>
> [pid  6765] <... futex resumed> )       = 1
> [pid  6765] sched_yield()               = 0
> [pid  6765] futex(0x3dd0192c, FUTEX_WAIT, 1, NULL <unfinished ...>
> [pid  6768] <... fsync resumed> )       = 0
> [pid  6768] futex(0x3dd0192c, FUTEX_CMP_REQUEUE, 1, 2147483647, 0x3dd018f0, 0) = 1
> [pid  6768] futex(0x2082cd2c, FUTEX_WAIT, 3, NULL <unfinished ...>
> [pid  6765] <... futex resumed> )       = 0
> [pid  6765] futex(0x3dd018f0, FUTEX_WAKE, 1) = 0
> [pid  6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf64373000
> [pid  6765] mmap(NULL, 36962304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf666b3000
> [pid  6765] mmap(NULL, 147841024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf689f3000
> [pid  6765] open("/tmp/ibI2XSSt", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
> [pid  6765] unlink("/tmp/ibI2XSSt")     = 0
> [pid  6765] dup(9)                      = 10
> [pid  6765] close(9)                    = 0
> [pid  6765] fcntl(10, F_GETFL)          = 0x8002 (flags O_RDWR|O_LARGEFILE)
> [pid  6765] fstat(10, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> [pid  6765] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf716f1000
> [pid  6765] lseek(10, 0, SEEK_CUR)      = 0
> [pid  6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 114688) = 16384
> [pid  6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 16384) = 16384
> [pid  6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384
> [pid  6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 0) = 16384
> [pid  6765] pread(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 65536) = 16384
> 
> At this point it just sits there.   I've taken a 2nd backup from a 2nd
> DB server and have the same results.  Backups were taken via tar stream
> over ssh.
> 
> ** Affects: percona-xtrabackup
>      Importance: Undecided
>          Status: New
>

-- 
Vadim Tkachenko, CTO
Percona Inc.
ICQ: 369-510-335, Skype: vadimtk153, Phone +1-888-401-3403
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.percona.com/

Attend the 2009 Percona Performance Conference
  April 22-23 - http://conferences.percona.com/

Revision history for this message

Yasufumi Kinoshita (yasufumi-kinoshita) wrote on 2009-04-09:

Thank you.

It may be caused by some loss record of xtrabackup_logfile .
And it has been already fixed at

http://bazaar.launchpad.net/~percona-dev/percona-xtrabackup/trunk/revision/31
as fix of bug 352742

Changed in percona-xtrabackup:
assignee:	nobody → yasufumi-kinoshita
status:	New → Fix Committed

Yasufumi Kinoshita (yasufumi-kinoshita) on 2009-04-10

Changed in percona-xtrabackup:
status:	Fix Committed → Fix Released

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-20:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-1045

Report a bug

This report contains Public information

Everyone can see this information.

You are

Subscribing...

Edit bug mail

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.