assertion failure while trying to read InnoDB partition

Bug #870119 reported by z
This bug affects 4 people
Affects                                                                Status        Importance  Assigned to     Milestone
Percona XtraBackup (moved to https://jira.percona.com/projects/PXB)    Fix Released  High        Alexey Kopytov
1.6                                                                    Won't Fix     High        Alexey Kopytov
2.0                                                                    Fix Released  High        Alexey Kopytov
2.1                                                                    Fix Released  High        Alexey Kopytov

Bug Description

I am consistently getting the following error when trying to back up a partitioned InnoDB table. InnoDB is configured with file_per_table. The particular partition on which the read fails varies. There does not seem to be file corruption, since I can stop and restart the database without problems and InnoDB itself reports no problems with the table. It is a large table with ~500 partitions.

InnoDB: Error: tried to read 1048576 bytes at offset 1 60817408.
InnoDB: Was only able to read 0.
InnoDB: Fatal error: cannot read from file. OS error number 17.
111007 0:09:02 InnoDB: Assertion failure in thread 140202726008576 in file os/os0file.c line 2460
InnoDB: We intentionally generate a memory trap.

System configuration is as follows:
Ubuntu Linux 10.04 LTS kernel 2.6.32-25-server x86_64
MySQL Ver 14.14 Distrib 5.1.41, for debian-linux-gnu (x86_64) using readline 6.1
InnoDB Plugin version 1.0.5
xtrabackup version 1.6.3 for Percona Server 5.1.55 unknown-linux-gnu (x86_64) (revision id: undefined)

xtrabackup is invoked with 16 parallel threads.

Tags: innodb


Changed in percona-xtrabackup:
assignee: nobody → Valentine Gostev (longbow)
z (z0lo) wrote :

I think this bug may be related to threading in xtrabackup. I changed my script to only use 1 thread for xtrabackup and it has completed successfully for several days now, with no assertion failures.

z (z0lo) wrote :

The backups have now been running for about 5 days with one thread without failure.

Stewart Smith (stewart)
Changed in percona-xtrabackup:
importance: Undecided → High
z (z0lo) wrote :

one month of single threaded backups, no failures

Patrick Crews (patrick-crews) wrote :

Thank you for the updates on this. We'll try to get a repeatable test case asap.

Valentine Gostev (longbow) wrote :

Hi Matt,

can you please provide show create table output for this partitioned table?
Also mysqld config might help.

Thank you

z (z0lo) wrote :

Here is the create table statement

z (z0lo) wrote :

here is our my.cnf

Alexey Kopytov (akopytov) wrote :

We've been able to reproduce the problem.

The reason is that the InnoDB file I/O subsystem may reuse file descriptors by closing old ones when the number of open files hits innodb_open_files. This works for InnoDB because, if it needs to access a table that has been closed, it simply reopens it.

However, that doesn't work for XtraBackup, since it only keeps a file descriptor while copying a file. So when the --parallel option is used, there's a chance that another thread wants to open a file and hits innodb_open_files. fil_try_to_close_file_in_LRU() may then close a file descriptor that is currently in use by another thread, and that descriptor is shortly reused when opening another file, which results in obscure failures like this one.
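
The descriptor reuse described above can be shown in isolation. The following is only a minimal sketch, not XtraBackup or InnoDB code: the file names are made up, and the two "threads" are simulated sequentially in one thread so the outcome is deterministic. It relies on the POSIX rule that open() returns the lowest-numbered free descriptor.

```python
import os
import tempfile


def demonstrate_fd_reuse():
    """Return (descriptor_number_reused, bytes_read_via_stale_descriptor)."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical partition files, for illustration only.
        path_a = os.path.join(workdir, "part_a.ibd")
        path_b = os.path.join(workdir, "part_b.ibd")
        with open(path_a, "wb") as f:
            f.write(b"AAAA")
        with open(path_b, "wb") as f:
            f.write(b"BBBB")

        # The "copy thread" opens partition A and remembers the descriptor.
        fd_a = os.open(path_a, os.O_RDONLY)

        # "Another thread" hits the open-files limit, and an LRU-style
        # eviction closes A's descriptor behind the copy thread's back.
        os.close(fd_a)

        # That thread then opens partition B. POSIX hands out the lowest
        # free descriptor number, which is the one that was just closed.
        fd_b = os.open(path_b, os.O_RDONLY)

        # The copy thread still uses its stale number and silently reads
        # from partition B instead of partition A.
        data = os.read(fd_a, 4)

        os.close(fd_b)
        return fd_a == fd_b, data


if __name__ == "__main__":
    reused, data = demonstrate_fd_reuse()
    print("descriptor number reused:", reused)
    print("data read through the stale descriptor:", data)
```

In a real multi-threaded run the timing decides what the stale descriptor points to at the moment of the read, which would be consistent with the failing partition varying from run to run.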

Another important part of this problem is the fact that XtraBackup leaks file descriptors, which is bug #713267. But even after that bug is fixed, it will still be possible to hit this bug by setting a very low value of innodb_open_files for XtraBackup and then using a very high --parallel value. So what needs to be done to fix this, in addition to fixing bug #713267, is to fail when XtraBackup hits the innodb_open_files limit, rather than follow the default InnoDB behavior and close some random files.
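
The "fail instead of closing random files" idea can be sketched as follows. This is a hypothetical illustration of the policy only, not the actual patch: the class and exception names are invented, and the real logic lives in InnoDB's C file subsystem rather than in a wrapper like this.

```python
import os
import threading


class OpenFilesLimitReached(RuntimeError):
    """Raised instead of evicting a descriptor that may still be in use."""


class FailFastFileTable:
    """Hypothetical open-file tracker that refuses to exceed its limit."""

    def __init__(self, max_open_files):
        self._max = max_open_files
        self._open = set()
        self._lock = threading.Lock()

    def open(self, path):
        with self._lock:
            if len(self._open) >= self._max:
                # Default InnoDB behavior would close an LRU file here.
                # For a backup tool that is unsafe, so report the limit.
                raise OpenFilesLimitReached(
                    "open files limit (%d) reached while opening %s"
                    % (self._max, path)
                )
            fd = os.open(path, os.O_RDONLY)
            self._open.add(fd)
            return fd

    def close(self, fd):
        with self._lock:
            self._open.discard(fd)
            os.close(fd)


if __name__ == "__main__":
    table = FailFastFileTable(max_open_files=1)
    fd = table.open(os.devnull)
    try:
        table.open(os.devnull)
    except OpenFilesLimitReached as exc:
        print("refused:", exc)
    table.close(fd)
```

With this policy a limit that is too low for the chosen --parallel value shows up as an explicit error at open time rather than as a corrupted read later.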

Changed in percona-xtrabackup:
status: New → Triaged
Vadim Tkachenko (vadim-tk) wrote :

Alexey,

Should we just mention in the documentation that if you run with --parallel
you also need to specify a big value (how big?) for innodb_open_files?
Would that be enough of a workaround?

z (z0lo) wrote :

Great to see you were able to reproduce and track down this bug!!

-Matt

Alexey Kopytov (akopytov) wrote : Re: [Bug 870119] Re: assertion failure while trying to read InnoDB partition

On Thu, 26 Apr 2012 15:31:13 -0000, Vadim Tkachenko wrote:
> Should we just mention in documentation that if you run with --parallel
> you need also to specify big values (how big?) for innodb_open_files?
> Will be that enough workaround?
>

I think once bug #713267 is fixed, we should make XtraBackup
automatically set innodb_open_files to be able to run with the specified
--parallel value. That will be a fix for this bug.

Stewart Smith (stewart) wrote :

As 1.6 is the old stable release, I don't think we need to fix there unless somebody explicitly needs it. If you do need a fix for this bug, contact us (Percona) and we can sort something out (or post here).

z (z0lo) wrote :

okay

Matt

On Jun 15, 2012, at 4:03 AM, Stewart Smith wrote:

> As 1.6 is the old stable release, I don't think we need to fix there
> unless somebody explicitly needs it. If you do need a fix for this bug,
> contact us (Percona) and we can sort something out (or post here).
>
> ** Changed in: percona-xtrabackup/1.6
> Milestone: 1.6.7 => None
>
> ** Changed in: percona-xtrabackup/1.6
> Status: Triaged => Won't Fix
>
> Title:
> assertion failure while trying to read InnoDB partition
>
> Status in Percona XtraBackup:
> Triaged
> Status in Percona XtraBackup 1.6 series:
> Won't Fix
> Status in Percona XtraBackup 2.0 series:
> Triaged

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-312
