tar4ibd reads files inefficiently

Bug #899931 reported by Alexey Kopytov
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
Percona XtraBackup (moved to https://jira.percona.com/projects/PXB) | Invalid | Undecided | Unassigned |
1.6 | Won't Fix | Undecided | Unassigned |
2.0 | Invalid | Undecided | Unassigned |

Bug Description

tar4ibd reads files in a very inefficient way. To support compressed tables, it starts with a 1KB block size and calls read() with that size. If page verification fails, it increases the block size, calls lseek() to restore the original position, and retries the read().

It can be optimized in two ways (basically, by doing what the xtrabackup binary does):

- read in large chunks, and then process them with different page sizes if necessary;
- don't consider small block sizes at all and default to 16KB when zip_size == 0, i.e. when the data file is not compressed.
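
The two points above can be sketched roughly as follows. This is not the tar4ibd or xtrabackup code: CHUNK_SIZE, page_size_for(), process_file() and the page_cb callback are hypothetical names made up for illustration, and the 1MB chunk size is an assumption. The idea is only that one read() serves many pages, and that the page size is fixed up front from zip_size instead of being probed with repeated read()/lseek() calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024) /* assumed read size, a multiple of every page size */
#define UNIV_PAGE  (16 * 1024)   /* uncompressed InnoDB page size */

/* Hypothetical helper: uncompressed files (zip_size == 0) always use 16KB. */
static size_t page_size_for(size_t zip_size)
{
    return zip_size == 0 ? UNIV_PAGE : zip_size;
}

/* Hypothetical per-page hook (page verification, writing to the archive, ...). */
typedef int (*page_cb)(const unsigned char *page, size_t len, void *arg);

/* Example hook: just adds up the bytes it was given. */
static int count_page(const unsigned char *page, size_t len, void *arg)
{
    (void)page;
    *(size_t *)arg += len;
    return 0;
}

/*
 * Read fd in CHUNK_SIZE pieces and hand out whole pages from memory.
 * Returns the number of pages processed, or -1 on error.
 * A trailing partial page at EOF is ignored in this sketch.
 */
static long process_file(int fd, size_t zip_size, page_cb cb, void *arg)
{
    size_t page_size = page_size_for(zip_size);
    size_t have = 0; /* bytes currently buffered */
    long pages = 0;
    unsigned char *buf;

    assert(page_size != 0 && CHUNK_SIZE % page_size == 0);
    buf = malloc(CHUNK_SIZE);
    if (buf == NULL)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf + have, CHUNK_SIZE - have);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            free(buf);
            return -1;
        }
        if (n == 0)
            break; /* EOF */
        have += (size_t)n;

        /* One syscall above, many pages processed here. */
        size_t off = 0;
        while (have - off >= page_size) {
            if (cb(buf + off, page_size, arg) != 0) {
                free(buf);
                return -1;
            }
            off += page_size;
            pages++;
        }
        /* Keep any partial page for the next read(). */
        memmove(buf, buf + off, have - off);
        have -= off;
    }
    free(buf);
    return pages;
}
```

With a buffer like this, retrying verification with a different page size would only mean re-walking memory that has already been read, with no extra read() or lseek().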

I'm not sure it makes sense to fix this in 1.6, as tar4ibd is hopefully going away in 1.7, but I'm reporting it just for the record.

Stewart Smith (stewart) wrote : Re: [Bug 899931] [NEW] tar4ibd reads files inefficiently

On Sun, 04 Dec 2011 14:53:09 -0000, Alexey Kopytov <email address hidden> wrote:
> Public bug reported:
>
> tar4ibd is using very inefficient way to read files. To support
> compressed tables, it starts with a 1KB block size and calls read() with
> that size. If page verification fails, it increases the block size,
> calls lseek() to restore the original position and retries read().
>
> It can be optimized in two ways (basically, by doing what xtrabackup
> binary does):
>
> - read in large chunks, and then process them with different page sizes if necessary;
> - don't consider small block sizes and default to 16KB when zip_size == 0, when the data file is not compressed.
>
> I'm not sure it makes sense to fix this in 1.6, as tar4ibd is hopefully
> going away in 1.7. But reporting it just for the record.

maybe posix_fadvise(WILLNEED) ?

--
Stewart Smith
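
For reference, the hint suggested above would look roughly like this in C. The actual POSIX constant is POSIX_FADV_WILLNEED; advise_willneed() is a hypothetical wrapper name, not something from tar4ibd. Note that the call only affects read-ahead and caching and does not change the number of read() and lseek() calls made afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>

/*
 * Tell the kernel that the whole file behind fd will be read soon.
 * offset 0 with len 0 means "from the start to the end of the file".
 * Returns 0 on success or an errno value (posix_fadvise does not set errno).
 */
static int advise_willneed(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
}
```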

Alexey Kopytov (akopytov) wrote :

On 05.12.11 3:57, Stewart Smith wrote:
> maybe posix_fadvise(WILLNEED) ?
>

I think the main problem here is not even caching, but rather 17
syscalls per page in the worst case, with the worst case being an
uncompressed tablespace.

Alexey Kopytov (akopytov) wrote :

It also breaks O_DIRECT reads on some filesystems. For example, it is known to fail on XFS created with -s size=4096.
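
A likely explanation, stated here as an assumption rather than something confirmed in this report: O_DIRECT I/O on Linux generally requires the buffer address, file offset and length to be multiples of the device or filesystem sector size, so a 1KB read cannot satisfy a 4096-byte sector size. The check below is a minimal sketch of that rule; direct_io_ok() is a hypothetical helper, not a kernel or XtraBackup API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Returns true if a read of len bytes at offset into buf satisfies the
 * usual O_DIRECT constraint for the given alignment (e.g. 512 or 4096).
 */
static bool direct_io_ok(const void *buf, off_t offset, size_t len, size_t align)
{
    return align != 0
        && (uintptr_t)buf % align == 0
        && offset % (off_t)align == 0
        && len % align == 0;
}
```

Under this rule a 1KB read passes with 512-byte sectors but fails with 4096-byte sectors, while a 16KB read at a 16KB-aligned offset passes in both cases.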

Stewart Smith (stewart) wrote :

Closing as Won't Fix for the old 1.6 release. If a fix is needed for 1.6, please reopen here or contact us (Percona) to arrange it.

If this problem is seen in 2.0 or later, we can open a new bug for performance issues.

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXB-1141
