innobackupex incremental backups duplicate files unnecessarily

Bug #1077984 reported by Jason Gill
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Percona XtraBackup (moved to https://jira.percona.com/projects/PXB) | Triaged | Wishlist | Unassigned |
2.0 | Won't Fix | Undecided | Unassigned |
2.1 | Triaged | Wishlist | Unassigned |
2.2 | Triaged | Wishlist | Unassigned |
2.3 | Triaged | Wishlist | Unassigned |

Bug Description

Our MySQL databases have tens of thousands of tables, so each of the directories in /var/lib/mysql/ contains a huge number of .frm files. Each database can contain a gigabyte or more of rarely changing .frm files. We use the --rsync option for innobackupex to speed up copying these files, but we have found that disk space is wasted: each incremental backup contains multiple gigabytes of .frm files even though they have not been modified since the last incremental backup.

As a simple solution, we have created the following patch which modifies the behavior of the --rsync option. When the --rsync option AND the --incremental-basedir option are specified, the incremental basedir is passed to rsync as --link-dest.

This tells rsync that any file that is unchanged compared to its copy in the incremental-basedir should be hardlinked to that existing copy rather than duplicated. Changed files are still handled properly: they are rsync'ed into the new location. Unchanged files (like our thousands of .frm files) are hardlinked, which saves disk space.
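To illustrate the idea behind --link-dest, here is a minimal Python sketch of the same behavior. It is not the attached patch and not rsync's implementation. The function name link_dest_copy and the directory arguments are hypothetical, and the change check (filecmp) only approximates rsync's own comparison.

```python
import filecmp
import os
import shutil


def link_dest_copy(source, previous, target):
    """Copy `source` into `target`, hardlinking files that are unchanged
    relative to `previous` (an earlier backup of the same tree).

    Hypothetical helper that mimics the effect of rsync --link-dest.
    Returns a (linked, copied) tuple of file counts.
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            old = os.path.normpath(os.path.join(previous, rel, name))
            new = os.path.normpath(os.path.join(target, rel, name))
            # shallow=True treats files as equal when size and mtime match,
            # and otherwise falls back to comparing sizes and contents.
            if os.path.isfile(old) and filecmp.cmp(src, old, shallow=True):
                os.link(old, new)        # unchanged: share the existing copy
                linked += 1
            else:
                shutil.copy2(src, new)   # new or changed: store a real copy
                copied += 1
    return linked, copied
```

Called as link_dest_copy("/var/lib/mysql", "/backups/full", "/backups/inc1") (paths are examples only), this would store only new or changed files in /backups/inc1 and hardlink everything else to /backups/full. Hardlinks require both backup directories to be on the same filesystem.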

With this patch, our incremental backups run faster, and the disk space used by each incremental backup has dropped from many gigabytes to just a few hundred megabytes.

Restore options are not impacted by this change: the hardlinked files look and behave identically to "real" files; they simply save disk space.
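One way to see the space savings is to compare the apparent size of a set of backup directories with the size actually occupied once hardlinked files are counted only once. The following is a rough sketch; the function name backup_sizes is hypothetical, and it sums file sizes rather than allocated disk blocks.

```python
import os


def backup_sizes(backup_dirs):
    """Return (apparent, actual) total file sizes in bytes.

    `apparent` counts every directory entry; `actual` counts each
    underlying file (device, inode pair) only once, so hardlinked
    copies shared between backups are not double counted.
    """
    seen = set()
    apparent = actual = 0
    for top in backup_dirs:
        for root, _dirs, files in os.walk(top):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                apparent += st.st_size
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    actual += st.st_size
    return apparent, actual
```

For a full backup plus several hardlinked incrementals, the second number should stay close to the size of the full backup plus the changed files, while the first grows with every incremental.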

Jason Gill (jasongill) wrote :
tags: added: contribution
Alexey Kopytov (akopytov) wrote :

Jason,

Sounds like a nice feature. Thanks for the contribution! The exact release which we can merge it into depends on available bandwidth, but we can try to do this for 2.0.5.

Stewart Smith (stewart)
tags: added: innobackupex
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-87
