feature request: parallel rsync for sst

Bug #1167331 reported by Mrten on 2013-04-10
This bug affects 1 person
Affects: MySQL patches by Codership | Assigned to: Alex Yurchenko
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Status: Fix Released | Assigned to: Raghavendra D Prabhu

Bug Description

If an SST is needed, the manual states that rsync is the fastest way to get all the data.

However, I noticed that there is only one rsync running for the whole SST, which easily becomes CPU-bound on my hardware (IBM x3550 with hardware RAID). When this happens, sysstat shows the disks as only 10% busy.

Could the rsync be parallelized in some way that is useful for speeding up the SST?

Perhaps like this for three rsyncs at a time, for just the databases:

find /var/lib/mysql -type d -print0 | xargs -P 3 -0 rsync [rsync-options]

and a separate rsync for the files directly in /var/lib/mysql, with --no-recursive

Should speed things up a bit...
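
For illustration, the proposal above could be sketched roughly as follows. This is not the attached patch: it assumes GNU find and xargs, the function name `parallel_rsync` and the `RSYNC` override variable are hypothetical, and `-a` merely stands in for whatever rsync options the SST script actually needs.

```shell
# Sketch only: parallel_rsync and the RSYNC override are hypothetical names,
# and "-a" is a placeholder for whatever rsync options the SST really needs.
parallel_rsync() {
    src=$1          # data directory, e.g. /var/lib/mysql
    dest=$2         # destination path or rsync:// URL on the joiner
    jobs=${3:-3}    # how many rsync processes to run at once
    rsync_cmd=${RSYNC:-rsync}

    # One rsync per top-level database directory, up to $jobs in parallel.
    # -mindepth/-maxdepth stop find from listing $src itself or nested dirs;
    # -I{} makes xargs start a separate rsync for each directory.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} "$rsync_cmd" -a {} "$dest/" || return 1

    # A single non-recursive rsync for the files directly under $src.
    # --dirs sends the contents of "$src/"; --exclude='*/' skips the
    # subdirectories already handled above.
    "$rsync_cmd" -a --no-recursive --dirs --exclude='*/' "$src/" "$dest/"
}
```

It would be called as something like `parallel_rsync /var/lib/mysql "$dest" 3`, where `$dest` is whatever address the joiner's rsync daemon listens on. Unlike the one-liner above, each directory gets its own rsync process, because `-I{}` makes xargs run one command per input item.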

Mrten (bugzilla-ii) wrote :

A first stab at a patch. This makes my SST at least four times faster and uses my disks at 80% instead of 10%, but I'm not quite sure it is fully OK.

Alex Yurchenko (ayurchen) wrote :

That's some powerful find kung-fu! So do I get it right that you don't need any changes on the joiner side?

Mrten (bugzilla-ii) wrote :

The joiner side is an rsync daemon which automatically accepts >1 connections, so no, nothing to be done on the joiner side.

The patch is not working correctly BTW, needs a slash after the second rsync. I'll post an update later.

Mrten (bugzilla-ii) wrote :

This one's better; gets me a 202G db copied, from scratch, in 20-30 minutes. Fixes a dud error I get all the time with 'permission denied on /dev/stderr' too.

I haven't tested the error-handling of the 'parallel rsync' portion.

Good to see parallel rsync. However, the other SST method, Xtrabackup, already supports parallel copying and parallel compression, among other things. Currently, though, I see that its options are hardcoded in the SST script. I would add (in a separate bug) an option so that it sources a config file if present for the

Reported #5 as lp:1168361

Alex Yurchenko (ayurchen) wrote :

Mrten, can you give Codership a non-exclusive copyright to this patch, so that we can incorporate and maintain it in our code?

Mrten (bugzilla-ii) wrote :

#5: Please modify as you see fit. I used rsync as I didn't get innobackupex to work (yet). Does innobackupex use multiple cores?

#7: I can release my patch as public domain, is that good enough for you? If so, consider it done.

Alex Yurchenko (ayurchen) wrote :

Thanks! We'll try to incorporate it by the next release.

Changed in percona-xtradb-cluster:
milestone: none → 5.5.30-24.8
status: New → Triaged
Changed in codership-mysql:
assignee: nobody → Alex Yurchenko (ayurchen)
importance: Undecided → Medium
milestone: none → 5.5.30-24.8
status: New → Confirmed

@Mrten, yes, xtrabackup, which is called by innobackupex, uses multiple cores if --parallel is used.

Changed in percona-xtradb-cluster:
assignee: nobody → Raghavendra D Prabhu (raghavendra-prabhu)
Changed in percona-xtradb-cluster:
status: Triaged → Fix Committed
Changed in codership-mysql:
milestone: 5.5.30-24.8 → 5.5.31-23.7.4
Changed in percona-xtradb-cluster:
status: Fix Committed → Fix Released
Changed in codership-mysql:
status: Confirmed → Fix Released

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1331
