Slow backups of large files, blocksize too small?

Bug #897423 reported by sander eikelenboom
This bug affects 11 people
Affects    Status        Importance  Milestone
duplicity  Fix Released  Medium      0.6.22

Bug Description

I'm using duplicity to back up large files, but the backups seem to take forever.

It seems this could be related to the block size used. It also seems that the block size calculated doesn't match the comment in the code.

In file

def get_block_size(file_len):
    """
    Return a reasonable block size to use on files of length file_len

    If the block size is too big, deltas will be bigger than is
    necessary. If the block size is too small, making deltas and
    patching can take a really long time.
    """
    if file_len < 1024000:
        return 512 # set minimum of 512 bytes
    else:
        # Split file into about 2000 pieces, rounding to 512
        file_blocksize = long((file_len / (2000 * 512)) * 512)
        return min(file_blocksize, 2048L)

For files larger than 2000 * 2048 bytes, it doesn't split the file into about 2000 pieces; instead it caps the block size at 2048 bytes.
Shouldn't this be:
return max(file_blocksize, 2048L)
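
To make the effect of the min() concrete, here is a small sketch that transcribes the quoted logic to Python 3 and counts the resulting blocks. The names block_size_as_quoted and block_count are made up for this illustration and are not part of duplicity; the sample file sizes are arbitrary.

```python
def block_size_as_quoted(file_len):
    """Python 3 transcription of the quoted get_block_size() logic."""
    if file_len < 1024000:
        return 512  # minimum of 512 bytes
    # Python 2 "/" on integers is floor division, so "//" is used here.
    file_blocksize = (file_len // (2000 * 512)) * 512
    return min(file_blocksize, 2048)


def block_count(file_len):
    """Number of blocks for a file of file_len bytes, rounded up."""
    size = block_size_as_quoted(file_len)
    return -(-file_len // size)  # ceiling division


for label, file_len in [("1 MB", 10**6), ("4 GiB", 4 * 1024**3)]:
    print(label, block_size_as_quoted(file_len), block_count(file_len))
```

With this logic a 4 GiB file gets 2048-byte blocks, which is 2,097,152 blocks rather than about 2000.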


sander eikelenboom (b-linux) wrote :

Hmm, on second thought max() isn't very smart; there should be an upper limit.
So two alternatives:
- Perhaps bump the 2048 to, say, 4GB / 2000 bytes? (That would handle DVD ISOs.)
- Make max_blocksize configurable, depending on whether smaller deltas or faster delta generation matters more
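
Both alternatives amount to replacing the hard-coded 2048 with a larger or caller-supplied cap. A minimal Python 3 sketch of that idea follows; the max_blocksize parameter and the four_gb_cap name are assumptions for illustration, "4GB" is read as 4 GiB, and this is not the code of any actual patch.

```python
def get_block_size(file_len, max_blocksize=2048):
    """Sketch: the quoted logic with the cap passed in as a parameter."""
    if file_len < 1024000:
        return 512  # minimum of 512 bytes
    # Aim for about 2000 pieces, rounded down to a multiple of 512.
    file_blocksize = (file_len // (2000 * 512)) * 512
    return min(file_blocksize, max_blocksize)


# First alternative: a cap of "4GB / 2000 bytes" (assuming 4 GiB here).
four_gb_cap = 4 * 1024**3 // 2000

print(get_block_size(4 * 1024**3))               # default cap of 2048
print(get_block_size(4 * 1024**3, four_gb_cap))  # raised cap
```

With the raised cap, a 4 GiB file is split into roughly 2000 blocks, which is what the comment in the original function describes.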

Yuri D'Elia (wavexx) wrote :

I also ran into this problem. Very large files become a bottleneck, to the point that performing a backup becomes basically impossible if there's any file larger than a couple of GB.

As suggested, I made the following patch, introducing --max-blocksize for both duplicity and rdiffdir (defaulting to 2048 like before).

I can now saturate my server's I/O by using --max-blocksize 16777216, and I don't mind the increased delta size.
It's only lacking proper documentation.
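
For a rough sense of what a given cap means, the sketch below computes the smallest file size at which the block-size formula quoted in the bug description reaches the cap. It assumes the patched code keeps that formula and only swaps the 2048 for the configured value; cap_threshold is a made-up helper name.

```python
def cap_threshold(max_blocksize):
    """Smallest file length (bytes) at which the quoted formula hits the cap."""
    # The formula grows in steps of 512 bytes per 2000 * 512 bytes of file.
    steps = -(-max_blocksize // 512)  # ceiling division
    return steps * 2000 * 512


print(cap_threshold(2048))      # 4096000, i.e. 2000 * 2048
print(cap_threshold(16777216))  # 33554432000, about 31 GiB
```

Under that assumption, files below roughly 31 GiB are still split into about 2000 blocks with --max-blocksize 16777216, and only larger files are held at 16 MiB blocks.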

Changed in duplicity:
importance: Undecided → Medium
milestone: none → 0.6.22
status: New → Fix Committed
Changed in duplicity:
status: Fix Committed → Fix Released
Remy van Elst (raymii) wrote :

I've emailed the mailing list about the documentation but did not receive a response. Can someone please explain a bit more clearly what this does, or update the manpage?
