Successful but incomplete full backup

Bug #914504 reported by Alphazo
This bug affects 3 people
Affects    Status  Importance  Assigned to  Milestone
Duplicity  New     Undecided   Unassigned

Bug Description

I started an initial full backup of 408GB of photos and videos using the configuration and duply version found below. Source and destination are on the same local system, so encryption was disabled. The only significant change I made to the configuration was a volume size of 3.5GB rather than the default 25MB.
The initial backup went fine (but took a very long time) with no error reported. I then started a second backup of the same data set, which had not been modified at all. The incremental backup took longer than expected, and the incremental data set contained 86GB.
Listing (duply list) and comparing the files in the initial full backup and the incremental one revealed that the full backup stopped in the middle of a directory toward the end. The incremental backup just finished what should have been done by the initial full backup.
For information, I used the exact same configuration on a different data set that is not photos & videos (only 240GB), and the initial full backup went fine and was complete. The incremental backup was very fast, with no data added.

Has anyone experienced such an incomplete backup? Could it be linked to the 3.5GB volume size?
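
For anyone who wants to repeat the listing comparison described above, here is a minimal sketch. It assumes both listings have already been reduced to one path per line; any extra columns in the real `duply list` output would have to be stripped first. The function name `missing_from_full` and the sample paths are hypothetical.

```python
def missing_from_full(full_listing, later_listing):
    """Return paths that appear in the later listing but not in the full one."""
    full_paths = set(line.strip() for line in full_listing if line.strip())
    return sorted(
        line.strip()
        for line in later_listing
        if line.strip() and line.strip() not in full_paths
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for two saved listings.
    full = ["2011/a.jpg", "2011/b.jpg"]
    later = ["2011/a.jpg", "2011/b.jpg", "2011/c.jpg", "2012/d.mov"]
    for path in missing_from_full(full, later):
        print(path)
```

If the source data was not modified between the two runs, every path this prints is a file the full backup should already have contained.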

GPG_KEY='disabled'
GPG_PW='_GPG_PASSWORD_'
TARGET='file:///mnt/user/BACKUPS/snapshots/photos-backup'
SOURCE='/mnt/user/PHOTOS'
MAX_AGE=6M
VOLSIZE=3500
DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "
TEMP_DIR=/mnt/user/duplicity-cache_tmp/tmp
ARCH_DIR=/mnt/user/duplicity-cache_tmp/.cache

duply 1.5.5.4
duplicity 0.6.17
python 2.6.4
gpg 1.4.10
awk 'GNU Awk 3.1.8'
bash '4.1.7(2)

Martin R. Siegert (martin-siegert) wrote :

I experience the same effect with a somewhat similar setup:

The initial full backup of a 570GB directory tree (containing 100 subdirectories and 161906 files in total) passed happily and created 564 one-GB volumes "*.difftar.gz" (with --no-encryption and --volsize 1024) plus a two-GB .sigtar.gz and a 161k .manifest file.

Then I reran duplicity with the same settings, and the incremental backup started to push ANOTHER 445 volumes, which (according to the duplicity-inc-*.manifest) started at the file that was part of vol 119 of the previous full backup.

I have a number of further cases of such high filecount backups, and observed the following:
The intermediate "duplicity-*.sigtar.part" files grow (in my case of a 2.8TB directory-tree) up to 25GB (!) in size, but the resulting .sigtar.gz files are NEVER greater than 2147497908 Bytes (=2GB + 14260 Bytes) and the "plateau" of .sigtar.gz filesizes starts at about 2GB minus 27kB.
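
To make the numbers above easier to check, here is a small arithmetic sketch. It only restates the sizes quoted in this comment; the link to a signed 32-bit file offset is an assumption about the cause, not something confirmed in the duplicity code.

```python
TWO_GIB = 2 ** 31              # 2147483648 bytes
MAX_SIGNED_32 = TWO_GIB - 1    # largest value a signed 32-bit integer can hold

LARGEST_SIGTAR_GZ = 2147497908  # largest .sigtar.gz size quoted above


def overshoot(size_bytes):
    """How many bytes a file size lies above the 2 GiB mark."""
    return size_bytes - TWO_GIB


if __name__ == "__main__":
    print(overshoot(LARGEST_SIGTAR_GZ))  # 14260
    print(TWO_GIB - 27 * 1024)           # 2147456000, the "2GB minus 27kB" plateau
```

All observed sizes cluster within a few kB of 2^31 bytes, which is what one would expect if some offset or length were held in a signed 32-bit value.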

I gunzip'ed my 2147485382-byte .sigtar.gz from the above 570GB backup and ran "strings -n 10" over the "tail -5000" of the resulting 2170225751-byte .sigtar file. The last entries I found in the (TRUNCATED?) sigtar file were my original files that were part of vol118.difftar.gz.
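
A related check that others could run on their own .sigtar.gz is sketched below. It streams the file through Python's `gzip` module and reports whether the stream reaches its end-of-stream marker. This is only an illustration: it detects a gzip stream that was cut short, but it would not flag a file that was compressed cleanly from already truncated input. The function name `gzip_ends_cleanly` is hypothetical, and the sketch assumes Python 3.

```python
import gzip


def gzip_ends_cleanly(path, chunk_size=1 << 20):
    """Return (ok, uncompressed_bytes).

    ok is False when the gzip stream stops before its end-of-stream marker;
    uncompressed_bytes counts what was read successfully up to that point.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as stream:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    return True, total
                total += len(chunk)
    except EOFError:
        # The gzip module raises EOFError when the compressed data ends early.
        return False, total
```

Reading in chunks keeps memory use flat, which matters for signature files in the multi-GB range discussed here.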

My assumption, for the kind and caring supporters of this wonderful project, is:
The compression of the (large) sigtar.part file into a sigtar.gz file seems to stop at about 2GB file size. As a result, any file of the initial full backup whose signature was listed after the 2GB position is recognized as a "new/changed" file in an incremental backup. Additionally, the files that were backed up in the full backup but have their signatures stored beyond the 2GB limit cannot be rsync-updated for small changes and maybe (untested) cannot even be restored at all. I.e. these files are lost and only waste space in the full backup set.
Maybe this issue is related to https://bugs.launchpad.net/duplicity/+bug/385495/comments/8 (a request that large signatures also be split, because of possible file-size limitations on the backup storage).

BTW: I successfully gzip'ed a 25.57GB .sigtar.part manually into a 25.04GB .sigtar.part.gz, so the cause of the error is not the gzip executable (v1.3.5) or my filesystem (ZFS).

Technical details:
Solaris 10, duplicity 0.6.15, python 2.4

Alphazo (alphazo) wrote :

Hi Martin,

I think you hit the spot. My initial sigtar.gz is 2GB large and the incremental one is 679MB.

Is there any plan to fix this limitation?

Martin R. Siegert (martin-siegert) wrote :

Hi Alphazo,

Up to now we've used 32-bit Python on Solaris 10.
My colleague just compiled 64-bit Python for Solaris, which seems (*) to do better.

You haven't mentioned your OS. If it is 32-bit, you might have problems. (32-bit applications can use "LARGEFILES" at compile time, but I fear the way duplicity uses gzip's output isn't LARGEFILES-compliant.) If you use a 64-bit OS, then making sure you also use 64-bit Python may be your solution. I would appreciate hearing from you on this.

*: Upload of these 570GB to our archival-server takes about 4 days, so definite results won't be available before next week.

Regards,
Martin

Alphazo (alphazo) wrote :

duplicity is running on an unRAID machine, which is based on the 32-bit version of Slackware. Unfortunately, there is no 64-bit version of unRAID available yet.
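
The 32-bit versus 64-bit question raised in this thread can be checked for the interpreter itself, independently of the OS. The sketch below reports the pointer width of the running Python build. Note that a 32-bit build is not proof of missing large-file support; it only tells you which case you are in. The function name `interpreter_bits` is hypothetical.

```python
import struct
import sys


def interpreter_bits():
    """Width in bits of a native pointer in the running interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("python %s, %d-bit build" % (sys.version.split()[0], interpreter_bits()))
```

Run it with the same interpreter that duplicity uses, since a system can have both 32-bit and 64-bit Python builds installed.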
