Performance considerations (use of duplicity's -v9 and progress estimation)

Bug #1562411 reported by Fjodor
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Déjà Dup | Triaged | Wishlist | Unassigned |

Bug Description

Hi there,

Via a few work-arounds, I have my work laptop back up most of its contents (as root) to an NFS share. Some time ago I noticed that it had been quite a while since an incremental backup had completed within a normal work day.

Naturally, I looked at the arguments that deja-dup passes to duplicity. Since I have two rather large VMs, I went on to fix duplicity's handling of those, and saw a rather large speed-up when running duplicity directly with the arguments I had seen deja-dup provide (except for one, see below).

Turning back to letting deja-dup handle the automation, I was dismayed to find that, while the duplicity speed-up did carry through, backups were still substantially slower than my direct duplicity runs. I may have missed or misinterpreted a few things, but here is what I seem to have found.

1. Please correct me if I'm wrong, but it seems that deja-dup does a sort of dry run first, presumably to be able to show an estimated completion percentage.

2. It would seem that deja-dup runs duplicity with -v9.

Ad 1: Do look up the "Microsoft Minute", the satirical term for anything between 1 second and infinitely many hours. I understand the need to present an estimated percentage of completion, but to be quite honest, that could be accomplished much faster by simply counting the sub-directories down to a few levels, and then using the progress through them as the basis. It's not as if the percentage indicator is very accurate as it stands now anyway. Also, when backing up to local or gigabit-connected storage, skipping the dry run would cut execution time by almost 50%...
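A minimal sketch of the idea in Ad 1, in Python. It is not how deja-dup or duplicity estimates progress; the names shallow_dirs and ProgressEstimator and the default depth of 2 are assumptions made for illustration. It counts directories down to a fixed depth once, then reports the share of those directories the backup has reached so far.

```python
import os


def shallow_dirs(root, max_depth=2):
    """Return directories under root down to max_depth levels (root is depth 0)."""
    root = os.path.abspath(root)
    found = []
    for path, dirnames, _ in os.walk(root):
        rel = os.path.relpath(path, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        found.append(path)
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
    return found


class ProgressEstimator:
    """Coarse progress: fraction of shallow directories the backup has reached."""

    def __init__(self, root, max_depth=2):
        self.units = set(shallow_dirs(root, max_depth))
        self.seen = set()

    def update(self, path):
        """Record that `path` is being processed and return the new fraction."""
        path = os.path.abspath(path)
        # Walk up until we reach one of the counted directories.
        while path not in self.units:
            parent = os.path.dirname(path)
            if parent == path:  # reached the filesystem root: path is outside root
                return self.fraction()
            path = parent
        self.seen.add(path)
        return self.fraction()

    def fraction(self):
        # units always contains the root itself, so it is never empty
        return len(self.seen) / len(self.units)
```

Each path reported by the backup tool would be fed to update(). The estimate is crude, since a directory counts as reached as soon as its first file is seen and all directories weigh the same, but the up-front cost is a walk over a few levels rather than a full dry run.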

Ad 2: I did my direct duplicity test runs at -v6 instead of -v9, in order to be able to follow progress visually without being overwhelmed while still getting some feedback. Now, I'm not suggesting that -v6 is perfect for all intents and purposes, but I am rather sure that -v9 puts out so much data that it actually slows duplicity down to a non-trivial degree. Given my suggestion under Ad 1, I'm pretty sure that combining it with -v6 (or lower; I haven't tried) would both speed up execution substantially and provide you with all the data you need to give a reasonable estimate.
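One way to put numbers on the claim in Ad 2 is a small timing harness, sketched here in Python. The helper names duplicity_argv and time_command are hypothetical, and the command line is reduced to the verbosity option plus source and target; the other arguments deja-dup passes are not reproduced. Output is read through a pipe, roughly as a front end parsing duplicity's log would, and its size is reported next to the wall-clock time.

```python
import subprocess
import sys
import time


def duplicity_argv(verbosity, source, target):
    """Build a bare duplicity command line (assumed form: only -v, source, target)."""
    return ["duplicity", f"-v{verbosity}", source, target]


def time_command(argv):
    """Run argv, capture its output through a pipe, return (seconds, output bytes)."""
    start = time.perf_counter()
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        check=False,
    )
    elapsed = time.perf_counter() - start
    return elapsed, len(proc.stdout)
```

A comparison would call time_command(duplicity_argv(9, source, target)) and the same with 6, using real paths for source and target. Since every incremental run changes the backup chain, the two runs need equivalent starting states for the timings to be comparable.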

I shall be happy to answer any and all questions, and to try out experimental builds that incorporate changes like the ones proposed, so that you guys can decide whether any changes you make are good or bad.

Best regards,

F

Fjodor (sune-molgaard) wrote :

Since https://bugs.launchpad.net/deja-dup/+bug/1556089 hasn't even been discussed yet, I have patched duplicity as per comment #4 in https://bugs.launchpad.net/duplicity/+bug/582962 and am now (probably also due to cleaning out 60-70 GB on my machine) in a situation where deja-dup backups usually complete within a working day.

It still takes several hours though, as opposed to under one hour when running duplicity directly.

Needless to say, I find that situation rather less than optimal - any chance for considerations regarding this one?

Vej (vej)
Changed in deja-dup:
status: New → Triaged
importance: Undecided → Wishlist
summary: - Performance considerations
+ Performance considerations (use of duplicitys -v9 and progress
+ estimation)