Add support for long-term storage with restore delays

Bug #1197958 reported by Julien Fastré
This bug affects 2 people
Affects    Status  Importance  Assigned to  Milestone
Duplicity  New     Undecided   Unassigned

Bug Description

Hi,

Many service providers offer long-term storage. This kind of storage is very cheap, but restoring a file involves a delay that may last a couple of hours. I am thinking of Amazon Glacier, but also of OVH "Personal Cloud Archive" (http://www.ovh.com/fr/cloud/archives/).

My clients are very interested in using this kind of backup. I would like to use a script to make incremental, encrypted backups, and I am thinking about adapting Duplicity for this.

I searched the bug reports but, apart from bug 1039511, I did not find any discussion about implementing a backend adapted to this kind of backup.

I have never contributed to Duplicity, but I had a glance at the code and it seems quite easy to understand.

I see these problems:

1. Duplicity needs access to the "manifest" files (for restoring from another machine, or for checking whether the cache needs a sync before upload). I was thinking about separating those manifest files from the data files by adding an option --manifest-site (or something like that)=<URL AND BACKEND FOR MANIFEST>

For instance: duplicity --manifest-site=s3+http://my_bucket/ /home/me sftp://<email address hidden>/some_dir

If the "--manifest-site" option is detected, the manifest files would be sent to and retrieved from this place (s3+http://my_bucket/ in my example) instead of the usual place (sftp://<email address hidden>/some_dir in my example). But I could not see where in the code this should be changed.
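
The routing described in point 1 could be sketched roughly as follows. This is only an illustration, not Duplicity's actual backend API: the classes SplitBackend and DictBackend, the put/get signatures and the is_manifest helper are all hypothetical, and it assumes that manifest files can be recognised by a ".manifest" component in their name.

```python
def is_manifest(filename):
    # Assumption: manifest files carry ".manifest" in their name.
    return ".manifest" in filename


class DictBackend:
    """Hypothetical in-memory stand-in for a real storage backend."""

    def __init__(self):
        self.files = {}

    def put(self, filename, data):
        self.files[filename] = data

    def get(self, filename):
        return self.files[filename]


class SplitBackend:
    """Hypothetical wrapper: manifests go to one backend, data to another."""

    def __init__(self, data_backend, manifest_backend):
        self.data_backend = data_backend
        self.manifest_backend = manifest_backend

    def _pick(self, filename):
        # Choose the destination from the file name alone.
        if is_manifest(filename):
            return self.manifest_backend
        return self.data_backend

    def put(self, filename, data):
        self._pick(filename).put(filename, data)

    def get(self, filename):
        return self._pick(filename).get(filename)
```

With such a wrapper, the rest of the script would keep calling put and get as before, and only the wrapper would know that two sites are involved.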

2. When restoring files from such a "glacier", we should take the restore delay into account. I was thinking about adapting both the script and the backend class.

In the backend class, I was thinking about adding a function "hasDelay". It would return a boolean, false by default. If it returns false, the script would continue as usual.

If it returns true, Duplicity would call the function "prepareFileForGet" with the filename as an argument. After asking the backend class to prepare the files, it would loop every 2 minutes, asking the backend whether each requested file is ready for download (backend.isReadyForDownload(filename)). The result would be a boolean.

When a file is ready, it would be downloaded, and the operation would be repeated until every file has been downloaded.
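
The restore loop from point 2 could look roughly like the sketch below. The method names hasDelay, prepareFileForGet and isReadyForDownload are the ones proposed above; everything else (the Backend base class, the get signature, the fetch_all function, the SlowBackend test double and the injectable sleep parameter) is a hypothetical illustration and not existing Duplicity code.

```python
import time


class Backend:
    """Hypothetical base class with the proposed hooks and their defaults."""

    def hasDelay(self):
        return False

    def prepareFileForGet(self, filename):
        pass

    def isReadyForDownload(self, filename):
        return True

    def get(self, filename):
        raise NotImplementedError


class SlowBackend(Backend):
    """Test double: each file becomes ready after `polls_needed` failed polls."""

    def __init__(self, files, polls_needed=2):
        self.files = files
        self.polls_needed = polls_needed
        self.polls = {}

    def hasDelay(self):
        return True

    def prepareFileForGet(self, filename):
        self.polls[filename] = 0

    def isReadyForDownload(self, filename):
        self.polls[filename] += 1
        return self.polls[filename] > self.polls_needed

    def get(self, filename):
        return self.files[filename]


def fetch_all(backend, filenames, poll_interval=120, sleep=time.sleep):
    """Download every file, polling every `poll_interval` seconds if needed."""
    if not backend.hasDelay():
        # No restore delay: download directly, as today.
        return {name: backend.get(name) for name in filenames}

    # Ask the backend to start preparing every file up front.
    for name in filenames:
        backend.prepareFileForGet(name)

    fetched = {}
    pending = list(filenames)
    while pending:
        still_waiting = []
        for name in pending:
            if backend.isReadyForDownload(name):
                fetched[name] = backend.get(name)
            else:
                still_waiting.append(name)
        pending = still_waiting
        if pending:
            sleep(poll_interval)
    return fetched
```

Requesting all files before the first poll lets the provider prepare them in parallel, so the total wait should stay close to the delay of the slowest file rather than the sum of all delays.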

Do you think this feature would be useful?

Do you expect other problems or collisions?

I would welcome comments and improvements.

Regards,
Julien Fastré

Tags: amazon glacier