B2 provider cannot handle two backups in the same bucket

Bug #1657916 reported by Michael Bisbjerg
This bug affects 2 people
Affects: Duplicity
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

# duplicity --version
duplicity 0.7.06

# python --version
Python 2.7.12

# OS
Ubuntu 16.04 LTS

# Target
B2 bucket, I intended to use the same bucket for different backup jobs

# Log
I've attached the log. Note that it detects a remote state and tries to download it, but immediately fails with a 404. I investigated where that remote state could come from (this is the first ever run), and I think I've found it. The filename it's downloading is an exact match for a file in another job I have (note the different "folders" in B2), so I suspect that duplicity picks the first file whose name starts with "duplicity" and uses it without checking the full path. (As far as I know, the B2 API doesn't treat these as real folders.)

The current files on the B2 Bucket can be seen in this image, http://imgur.com/a/C3OKi, which is a screenshot from Cyberduck. It shows no files in an "scm" folder, but a few (one with the exact same name as requested) files in the other job I have, named "configs".

Michael Bisbjerg (michael-mbwarez) wrote :

Sidenote: Buckets are cheap, so I can work around it by creating a new bucket (duplicity does this for me) for each job I want.

Kenneth Loafman (kenneth-loafman) wrote : Re: [Bug 1657916] Re: B2 provider cannot handle two backups in the same bucket

Duplicity is the one that does not allow multiple sources to the same
archive. You will need to run one backup to one directory.


Changed in duplicity:
status: New → Invalid
Michael Bisbjerg (michael-mbwarez) wrote :

In my example, I have two backups, with the following destinations:

b2://---SNIP---:---SNIP---@mbwarez-backup/scm
b2://---SNIP---:---SNIP---@mbwarez-backup/configs

I made these from the duplicity documentation, which says this for the b2 format:
b2://account_id[:application_key]@bucket_name/[folder/]

.. looking at it now, I may have missed the trailing forward slash on the folder path - could that have done it?

Michael Bisbjerg (michael-mbwarez) wrote :

I just tried again with trailing forward slashes, but I get the same error. Running one full backup to:
b2://---SNIP---:---SNIP---@mbwarez-backup/configs/

.. prevents me from running another full backup to:
b2://---SNIP---:---SNIP---@mbwarez-backup/scm/

The observed behaviour is the same as before. In the SCM case, it will try to download the state from the remote, but fail with an HTTP 404. The filename it tries to download is again the filename that exists in the "configs" folder.

Kenneth Loafman (kenneth-loafman) wrote :

Try running the 'scm' backup with the -v9 option and pasting the log as an attachment to this bug report. Munge whatever data you need to, but keep the structure the same.

Michael Bisbjerg (michael-mbwarez) wrote :

It should be the same as the original log, but for completeness' sake I've run it again. For reference, these are the commands to reproduce (I removed the bucket entirely prior to this test):

PASSPHRASE=$PASSPHRASE duplicity /mnt/systems/configs b2://$B2ACCOUNTID:$B2APPKEY@mbwarez-backup/configs/

PASSPHRASE=$PASSPHRASE duplicity /mnt/systems/backup/source/scm/ b2://$B2ACCOUNTID:$B2APPKEY@mbwarez-backup/scm/ -v9

The second command is the one that fails.

Kenneth Loafman (kenneth-loafman) wrote :

Can you verify that B2 is separating the backups into the configs/ and scm/ directories?

Changed in duplicity:
status: Invalid → Confirmed
Michael Bisbjerg (michael-mbwarez) wrote :

I just performed a new test to verify. It does not create the second folder in the B2 bucket.

I ran this:
PASSPHRASE=$PASSPHRASE duplicity /root/ b2://$B2ACCOUNTID:$B2APPKEY@mbwarez-test-bucket/folderA

And then this:
PASSPHRASE=$PASSPHRASE duplicity /root/ b2://$B2ACCOUNTID:$B2APPKEY@mbwarez-test-bucket/folderB

To be sure, I then ran this, to rule out issues with the same source folder:
PASSPHRASE=$PASSPHRASE duplicity /home/mike/ b2://$B2ACCOUNTID:$B2APPKEY@mbwarez-test-bucket/folderB

The last two gave the 404 not found, and mentioned this file:
Copying duplicity-full-signatures.20170205T151503Z.sigtar.gpg to local cache.
Attempt 1 failed. HTTPError: HTTP Error 404: Not Found

And the above filename I can find in the bucket, but in folderA. See the attached screenshot for a file listing /after/ all the above commands.

Kenneth Loafman (kenneth-loafman) wrote :

It sounds like B2 does not handle folders correctly.

Have you thought about trying --file-prefix=FolderA instead of a real
folder? That would change the naming to FolderA_duplicity_... and make the
file naming different for each folder.


Michael Bisbjerg (michael-mbwarez) wrote :

I could imagine that you're missing a filter/prefix on the B2 call (without having checked the source at all). The B2 API call for listing files is this:

https://www.backblaze.com/b2/docs/b2_list_file_names.html

According to that page, the right combination of delimiter and prefix lets you list just the contents of a specific folder.

This part is relevant: "With a prefix of "photos/", and a delimiter of "/", you would get:"
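
To make the prefix/delimiter idea concrete, here is a small local simulation in Python. It does not call B2 at all; it only approximates the listing behaviour described on that page, and the `list_names` helper and the folderB file name are made up for illustration.

```python
def list_names(all_names, prefix="", delimiter=None):
    """Approximate a prefix/delimiter listing over a flat set of names.

    Names that do not start with `prefix` are skipped. If a delimiter is
    given, anything deeper than one level below the prefix is collapsed
    into a single "folder" entry ending in the delimiter.
    """
    results = []
    for name in sorted(all_names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter and delimiter in rest:
            folder = prefix + rest.split(delimiter, 1)[0] + delimiter
            if folder not in results:
                results.append(folder)
        else:
            results.append(name)
    return results


# A flat bucket: the "folders" are only part of each file name.
bucket = [
    "folderA/duplicity-full-signatures.20170205T151503Z.sigtar.gpg",
    "folderA/example-a.txt",   # hypothetical file
    "folderB/example-b.txt",   # hypothetical file
]

folder_b_only = list_names(bucket, prefix="folderB/", delimiter="/")
top_level = list_names(bucket, delimiter="/")
print(folder_b_only)
print(top_level)
```

With the prefix set to "folderB/", nothing from folderA shows up in the result, which is the behaviour I would expect a per-folder backup to rely on.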

Michael Bisbjerg (michael-mbwarez) wrote :

Having now checked the source for the b2backend, I can see that the two calls to b2_list_file_names do not contain a filter. And from what I can gather, there is no client-side filtering of the returned list of files.

One method in particular, '_list', seems to just cut down the returned list of filepaths to simply a list of filenames. So any caller would be unable to determine the containing folder for a given file ... is this correct?

Michael Bisbjerg (michael-mbwarez) wrote :

Apologies - just noticed the "if os.path.dirname(x['fileName']) == self.path]" part.. It's getting late.
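
For reference, a standalone sketch of what a filter of that shape does. This is not the b2backend code: only the `dirname(...) == path` condition comes from the source, while the `names_in_folder` helper, the sample listing, and the use of `posixpath` (B2 names always use "/") are my own assumptions.

```python
import posixpath


def names_in_folder(listing, path):
    """Return bare file names whose containing folder equals `path`."""
    return [
        posixpath.basename(entry["fileName"])
        for entry in listing
        if posixpath.dirname(entry["fileName"]) == path
    ]


# Hypothetical listing, shaped like entries with a 'fileName' key.
listing = [
    {"fileName": "folderA/duplicity-full-signatures.20170205T151503Z.sigtar.gpg"},
    {"fileName": "folderB/example.txt"},
]

without_slash = names_in_folder(listing, "folderB")
with_slash = names_in_folder(listing, "folderB/")
print(without_slash)
print(with_slash)
```

Note that `dirname()` never returns a trailing slash, so a comparison like this only matches when the path it is compared against has no trailing slash either. I have not checked what form `self.path` takes in the backend.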

Kenneth Loafman (kenneth-loafman) wrote :

See my answer #10. It looks like this version of b2backend does not handle folders at all. I don't have access to b2 and I don't have time right now to add support for folders, especially since duplicity can do much the same with --file-prefix.

Changed in duplicity:
status: Confirmed → New
importance: Undecided → Wishlist
Daniel Harvey (daniel.harvey) wrote :

I've had this problem just recently. This issue can be resolved by adding the 'prefix' parameter to the list files call.

Patch attached for consideration.
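
As a rough illustration of the idea (this is a sketch, not the attached patch), the request parameters for b2_list_file_names could be built like this. `bucketId`, `startFileName`, `maxFileCount` and `prefix` are parameter names from the B2 API documentation; the `build_list_request` helper and the bucket ID value are hypothetical.

```python
def build_list_request(bucket_id, folder, start_file_name=None, max_file_count=1000):
    """Build b2_list_file_names parameters restricted to one folder."""
    # Normalise "scm", "scm/" and "/scm/" to the same prefix: "scm/".
    prefix = folder.strip("/")
    if prefix:
        prefix += "/"

    params = {
        "bucketId": bucket_id,
        "maxFileCount": max_file_count,
        "prefix": prefix,
    }
    # startFileName is only needed when continuing a paged listing.
    if start_file_name is not None:
        params["startFileName"] = start_file_name
    return params


scm_request = build_list_request("hypothetical-bucket-id", "scm/")
root_request = build_list_request("hypothetical-bucket-id", "")
print(scm_request)
print(root_request)
```

With the prefix in place, the server only returns names under that folder, so files from another job in the same bucket should never reach the caller.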

Daniel Harvey (daniel.harvey) wrote :

Patch attached this time :)

Changed in duplicity:
importance: Wishlist → Medium
milestone: none → 0.7.12
status: New → In Progress
Changed in duplicity:
status: In Progress → Fix Committed
Changed in duplicity:
status: Fix Committed → Fix Released