Strange include behavior

Bug #1768223 reported by Miroslav Sulc
This bug affects 1 person
Affects:      Duplicity
Status:       Fix Released
Importance:   Medium
Assigned to:  Aaron Whitehouse

Bug Description

Duplicity (0.7.17, Python 2.7.14) seems to strip the source directory from the supplied include pattern and then test the remaining part relative to the source directory. This leads to surprising results. For example, let's assume that one wants to back up:

   src/
   src/README
   src/c/
   src/c/README

and include only src/README. This can be conveniently achieved with

   duplicity --include 'src/README' --exclude '**' src file:///path/to/backup

However, if the backup command contains a typo ('srcc' instead of 'src'), such as

   duplicity --include 'srcc/README' --exclude '**' src file:///path/to/backup

then duplicity strips 'src' from the beginning of 'srcc/README' and backs up the file 'c/README'. Is this the desired behavior?

Kenneth Loafman (kenneth-loafman) wrote :

No, it should not work this way. It should only match against full directory node names.

Changed in duplicity:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → 0.7.18
Miroslav Sulc (msulc) wrote :

I suspect that in this specific case, it is because of the call

    index = tuple(filter(lambda x: x,
        filename[len(self.prefix):].split("/")))

in the method glob_get_filename_sf in selection.py, since this chops off the root prefix purely on the basis of its length, without taking file/directory name boundaries into account. As a result, with the prefix src, srcc/README is reduced to c/README, which produces the false positive on the file c/README.
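
To illustrate the suspicion, here is a minimal standalone sketch of that slicing step. It is not duplicity's code: the function name index_by_length is made up for this example, and it only mimics the quoted expression on plain strings, ignoring everything else glob_get_filename_sf does.

```python
def index_by_length(prefix, filename):
    """Mimic the quoted slicing: drop len(prefix) characters, then split on '/'.

    Hypothetical helper for illustration only; it does not check that
    `filename` actually starts with `prefix` at a path boundary.
    """
    remainder = filename[len(prefix):]
    # Discard empty components, as the filter(lambda x: x, ...) call does.
    return tuple(part for part in remainder.split("/") if part)


# Correctly spelled pattern: the leading "src" is removed as intended.
print(index_by_length("src", "src/README"))   # ('README',)

# Typo in the pattern: only three characters are removed, so the stray
# "c" is left over and looks like the directory "c".
print(index_by_length("src", "srcc/README"))  # ('c', 'README')
```

The second result is the same index that src/c/README would produce, which would explain why c/README ends up in the backup.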

Changed in duplicity:
assignee: nobody → Aaron Whitehouse (aaron-whitehouse)
Aaron Whitehouse (aaron-whitehouse) wrote :

Thanks Miroslav.

Can you please try the 0.8 series and see if this works for you? I overhauled the selection/glob matching code and I know it fixed some of these issues:
https://code.launchpad.net/~aaron-whitehouse/duplicity/08-merge-glob-parsers

Miroslav Sulc (msulc) wrote :

Thanks Aaron, I just checked this version and it seems to work, since now

    duplicity --no-encryption --include srcc/README --exclude '**' src file:///path

detects the incompatibility of srcc/README with the prefix src and raises

    FilePrefixError: srcc/README glob with src path gives None
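
For comparison, a rough sketch of what a boundary-aware check could look like. This is an assumption about the general idea, not the 0.8 implementation: index_by_components and PrefixMismatch are hypothetical names, and the sketch handles only literal paths, not globs such as '**'.

```python
class PrefixMismatch(ValueError):
    """Hypothetical stand-in for an error like duplicity's FilePrefixError."""


def index_by_components(prefix, filename):
    """Strip `prefix` from `filename` only if it matches whole path components."""
    prefix_parts = [p for p in prefix.split("/") if p]
    file_parts = [p for p in filename.split("/") if p]
    # Compare complete directory names instead of raw character counts.
    if file_parts[:len(prefix_parts)] != prefix_parts:
        raise PrefixMismatch("%s cannot match anything under %s" % (filename, prefix))
    return tuple(file_parts[len(prefix_parts):])


print(index_by_components("src", "src/c/README"))  # ('c', 'README')

try:
    index_by_components("src", "srcc/README")
except PrefixMismatch as exc:
    print("rejected:", exc)
```

With a check of this kind, the mistyped pattern is rejected up front instead of silently selecting c/README.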

Changed in duplicity:
milestone: 0.7.18 → 0.7.19
Aaron Whitehouse (aaron-whitehouse) wrote :

@kenneth, if this works in the 0.8-series, I am not inclined to backport it to the 0.7-series. I do not think that we should be making invasive changes like glob matching to the 0.7-series.

Kenneth Loafman (kenneth-loafman) wrote :

@aaron, I'll agree. Moved target to 0.8.00.

Changed in duplicity:
milestone: 0.7.19 → 0.8.00
Aaron Whitehouse (aaron-whitehouse) wrote :

The actual bug here is Fix Committed, but I'll keep this as Confirmed for now until I have a chance to check the partial directory name matching point.

Changed in duplicity:
status: Confirmed → Fix Committed
status: Fix Committed → Confirmed
Changed in duplicity:
milestone: 0.8.00 → 0.8.01
Changed in duplicity:
milestone: 0.8.01 → 0.8.02
Changed in duplicity:
milestone: 0.8.02 → none
Aaron Whitehouse (aaron-whitehouse) wrote :

I just tested the partial directory matching point on 0.8.13 and this appears to be fixed:

    duplicity --include 'srcc/README' --exclude '**' src file://dest
    Fatal Error: The file specification
        srcc/README
    cannot match any files in the base directory
        src
    Useful file specifications begin with the base directory or some
    pattern (such as '**') which matches the base directory.

Please let me know if anyone can still reproduce.

Changed in duplicity:
status: Confirmed → Fix Released