File selection misses files with certain include directives

Bug #1692807 reported by destrius
This bug affects 1 person
Affects: Duplicity
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

When an include option contains wildcards, a bug in the file selection code causes unrelated files, which should be backed up by default, to be left out of the backup.

In globmatch.py, function select_fn_from_glob(), in the definition of test_fn():

    def test_fn(path):
        # ...
        if glob_comp_re.match(path.name):
            # ...
        elif include == 1 and scan_comp_re.match(path.name):
            return 2
        else:
            return None

The code returns 2 whenever the option is an include option and the scan_comp_re match succeeds. A return value of 2 should mean that 'path' is a directory that might contain a file matched by 'glob_str', so it must be scanned. However, nothing in the code above checks that 'path' is a directory. A regular file whose name matches 'scan_comp_re' therefore also gets 2 from test_fn(), is treated as a directory to scan, and never makes it into the file list.
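To make the failure mode concrete, here is a minimal self-contained sketch of the branch in question. It does not use duplicity itself: FakePath, sketch_test_fn and the require_dir switch are hypothetical names, and the two regexes are hand-written stand-ins for what I assume the compiled patterns look like for the glob /tmp/foo/test/**.jpg, not output taken from globmatch.py.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for duplicity's Path object: a name plus a directory flag.
FakePath = namedtuple("FakePath", ["name", "is_dir"])

# Hand-written approximations (assumptions) of the regexes for
# the include glob "/tmp/foo/test/**.jpg".
glob_comp_re = re.compile(r"^/tmp/foo/test/.*\.jpg($|/)")
scan_comp_re = re.compile(r"^(/tmp|/tmp/foo|/tmp/foo/test|/tmp/foo/test/.*)$")


def sketch_test_fn(path, include=1, require_dir=False):
    """Model of test_fn(): 1 = include, 2 = scan this directory, None = no opinion.

    require_dir=False mimics the reported behaviour,
    require_dir=True mimics the behaviour with a directory check added.
    """
    if glob_comp_re.match(path.name):
        return include
    is_scannable = path.is_dir or not require_dir
    if include == 1 and is_scannable and scan_comp_re.match(path.name):
        return 2
    return None


yup = FakePath("/tmp/foo/test/yes/yup.txt", is_dir=False)
print(sketch_test_fn(yup))                    # reported behaviour: file is told to "scan"
print(sketch_test_fn(yup, require_dir=True))  # with the check: rule has no opinion
```

Under these assumptions the plain file gets 2 without the directory check and None with it, while directories and matching .jpg files are unaffected.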

This bug can be demonstrated with the following directory layout:

$ find /tmp/foo/test
/tmp/foo/test
/tmp/foo/test/yes
/tmp/foo/test/yes/metoo.jpg
/tmp/foo/test/yes/yup.txt
/tmp/foo/test/no
/tmp/foo/test/no/noway.txt
/tmp/foo/test/no/includeme.jpg

If we run duplicity as follows:

$ duplicity --no-encryption --include "/tmp/foo/test/**.jpg" --exclude "/tmp/foo/test/no" /tmp/foo/test/ file:///tmp/foo/test_backup/

And then restore:

$ duplicity --no-encryption file:///tmp/foo/test_backup test_restore

We get the following directory layout in test_restore:

$ find /tmp/foo/test_restore/
/tmp/foo/test_restore/
/tmp/foo/test_restore/yes
/tmp/foo/test_restore/yes/metoo.jpg
/tmp/foo/test_restore/no
/tmp/foo/test_restore/no/includeme.jpg

As you can see, the file /tmp/foo/test/yes/yup.txt was not included in the backup. It matches neither the include rule nor the exclude rule, so we would expect it to be backed up by default.

The following excerpt from the -v9 (debug verbosity) output of the backup command shows what's happening:

Selection: examining path /tmp/foo/test/yes/yup.txt
Selection: result: 2 from function: Command-line include glob: /tmp/foo/test/**.jpg
Selection: result: None from function: Command-line exclude glob: /tmp/foo/test/no
Selection: ? scanning directory for matches
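The log can be read as follows: each rule reports a result, and the combined verdict for yup.txt ends up as "scan", which is meaningless for a regular file. The sketch below is a simplified model of that combination step, not duplicity's actual code; the function name combine and the exact precedence are assumptions chosen to be consistent with the excerpt above (first definite 0 or 1 wins, otherwise a pending 2 wins, otherwise the path is included by default).

```python
def combine(results):
    """Combine per-rule results into one verdict.

    Each result is 1 (include), 0 (exclude), 2 (scan directory) or
    None (rule has no opinion). Simplified, hypothetical model.
    """
    scan_pending = False
    for result in results:
        if result == 2:
            # Remember the scan request but keep looking for a definite answer.
            scan_pending = True
        elif result in (0, 1):
            # A definite include/exclude ends the search.
            return result
    if scan_pending:
        return 2
    # No rule had an opinion: include by default.
    return 1


# yup.txt as logged: include glob says 2, exclude glob says None.
print(combine([2, None]))     # verdict is "scan", so the file is never added
# yup.txt if the include glob returned None for non-directories.
print(combine([None, None]))  # verdict is the default include
```

In this model, the only thing keeping yup.txt out of the backup is the 2 returned by the include glob.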

The fix is simple: add a path.isdir() check before the scan_comp_re match:

--- globmatch.py.orig 2017-05-23 14:30:44.740001477 +0800
+++ globmatch.py 2017-05-23 14:31:12.692680159 +0800
@@ -120,7 +120,7 @@
                     # An included file/folder, so normal approach is fine
                     return include

-        elif include == 1 and scan_comp_re.match(path.name):
+        elif include == 1 and path.isdir() and scan_comp_re.match(path.name):
             return 2
         else:
             return None

Thanks,
Xinyu

Kenneth Loafman (kenneth-loafman) wrote :

This has been fixed already. Please upgrade to the current version of duplicity. This will ensure that any bugs fixed since your release are available, and it may fix your issue.

There are three options:

* Release tarball Install - https://launchpad.net/duplicity/+download
* Daily duplicity builds - https://launchpad.net/~duplicity-team/+archive/ubuntu/daily
* Stable duplicity builds - https://launchpad.net/~duplicity-team/+archive/ubuntu/ppa

Changed in duplicity:
status: New → Invalid