Including a folder ending in "/" does not include folder contents

Bug #1624725 reported by Aaron Whitehouse on 2016-09-17
Affects: Duplicity
Importance: High
Assigned to: Unassigned

Bug Description

Including a folder with an --include ending in "/" does not include folder contents. Including the folder with the same --include without the "/" does include contents.

As reported by Claus in Bug #1479545:
"I had a command line similar to
  --include ~/very/important/stuff/here/needs/backup/ --exclude '**'
Now of course I checked the backups when I set this up and everything was fine.

Recently I noticed that backups are very fast (too fast for the amount of data) and found that all files are missing in the backup!

By changing behavior back and forth you probably messed up a lot of backups everywhere and many users are probably not aware that their files are currently not in the backup!

My suggestion on how to clean up this mess:
a) Change back once again (it is better to have too many files in the backup than too few).
b) Introduce an explicit switch e.g. "--include-dirs-only <your-dir>" for the old behavior

At least point out this old/new very unintuitive behavior in the man page please.

Sorry if this sounds a little harsh, I am not writing this to yell at people who are only trying to help others. But I am very concerned about users/admins having a data loss and then noticing their backups are useless."

This bug was introduced at rev 1112 of the 0.7-series, though note that include and exclude lines including any globbing patterns (including *) were pretty fundamentally broken prior to rev 1110 (Bug #932482) anyway, so the main people who will be seeing a regression will be those who do not use globs in the relevant directory include/exclude line.


description: updated
description: updated
Changed in duplicity:
assignee: nobody → Aaron Whitehouse (aaron-whitehouse)
importance: Undecided → High
status: New → Confirmed
description: updated

This is a challenge.

In many ways duplicity is doing exactly what it is being asked to do. If a user wants a folder and everything in it, they could "--include folder/**". That said, in the example given, I completely see that a user doing "--include folder/" would expect duplicity to back up the folder and its contents, so that should ideally be the behaviour.

The slash matching folders only is useful, too, as somebody doing:
--include fo*/
would expect it to match "folder/" but not "foo.txt", so I'm not keen to reverse this change back out to how things were before (completely ignoring trailing slashes).
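
A minimal sketch of that directory-only rule, to make the distinction concrete. It uses Python's fnmatch as a stand-in for duplicity's own glob matcher (an assumption; the real matcher treats "*" and "**" differently), and matches_dir_only is a hypothetical helper, not a duplicity function:

```python
from fnmatch import fnmatchcase


def matches_dir_only(pattern: str, name: str, is_dir: bool) -> bool:
    """Hypothetical helper: a trailing "/" restricts the glob to directories."""
    if pattern.endswith("/"):
        # Strip the slash for matching, but only accept directories.
        return is_dir and fnmatchcase(name, pattern.rstrip("/"))
    # Without a trailing slash, files and directories both qualify.
    return fnmatchcase(name, pattern)


print(matches_dir_only("fo*/", "folder", is_dir=True))
print(matches_dir_only("fo*/", "foo.txt", is_dir=False))
```

With "fo*/" the directory "folder" matches and the file "foo.txt" does not, whereas plain "fo*" would match both.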

I therefore think that the correct solution to this would be for a matched include folder to include all of that folder's contents unless trumped by a higher-priority exclude. This is in essence how excludes already work for most cases, as the file traverser does not descend into excluded folders. It also seems consistent with Ken's comments in this email:
http://lists.nongnu.org/archive/html/duplicity-talk/2015-03/msg00020.html

The necessary consequence of that may be that some of the more niche uses would cease to work, e.g. including only the directory skeleton but no files with:
--include /var/log/**/
--exclude /var/log/**
On balance I think that is the lesser of two evils, and I did mention in Bug #1479545 that I was not making the change to solve the directory skeleton use case. We can try to address these use cases from 0.8 onwards with specific flags (such as that suggested by Claus) or an equivalent of the "ignorecase:" prefix (so that it works in include-filelists).
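
To illustrate the proposal and its consequence, here is a rough sketch of first-match-wins selection in which a matched directory also covers everything beneath it. This is not duplicity's selection code: select is a hypothetical function, fnmatch stands in for duplicity's glob matcher, the paths are made up, and the default of including unmatched paths is an assumption of the sketch.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select(rules, path, is_dir=False):
    """Return True to back up path, using the first rule that matches.

    rules is an ordered list of (include, pattern) pairs; earlier rules
    have higher priority. Hypothetical sketch, not duplicity's API.
    """
    p = PurePosixPath(path)
    for include, pattern in rules:
        glob = pattern.rstrip("/")
        dir_only = pattern.endswith("/")
        # A trailing slash only lets the path itself match if it is a directory.
        hit_self = (is_dir or not dir_only) and fnmatchcase(str(p), glob)
        # Ancestors are always directories, so a matched folder covers its contents.
        hit_parent = any(fnmatchcase(str(a), glob) for a in p.parents)
        if hit_self or hit_parent:
            return include
    return True  # assumption: unmatched paths are included by default


rules = [(True, "/home/u/backup/"), (False, "**")]
print(select(rules, "/home/u/backup/a.txt"))
print(select(rules, "/home/u/other/b.txt"))

skeleton = [(True, "/var/log/**/"), (False, "/var/log/**")]
print(select(skeleton, "/var/log/app/x.log"))
```

Under these assumptions the file inside the included folder is selected, the file elsewhere is excluded by '**', and the file under /var/log is selected as well, which is the loss of the skeleton-only behaviour described above. Putting an exclude ahead of the include in the list is what lets it trump the folder match.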

I would appreciate any thoughts.

summary: - Including a folder ending in "/" does not include contents
+ Including a folder ending in "/" does not include folder contents

In case it wasn't clear, my proposed solution would make Claus's example command work as he expected.

claus (claus2) wrote :

Hi Aaron,

I like your summary of the problem and your quick response, thanks :)
I think duplicity should have as few "hidden features" as possible and pretty much everything should be set through explicit command line options.

I'm wondering if we can fix this by treating a glob finishing with "/" as functionally equivalent to "/**". I haven't had time to test this idea through, but it feels like the essence of what we are proposing.
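
As a rough illustration of that idea only (not the code in the eventual fix), the rewrite could be as small as the following. expand_trailing_slash is a hypothetical name, and fnmatch is again just a stand-in for duplicity's glob matcher:

```python
from fnmatch import fnmatchcase


def expand_trailing_slash(pattern: str) -> str:
    """Hypothetical helper: rewrite "folder/" as "folder/**"."""
    if pattern.endswith("/") and pattern != "/":
        return pattern + "**"
    return pattern


glob = expand_trailing_slash("/home/u/backup/")
print(glob)
print(fnmatchcase("/home/u/backup/a.txt", glob))
```

Note that in this sketch the rewritten glob matches the folder's contents but not the folder path itself, so a real implementation would still need to select the directory and its parents.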

Changed in duplicity:
status: Confirmed → In Progress

I think that I have fixed this in the attached branch, but I would like to test it further before proposing a merge.

Thanks Aaron!

On Sun, Nov 6, 2016 at 5:03 PM, Aaron Whitehouse <email address hidden> wrote:

> I think that I have fixed this in the attached branch, but I would like
> to test it further before proposing a merge.

Changed in duplicity:
assignee: Aaron Whitehouse (aaron-whitehouse) → nobody
milestone: none → 0.7.11
status: In Progress → Fix Committed
Changed in duplicity:
status: Fix Committed → Fix Released