unicode errors in sigtar2path_iter on filenames with non ascii characters

Bug #1797797 reported by Bas Hulsken
This bug affects 1 person
Affects    Status        Importance  Assigned to  Milestone
Duplicity  Fix Released  Medium      Unassigned

Bug Description

When encountering files whose names contain non-ASCII characters (Latin-1 special characters such as ü, etc.), duplicity fails with the following error:

--- Start running command BKP at 20:51:28.814 ---
Traceback (innermost last):
  File "/usr/bin/duplicity", line 1668, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1654, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1501, in main
    do_backup(action)
  File "/usr/bin/duplicity", line 1636, in do_backup
    incremental_backup(sig_chain)
  File "/usr/bin/duplicity", line 658, in incremental_backup
    globals.backend)
  File "/usr/bin/duplicity", line 419, in write_multivol
    globals.volsize)
  File "/usr/lib64/python2.7/site-packages/duplicity/gpg.py", line 386, in GPGWriteFile
    data = block_iter.next().data
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 523, in next
    result = self.process(next(self.input_iter))
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 195, in get_delta_iter
    for new_path, sig_path in collated:
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 286, in collate2iters
    relem2 = next(riter2)
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 354, in combine_path_iters
    refresh_triple_list(triple_list)
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 341, in refresh_triple_list
    new_triple = get_triple(old_triple[1])
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 327, in get_triple
    path = next(path_iter_list[iter_index])
  File "/usr/lib64/python2.7/site-packages/duplicity/diffdir.py", line 249, in sigtar2path_iter
    index = tuple(name.split(u"/"))
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54: ordinal not in range(128)

20:53:06.906 Task 'BKP' failed with exit code '30'.
--- Finished state FAILED 'code 30' at 20:53:06.906 - Runtime 00:01:38.091 ---

the filename it chokes on is this one:
/var/www/html/lustband/03 - Lustband - Wat zou je doen_ (Bløf).mp3

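The problem seems to be that a unicode separator (u"/") is used to split a name that was treated as a bytestring up to that point. Under Python 2 this makes split() implicitly decode the bytestring with the ascii codec, which fails on the first non-ASCII byte (0xc3 here).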
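
For illustration, here is a minimal sketch of that failure mode. It is not duplicity code: the helper name split_like_python2_unicode_sep is made up, and it spells out explicitly the ASCII decode that Python 2 performs implicitly when a bytestring is split with a unicode separator, so that the snippet also runs on Python 3. The path is the one quoted above, assumed to be UTF-8 encoded on disk.

```python
# The filename from this report, as UTF-8 bytes ("\u00f8" is the o-with-stroke).
name = u"/var/www/html/lustband/03 - Lustband - Wat zou je doen_ (Bl\u00f8f).mp3".encode("utf-8")


def split_like_python2_unicode_sep(raw):
    """Hypothetical helper: emulate Python 2's raw.split(u"/") on a bytestring.

    Python 2 first coerces the bytestring to unicode using the default
    ascii codec, and only then splits it.
    """
    return tuple(raw.decode("ascii").split(u"/"))


try:
    split_like_python2_unicode_sep(name)
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Show which byte the ascii codec rejected.
print(error)
```

The byte that gets rejected is the first byte of the UTF-8 encoding of the non-ASCII character, which matches the 0xc3 in the traceback.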

The patch below fixes this for me and seems in line with the intent of the code, but I haven't spent a lot of time understanding the (unicode) filename handling in duplicity, so I might be wrong.

some version info:
Start duply v2.1, time is 2018-10-14 20:51:28.
Using profile '/etc/duply/dirvish-steady-var'.
Using installed duplicity version 0.8.00, python 2.7.15 (/usr/bin/python2), gpg 1.4.22 (Home: ~/.gnupg), awk 'GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)', grep 'grep (GNU grep) 3.1', bash '4.4.12(1)-release (x86_64-redhat-linux-gnu)'.

best regards,
Bas Hulsken

diff -Naupr duplicity/duplicity/diffdir.py duplicity-fix/duplicity/diffdir.py
--- duplicity/duplicity/diffdir.py 2018-10-06 15:26:27.358130000 +0200
+++ duplicity-fix/duplicity/diffdir.py 2018-10-14 20:34:53.830534389 +0200
@@ -246,7 +246,7 @@ def sigtar2path_iter(sigtarobj):
         else:
             raise DiffDirException(u"Bad tarinfo name %s" % (tiname,))

-        index = tuple(name.split(u"/"))
+        index = tuple(name.split(b"/"))
         if not index[-1]:
             index = index[:-1] # deal with trailing /, ""
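
As a rough sketch of what the patched lines do, the snippet below keeps the name as bytes, splits it with a bytes separator, and drops the empty last component left by a trailing slash. The function name_to_index is a hypothetical stand-alone helper written for this illustration; in duplicity the logic sits inline in sigtar2path_iter.

```python
def name_to_index(name):
    """Hypothetical helper: turn a bytes tar member name into an index tuple."""
    # Bytes name, bytes separator: no implicit decoding takes place.
    index = tuple(name.split(b"/"))
    if not index[-1]:
        index = index[:-1]  # deal with trailing /, ""
    return index


# Example with a non-ASCII component (UTF-8 bytes for "Bl\u00f8f").
example = name_to_index(u"lustband/Bl\u00f8f.mp3".encode("utf-8"))
print(example)
```

Because nothing is decoded, the index components stay as raw bytes, so any later code that expects unicode would have to decode them explicitly.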

Bas Hulsken (bhulsken) wrote :
Changed in duplicity:
assignee: nobody → Kenneth Loafman (kenneth-loafman)
importance: Undecided → Medium
milestone: none → 0.8.00
status: New → In Progress
Changed in duplicity:
status: In Progress → Fix Committed
assignee: Kenneth Loafman (kenneth-loafman) → nobody
summary: - unicode errors on filenames with non ascii characters
+ unicode errors in sigtar2path_iter on filenames with non ascii
+ characters
Changed in duplicity:
status: Fix Committed → Fix Released