Bug #1050509 “Duplicity doesn't handle non-utf8 filenames well” : Bugs : duplicity package : Ubuntu

Milan Bouchet-Valat (nalimilan) wrote on 2012-09-13:

#1

Many thanks for taking care of this regression here and in the other bug! I still think this bug deserves a fix, or at least better logging of the problematic file. If for some reason a user ends up with a file with an invalid name, there's no way to find out from the GUI that this is the problem, let alone identify the file and fix its name. So backup is impossible.

Roman Yepishev (rye) wrote on 2012-09-18:

#2

Coming from bug 989496:

Using the Ubuntu One backend, the remote filenames returned by backend.list() are unicode (the json module decodes the UTF-8 strings into unicode objects). Therefore, when copy_to_local(fn) tries to log the data using log.Notice(_("Copying %s to local cache.") % fn), the logging call crashes duplicity. So not only the local filesystem encoding should be considered but also the backend output.
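
The failure mode described here can be sketched as follows. This is a minimal Python 3 illustration under stated assumptions, not duplicity's code: under Python 2, %-formatting a byte string with a unicode argument implicitly decodes the byte string as ASCII, and the sketch makes that step explicit. The French message text, the file name, and the helper py2_style_format are all made up for the example.

```python
import json

# Hypothetical translated message as UTF-8 bytes, standing in for what
# gettext's _() could return under Python 2 (text invented for this sketch).
translated = "Copie de %s vers le cache local effectuée.".encode("utf-8")

# The json module decodes names into unicode text.
remote_name = json.loads('"duplicity-full.20120918T000000Z.manifest"')


def py2_style_format(template_bytes, arg):
    """Mimic Python 2's `bytes % unicode`: decode the template as ASCII first."""
    return template_bytes.decode("ascii") % arg


try:
    py2_style_format(translated, remote_name)
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)

# Decoding the message explicitly as UTF-8 before formatting avoids the crash.
fixed = translated.decode("utf-8") % remote_name
print(fixed)
```

The point of the sketch is that the exception comes from the non-ASCII bytes in the translated message, not from the file name, which is plain ASCII here.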

Launchpad Janitor (janitor) wrote on 2012-09-18:

#3

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in duplicity (Ubuntu):
status:	New → Confirmed

François Marier (fmarier) wrote on 2012-10-21:

#4

This is the patch I currently apply every time there's a duplicity upgrade to work around the broken debug statements and carry on with the rest of my backup.

It's certainly not ideal and I'm not suggesting it be accepted upstream, but it may be useful to other users until that bug is fixed.

Ubuntu Foundations Team Bug Bot (crichton) wrote on 2012-10-21:

#5

The attachment "Hack to work-around the broken debug statements" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags:	added: patch

Vv (vivien-perez) wrote on 2012-10-23:

#6

Hello François,

I would like to use your patch to be able to resume my backups.

Could you tell me how to do that without breaking my system? Is it enough to replace the lines with a minus sign at their beginning with the corresponding ones with a plus sign in the /usr/lib/python2.7/dist-packages/duplicity/collections.py file?

Thanks for your help,

Vv

François Marier (fmarier) wrote on 2012-10-24:

#7

Attachment: Hack around the UTF-8 problems in debug messages (933 bytes, text/plain)

Here's a version of my patch without the unnecessary print statements.

Again, I'm not pretending to solve the problem, but it may help others who are waiting for the official fix.

François Marier (fmarier) wrote on 2012-10-24:

#8

Vv: You are correct. To apply the patch, you can simply remove the lines with minuses and replace them with the lines that start with a plus.

You can also use the "patch -p1 < filename.patch" command, but given that there are only 3 lines to touch, it might be easier to do it by hand.

Vv (vivien-perez) wrote on 2012-10-24:

#9

Hello François,

thanks for your help. I removed and added the specified lines.

The previous error is gone, but there is a new one:

Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1404, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1397, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1273, in main
    sync_archive(decrypt)
  File "/usr/bin/duplicity", line 1077, in sync_archive
    + "\n" + "\n".join(local_missing))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 44: ordinal not in range(128)

It seems related to the previous one (which was "UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 23: ordinal not in range(128)").

Does anyone have an idea about the origin of the problem?

Thanks in advance,

Vv

otto06217 (otto-kesselgulasch) wrote on 2012-10-26:

#10

Hi, folks,

what a bug!

If I use some other backup space like a Dropbox folder or gdrive (insync), I get no such error message. It seems to me to be a bug in Ubuntu One.

Thanks for help.

BTW: Some of my files were created on Windows.

Pilot6 (hanipouspilot) wrote on 2012-11-13:

#11

Priority must not be low. I am unable to upgrade just because of this bug. It affects not only non-UTF systems, but all systems where some files were created in Windows.

Alexandr Makovksy (mailboxmak) wrote on 2012-11-13:

#12

Hello, I have this one:
Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1404, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1397, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1248, in main
    action = commandline.ProcessCommandLine(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/duplicity/commandline.py", line 994, in ProcessCommandLine
    globals.backend = backend.get_backend(args[0])
  File "/usr/lib/python2.7/dist-packages/duplicity/backend.py", line 161, in get_backend
    return _backends[pu.scheme](pu)
  File "/usr/lib/python2.7/dist-packages/duplicity/backends/u1backend.py", line 74, in __init__
    self.create_volume()
  File "/usr/lib/python2.7/dist-packages/duplicity/backend.py", line 328, in iterate
    return fn(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/duplicity/backends/u1backend.py", line 161, in create_volume
    answer = auth.request(self.volume_uri, http_method="PUT")
  File "/usr/lib/python2.7/dist-packages/ubuntuone-couch/ubuntuone/couch/auth.py", line 152, in request
    url, method=http_method, headers=headers, body=request_body)
  File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1543, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1293, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1229, in _conn_request
    conn.connect()
  File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 980, in connect
    sock.connect((self.host, self.port))
  File "/usr/lib/python2.7/dist-packages/httplib2/socks.py", line 424, in connect
    self.__negotiatehttp(destpair[0], destpair[1])
  File "/usr/lib/python2.7/dist-packages/httplib2/socks.py", line 374, in __negotiatehttp
    resp = self.recv(1)
timeout: timed out

Vv (vivien-perez) wrote on 2012-11-14:

#13

Hi guys,

following up on my unresolved problem (reported on bug 989496, which has recently been closed), I have tried to remove unicode characters from the filenames of the photos that I am trying to back up.

I have checked with convmv (doing a " convmv -r -f utf8 -t ascii ./*" in the backed-up directory) and got confirmation that no non-ASCII characters were still present.

I still get the same error message from duplicity:
Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1403, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1396, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1272, in main
    sync_archive(decrypt)
  File "/usr/bin/duplicity", line 1076, in sync_archive
    + "\n" + "\n".join(local_missing))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 44: ordinal not in range(128)

Any help would be much appreciated; I have not been able to back up my files for quite some time now...

Is there at least a way to know which file causes the problem?

Thanks,

Vv
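
One possible way to answer the question of which file is involved is to scan the backed-up directory for names that are not plain ASCII. The sketch below is a hedged Python 3 example, not part of duplicity or Deja Dup; the function names classify and find_suspect_names are hypothetical. It only inspects file and directory names, so it says nothing about other sources of the same UnicodeDecodeError.

```python
import os


def classify(name_bytes):
    """Return None for ASCII names, otherwise a short description."""
    try:
        name_bytes.decode("ascii")
        return None
    except UnicodeDecodeError:
        pass
    try:
        name_bytes.decode("utf-8")
        return "valid UTF-8, non-ASCII"
    except UnicodeDecodeError:
        return "not valid UTF-8"


def find_suspect_names(root):
    """Walk root and return (path_bytes, reason) for every non-ASCII name."""
    suspects = []
    # Walking a bytes path makes os.walk yield the raw bytes of each name.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            reason = classify(name)
            if reason is not None:
                suspects.append((os.path.join(dirpath, name), reason))
    return suspects


if __name__ == "__main__":
    for path, reason in find_suspect_names("."):
        # repr() keeps undecodable bytes visible instead of failing to print.
        print(repr(path), reason)
```

Run from the directory being backed up, this prints each suspicious path as a bytes literal together with the reason it was flagged.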

Pilot6 (hanipouspilot) wrote on 2012-11-15:

#14

On a test system I created a folder containing two files:
1. A LibreOffice file, created in Ubuntu
2. An MS Word file, created in Windows
Both files have Russian names.

I tried to back up this folder using Deja Dup to Ubuntu One and got the same error.

Then I removed the file from Windows and tried again, but it still failed.
Now I can't back up even an empty folder.

Pilot6 (hanipouspilot) wrote on 2012-11-15:

#15

I need to add that there is no such bug in precise.

Ibanez (ibanez) wrote on 2013-01-14:

#16

I have a workaround:

You can change the session language before calling duplicity:

declare -x LANG="en_US.UTF-8"

It works for me: my default LANG is "es_ES.UTF-8", and duplicity fails with it. With "en_US.UTF-8" it works.

François Marier (fmarier) wrote on 2013-01-23:

#17

I can confirm that the work-around in comment 16 does work, although I had to add this to my backup script:

  export LANG=en_US.utf8
  export LANGUAGE=
  export LC_CTYPE="en_US.utf8"
  export LC_NUMERIC="en_US.utf8"
  export LC_TIME="en_US.utf8"
  export LC_COLLATE="en_US.utf8"
  export LC_MONETARY="en_US.utf8"
  export LC_MESSAGES="en_US.utf8"
  export LC_PAPER=en_US.UTF-8
  export LC_NAME="en_US.utf8"
  export LC_ADDRESS="en_US.utf8"
  export LC_TELEPHONE="en_US.utf8"
  export LC_MEASUREMENT="en_US.utf8"
  export LC_IDENTIFICATION="en_US.utf8"
  export LC_ALL=

Another locale that can be used to reproduce the problem is fr_CA.utf8.

So this bug has in fact nothing to do with filenames and everything to do with the localized error messages breaking duplicity.
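
For anyone wrapping duplicity from Python rather than a shell script, the same work-around can be sketched by building a modified environment for the child process. This is a minimal example under assumptions: the helper name english_locale_env is hypothetical, and it relies on the usual gettext precedence in which LANGUAGE, LC_ALL and LC_MESSAGES override LANG for message translation.

```python
import os


def english_locale_env(base_env=None):
    """Return a copy of the environment that selects English messages."""
    env = dict(os.environ if base_env is None else base_env)
    env["LANG"] = "en_US.UTF-8"
    # LANGUAGE and LC_ALL take precedence over LANG, so remove them
    # (the shell version above sets them to empty instead).
    env.pop("LANGUAGE", None)
    env.pop("LC_ALL", None)
    # LC_MESSAGES also overrides LANG for translated messages.
    env["LC_MESSAGES"] = "en_US.UTF-8"
    return env


# Usage sketch (source_dir and target_url are placeholders, not real values):
#   import subprocess
#   subprocess.call(["duplicity", source_dir, target_url],
#                   env=english_locale_env())
```

Only the child process sees the changed locale; the rest of the session keeps its configured language.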

Pierre (pierre-fr34) wrote on 2013-01-26:

#18

Adding only:

export LANG=en_US.utf8

in my backup script works for me. Thanks for the trick.

NB:
I am on precise: duplicity 0.6.18, python 2.7, LANG=fr_FR.UTF-8
This bug does not show when using duplicity on lucid (duplicity 0.6.08b python 2.6, LANG=fr_FR.UTF-8)

Vv (vivien-perez) wrote on 2013-02-07:

#19

Hello,

what exactly do you call your backup script? I launch duplicity with deja-dup from the unity menu, and I don't see how to specify the locale this way.

Thanks for your help,

Vv

François Marier (fmarier) wrote on 2013-02-08:

#20

Vv: my backup script is a shell script that wraps around duplicity. It's roughly what can be found in /usr/share/doc/duplicity/examples/system-backup.gz

Vv (vivien-perez) wrote on 2013-02-09:

#21

Hello François,

thanks for your help. I have neither a system-backup.gz file nor an "examples" folder in /usr/share/doc/duplicity/.

So I changed the language of my session (from fr_FR.UTF-8 to fr_FR.UTF-8) in the system preferences, and now backups are working again.

Cheers,

Vv

Coeur Noir (coeur-noir) wrote on 2014-05-26:

#22

Hello,

Not sure if this is the same problem :

https://bugs.launchpad.net/ubuntu/+source/duplicity/+bug/1286845/comments/14

line 130, in copy_file
log.Info(_("Writing %s") % target.get_parse_name())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)

Running Ubuntu 14.04 recently upgraded from 13.10 where duplicity/déjà-dup worked smoothly.

Michael Terry (mterry) wrote on 2017-09-06:

#23

To be clear, this bug is about filenames that are NOT valid utf8. Most user errors in bugs and comments here are about filenames that are utf8 -- but not ascii -- and duplicity having problems with that. But this bug is for those filenames that are truly bizarre.

That said, the fix for both is similar. Ever since adding gettext support, we've used utility functions in util.py to convert between byte and unicode strings. Those functions pass the 'replace' option to decode/encode while they're at it, which gracefully handles non-utf8 characters. As we fix normal utf8 conversion errors, by using those utility functions we also make the non-utf8 cases better.
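
The idea of converting with the 'replace' option can be sketched as below. This is an illustrative Python 3 example only: the helper names to_unicode and to_bytes are hypothetical and are not the actual functions in duplicity's util.py.

```python
def to_unicode(value, encoding="utf-8"):
    """Decode bytes to text; invalid bytes become U+FFFD instead of raising."""
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; unencodable characters become '?'."""
    if isinstance(value, str):
        return value.encode(encoding, "replace")
    return value


# A valid UTF-8 name round-trips unchanged.
print(to_unicode(b"caf\xc3\xa9"))

# A Latin-1 encoded name is not valid UTF-8; the bad byte is replaced
# rather than raising UnicodeDecodeError.
print(to_unicode(b"caf\xe9"))
```

The trade-off is that a replaced name is only suitable for display or logging; the original bytes are still needed to open the file.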

So where are we today? We've fixed a bunch of UnicodeDecodeErrors throughout duplicity [1]. I don't think we've fixed 100% of them, but I do think we've hit the majority of the use cases by now.

This generic bug might not be super useful anymore. It might be better to close this? And keep using separate bugs for each specific instance of a decode error.

[1] https://bugs.launchpad.net/duplicity/+bugs?field.searchtext=ordinal+not+in+range&orderby=-status&search=Search&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=FIXRELEASED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE
