[master] bzr can't handle files with filenames that can't be decoded to unicode

Bug #63324 reported by Mildred
This bug affects 7 people
Affects        Status     Importance  Assigned to  Milestone
Bazaar         Confirmed  Medium      Unassigned
Breezy         Triaged    Medium      Unassigned
bzr (Ubuntu)   Invalid    Undecided   Unassigned

Bug Description

Bazaar treats file names as Unicode, not as byte strings. If you have files in the working tree whose name is not valid in the filesystem encoding, they cause problems.
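
The failure mode described above can be illustrated with a minimal Python 3 sketch. This is not Bazaar's own code: the helper `decode_filename` and the sample names are hypothetical, and UTF-8 is assumed as the filesystem encoding. On POSIX systems a file name is a byte string, and strict decoding fails when those bytes are not valid in the assumed encoding.

```python
def decode_filename(raw_name, fs_encoding="utf-8"):
    """Return (text, ok); ok is False when raw_name is not valid in fs_encoding.

    Hypothetical helper for illustration only.
    """
    try:
        return raw_name.decode(fs_encoding), True
    except UnicodeDecodeError:
        # surrogateescape preserves the undecodable bytes as lone surrogates,
        # so the original byte string can still be recovered later.
        return raw_name.decode(fs_encoding, "surrogateescape"), False


# The same visible name, stored on disk in two different encodings.
utf8_name = "Mes vid\u00e9os".encode("utf-8")
latin1_name = "Mes vid\u00e9os".encode("latin-1")

text_ok, ok = decode_filename(utf8_name)
text_bad, bad_ok = decode_filename(latin1_name)

# Encoding with the same error handler restores the original bytes.
round_trip = text_bad.encode("utf-8", "surrogateescape")
```

The Latin-1 name cannot be decoded strictly as UTF-8, which is the kind of name a tool that stores paths as Unicode text has no direct way to represent.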

----

I am trying to use Bazaar to track changes in a Wine directory. I have problems with some folder names when I try to add them.

$ bzr status
added:
  drive_c/windows/profiles/mildred/Mes vidéos@
$ bzr ci -m 'try to add Mes vidéos'
bzr: ERROR: exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 61: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line 612, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line 574, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line 283, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.4/site-packages/bzrlib/builtins.py", line 1757, in run
    reporter=reporter)
  File "/usr/lib/python2.4/site-packages/bzrlib/decorators.py", line 51, in write_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.4/site-packages/bzrlib/workingtree.py", line 538, in commit
    committed_id = Commit().commit( working_tree=self, revprops=revprops,
  File "/usr/lib/python2.4/site-packages/bzrlib/commit.py", line 287, in commit
    self._populate_new_inv()
  File "/usr/lib/python2.4/site-packages/bzrlib/commit.py", line 519, in _populate_new_inv
    path, self.work_tree)
  File "/usr/lib/python2.4/site-packages/bzrlib/repository.py", line 2110, in record_entry_contents
    ie.snapshot(self._new_revision_id, path, previous_entries, tree, self)
  File "/usr/lib/python2.4/site-packages/bzrlib/inventory.py", line 415, in snapshot
    self._read_tree_state(path, work_tree)
  File "/usr/lib/python2.4/site-packages/bzrlib/inventory.py", line 798, in _read_tree_state
    self.symlink_target = work_tree.get_symlink_target(self.file_id)
  File "/usr/lib/python2.4/site-packages/bzrlib/workingtree.py", line 714, in get_symlink_target
    return os.readlink(self.id2abspath(file_id))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 61: ordinal not in range(128)

bzr 0.10.0 on python 2.4.4.candidate.0 (linux2)
arguments: ['/usr/bin/X11/bzr', 'ci', '-m', 'try to add Mes vid\xc3\xa9os']

** please send this report to <email address hidden>
$ bzr --version
Bazaar (bzr) 0.10.0
Using python interpreter: /usr/bin/python
Using python standard library: /usr/lib/python2.4
Using bzrlib: /usr/lib/python2.4/site-packages/bzrlib

Copyright 2005, 2006 Canonical Development Ltd.
http://bazaar-vcs.org/

bzr comes with ABSOLUTELY NO WARRANTY. bzr is free software, and
you may use, modify and redistribute it under the terms of the GNU
General Public License version 2 or later.

Revision history for this message
Martin Pool (mbp) wrote :

Could you please tell me what the 'locale' command shows or what your $LANG environment variable is set to? I think you're set to an ASCII locale where these characters aren't supported.

With LANG=en_AU.UTF-8 this works for me.
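
Besides the `locale` command, it may help to check what the Python interpreter itself derives from the environment. The following is a small Python 3 sketch using only standard-library calls; the function name `report_encodings` is made up for this example, and it does not reproduce how bzr determines its encodings.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python has picked up from the environment."""
    return {
        # Encoding used to convert between file names and str.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding suggested by the current locale settings.
        "preferred": locale.getpreferredencoding(False),
        # Encoding of the terminal stream, if it has one.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


info = report_encodings()
for key, value in sorted(info.items()):
    print(f"{key}: {value}")
```

If any of these report an ASCII-only encoding while the file names on disk contain non-ASCII characters, errors like the one in the traceback become plausible.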

Mildred (mildred) wrote :

Here is the result:

$ locale
LANG=fr_FR.UTF-8
LANGUAGE=fr_FR:en_US:en_GB:en
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=

John A Meinel (jameinel) wrote :

I believe the problem is that Cygwin only supports ASCII encoding. It doesn't understand non-ASCII filenames.

So if you have non-ASCII filenames, you need to use the native win32 bzr, not Cygwin.

I think it would be useful if we had a better way of handling errors like this, so I'm not closing the bug just yet.

Changed in bzr:
importance: Undecided → Medium
status: Unconfirmed → Confirmed
PresuntoRJ (fabio-tleitao) wrote :

I have a similar problem here, except that I was running bzr add in a newly initialized Bazaar directory created from an existing CVS checkout under Ubuntu:

leitao@arm7:~/Documents/suporte$ bzr add
added CVS
added misc
added CVS/Entries
added CVS/Repository
added CVS/Root
added misc/CVS
added misc/docs
added misc/CVS/Entries
added misc/CVS/Repository
added misc/CVS/Root
added misc/docs/CVS
added misc/docs/build
added misc/docs/desenvolvimento
bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 17: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 834, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 790, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.5/site-packages/bzrlib/builtins.py", line 384, in run
    no_recurse, action=action, save=not dry_run)
  File "/usr/lib/python2.5/site-packages/bzrlib/mutabletree.py", line 51, in tree_write_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/mutabletree.py", line 400, in smart_add
    for subf in sorted(os.listdir(abspath)):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 17: ordinal not in range(128)

bzr 1.2.0.candidate.1 on python 2.5.2.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'add']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'pt_BR.UTF-8'
plugins:
  atomlog /home/leitao/.bazaar/plugins/atomlog [unknown]
  automv /home/leitao/.bazaar/plugins/automv [unknown]
  builddeb /usr/lib/python2.5/site-packages/bzrlib/plugins/builddeb [0.92.0dev0]
  bzr_lastlog /home/leitao/.bazaar/plugins/bzr_lastlog [unknown]
  bzrtools /home/leitao/.bazaar/plugins/bzrtools [1.1.0]
  diffstat /home/leitao/.bazaar/plugins/diffstat [unknown]
  email /home/leitao/.bazaar/plugins/email [unknown]
  gtk /usr/lib/python2.5/site-packages/bzrlib/plugins/gtk [0.93.0]
  heads /home/leitao/.bazaar/plugins/heads [unknown]
  htmllog /home/leitao/.bazaar/plugins/htmllog [unknown]
  launchpad /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]
  lessdiff /home/leitao/.bazaar/plugins/lessdiff [unknown]
  lesslog /home/leitao/.bazaar/plugins/lesslog [unknown]
  rebase /usr/lib/python2.5/site-packages/bzrlib/plugins/rebase [0.3.0]
  stats /home/leitao/.bazaar/plugins/stats [unknown]
  svn /usr/lib/python2.5/site-packages/bzrlib/plugins/svn [0.4.7]
  trunk /home/leitao/.bazaar/plugins/trunk [0.91.0]
  vimdiff /home/leitao/.bazaar/plugins/vimdiff [unknown]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.

Here is my locale result:
leitao@arm7:~/Docume...
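
The traceback in this comment ends in `sorted(os.listdir(abspath))`. In Python 2, `os.listdir` called with a unicode path returns any name it cannot decode as a plain byte string, so the resulting list can mix both types, and comparing them during sorting triggers an implicit ASCII decode. As a rough Python 3 sketch of one way to avoid that (not what bzr does; `split_by_decodability` is a hypothetical helper and UTF-8 is an assumed encoding), the directory can be listed as bytes and each name decoded explicitly:

```python
import os
import tempfile


def split_by_decodability(directory, encoding="utf-8"):
    """List a directory as bytes and separate decodable names from the rest.

    Returns (decoded_names, undecodable_raw_names). Hypothetical helper.
    """
    decoded, undecodable = [], []
    # Passing a bytes path makes os.listdir return bytes names, so no
    # decoding happens implicitly and the sort compares like with like.
    for raw in sorted(os.listdir(os.fsencode(directory))):
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            undecodable.append(raw)
    return decoded, undecodable


# Demonstration on a temporary directory with plain ASCII names.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "build"), "w").close()
    open(os.path.join(tmp, "docs"), "w").close()
    names, skipped = split_by_decodability(tmp)
```

Names that end up in the second list could then be reported to the user with a clear message instead of aborting with an internal error.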


Martin Pool (mbp) wrote :

Upstream bug

Changed in bzr:
status: New → Invalid
Martin Pool (mbp)
summary: - exceptions.UnicodeEncodeError: 'ascii' codec can't encode character
+ [master] bzr can't handle files with filenames that can't be decoded to
+ unicode
description: updated
tags: added: unicode
Martin Pool (mbp)
tags: added: duplicate-master
Jelmer Vernooij (jelmer)
Changed in brz:
status: New → Triaged
importance: Undecided → Medium