adding iso-8859-1 paths on a UTF8 file system gives a harsh error message

Bug #186005 reported by Robert Collins
Affects   Status   Importance   Assigned to   Milestone
Bazaar    New      Undecided    Unassigned

Bug Description

While playing around with bzr to version stuff from the Debian archive,
we found a nasty traceback:

  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/builtins.py", line 384, in run
    no_recurse, action=action, save=not dry_run)
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/mutabletree.py", line 51, in tree_write_locked
    return unbound(self, *args, **kwargs)
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/mutabletree.py", line 400, in smart_add
    for subf in sorted(os.listdir(abspath)):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128)

What is happening here is:
(Pdb) abspath
u'/home/robertc/source/baz/unicode-decode-errors/t/aspell-is-0.51-0/'
(Pdb) os.listdir(abspath)
['\xedslenska.alias', u'.bzr']
(Pdb) sorted(os.listdir(abspath))
*** UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128)

That is: the path that is in iso8859-1 encoding on disk is not upcast to
unicode by os.listdir (as it fails to decode with utf8), so it comes back
as a plain byte string next to the unicode u'.bzr'. When sorted() compares
the two, Python tries to upcast the byte string to unicode again, this
time implicitly with the default ascii codec, which fails, and we get
this lovely exception.
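
As a rough illustration of the decode steps involved, here is a small sketch. It assumes Python 3, where the decode has to be spelled out (the session above is Python 2, where the ascii decode happens implicitly during comparison). try_decode is a hypothetical helper for this example only, not anything in bzrlib.

```python
# Byte-string file name as returned by os.listdir in the pdb session above.
name = b'\xedslenska.alias'


def try_decode(raw, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return str(exc)


# ascii: fails on byte 0xed, the same failure sorted() runs into.
ascii_result = try_decode(name, 'ascii')
# utf-8: also fails, which is why os.listdir left the name as bytes.
utf8_result = try_decode(name, 'utf-8')
# iso-8859-1: succeeds, since that is the encoding actually used on disk.
latin1_result = try_decode(name, 'iso-8859-1')
```

Only the iso-8859-1 decode yields a usable name; the other two results hold the codec's error message.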

We should handle this more cleanly in some manner. (Preferably telling
the user they have a non-$filesystem-encoding file name on disk so they
can correct it and run again).
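
One possible shape for such a check, as a minimal sketch only: it assumes Python 3 and byte-string names (for example from os.listdir on a bytes path), and split_undecodable and describe_bad_names are hypothetical names, not bzrlib functions. It is not a proposed patch.

```python
import sys


def split_undecodable(raw_names, encoding=None):
    """Split byte-string file names into (sorted decoded names, undecodable names)."""
    if encoding is None:
        # Assumption: the filesystem encoding is what names should decode with.
        encoding = sys.getfilesystemencoding()
    decoded, undecodable = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            undecodable.append(raw)
    # Sorting is safe here because every remaining name is text.
    return sorted(decoded), undecodable


def describe_bad_names(undecodable, encoding):
    """Build one user-facing message per file name that could not be decoded."""
    return [
        "file name %r is not valid %s; rename it and run again" % (raw, encoding)
        for raw in undecodable
    ]


good, bad = split_undecodable([b'\xedslenska.alias', b'.bzr'], 'utf-8')
messages = describe_bad_names(bad, 'utf-8')
```

With the two names from the session above, good contains only '.bzr' and the iso-8859-1 name ends up in bad, so it can be reported instead of raising from inside sorted().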

-Rob

 affects bzr

--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

James Westby (james-w) wrote :

Hi,

This appears to be a duplicate of bug 77657, so I have marked it as such.

Thanks,

James
