bzr status can't handle Unicode filenames in the same directory correctly

Bug #92608 reported by mafix
Affects: Bazaar
Status: Fix Released
Importance: Critical
Assigned to: John A Meinel
Milestone: (none listed)

Bug Description

== Description ==

There was a file called Begründung-Short.odt in the working directory.

This made it impossible to check the status, to commit, or to do anything else =(

The same error was already reported in #56815 and #43689, but not for this case.

== Version Info ==

bzr --version:
Bazaar Version:
Bazaar (bzr) 0.15.0candidate2
Using python interpreter: /usr/bin/python
Using python standard library: /usr/lib/python2.5
Using bzrlib: /usr/lib/python2.5/site-packages/bzrlib
Using bazaar configuration: /home/felix/.bazaar

== Distro and Package ==
Ubuntu Feisty 7.04
bzr_0.15~rc2-0ubuntu1_all.deb

== Error ==

mafix@jep:~$ bzr status
modified:
  Lebenslauf.doc
  Lebenslauf.odt
unknown:
bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 650, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 612, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 304, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 622, in ignore_pipe
    result = func(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/builtins.py", line 201, in run
    to_file=self.outf, short=short, versioned=versioned)
  File "/usr/lib/python2.5/site-packages/bzrlib/status.py", line 165, in show_tree_status
    short_status=False)
  File "/usr/lib/python2.5/site-packages/bzrlib/delta.py", line 196, in show
    show_list(self.unversioned)
  File "/usr/lib/python2.5/site-packages/bzrlib/delta.py", line 128, in show_list
    print >>to_file, '%s %s' % (short_status_letter, path)
  File "codecs.py", line 303, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)

bzr 0.15.0candidate2 on python 2.5.0.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'status']

** please send this report to <email address hidden>

mafix (mafix) wrote :

Could this be a merge error, since it was already fixed in 0.9?

John A Meinel (jameinel) wrote :

This seems like a different bug. It would seem that something is passing a path around as a UTF8 string, rather than passing it as a Unicode string.
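
A minimal sketch of the failure mode described here, written for Python 3 (the report is from Python 2.5, where handing a byte string to a Unicode-aware output stream triggers an implicit ASCII decode). It assumes the status line is a two-character prefix followed by the UTF-8 encoded filename; that prefix is a guess chosen to match "position 6" in the error message, not something the traceback states.

```python
# UTF-8 bytes of the filename from the report; the "ü" encodes as 0xc3 0xbc.
path_bytes = "Begr\u00fcndung-Short.odt".encode("utf8")

# Hypothetical two-character prefix in front of the path (an assumption).
line_bytes = b"  " + path_bytes


def try_ascii_decode(data):
    """Return the decoded text, or the UnicodeDecodeError raised by ASCII decoding."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc


# Treating the UTF-8 bytes as ASCII fails at the first non-ASCII byte.
error = try_ascii_decode(line_bytes)

# Decoding with the codec the bytes were actually written in gives real text.
line_text = line_bytes.decode("utf8")
```

The exception object carries the offending offset in `error.start`, which is a quick way to check which byte of a path tripped the decoder.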

Can you do 'bzr info' on the working tree? I'm guessing you are using WorkingTreeFormat4 (aka dirstate).

And I'm guessing the _iter_changes code to return Unknowns wasn't updated to return Unicode paths.

I'll look quickly.

John A Meinel (jameinel) wrote :

This should be fixed before bzr-0.15.final.

This seems to fix the problem, but we should have a _iter_changes test on unicode unknowns to make sure this doesn't regress.

Felix: Can you test and see if this fixes "bzr status" for you?

=== modified file 'bzrlib/workingtree_4.py'
--- bzrlib/workingtree_4.py 2007-03-13 13:38:34 +0000
+++ bzrlib/workingtree_4.py 2007-03-15 19:59:03 +0000
@@ -2033,9 +2033,10 @@
                 new_executable = bool(
                     stat.S_ISREG(root_dir_info[3].st_mode)
                     and stat.S_IEXEC & root_dir_info[3].st_mode)
-                yield (None, (None, current_root), True, (False, False),
+                current_root_unicode = current_root.decode('utf8')
+                yield (None, (None, current_root_unicode), True, (False, False),
                     (None, None),
-                    (None, splitpath(current_root)[-1]),
+                    (None, splitpath(current_root_unicode)[-1]),
                     (None, root_dir_info[2]), (None, new_executable))
             initial_key = (current_root, '', '')
             block_index, _ = state._find_block_index_from_key(initial_key)
@@ -2225,11 +2226,12 @@
                                     stat.S_ISREG(current_path_info[3].st_mode)
                                     and stat.S_IEXEC & current_path_info[3].st_mode)
                                 if want_unversioned:
-                                    yield (None, (None, current_path_info[0]),
+                                    yield (None,
+                                        (None, utf8_decode(current_path_info[0])[0]),
                                         True,
                                         (False, False),
                                         (None, None),
-                                        (None, current_path_info[1]),
+                                        (None, utf8_decode(current_path_info[1])[0]),
                                         (None, current_path_info[2]),
                                         (None, new_executable))
                             # dont descend into this unversioned path if it is
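
The `[0]` after each `utf8_decode(...)` call in the second hunk suggests a codec-style decoder that returns a `(text, bytes_consumed)` pair. The sketch below shows that calling convention with the standard library's `codecs.getdecoder`; whether bzrlib's `utf8_decode` is defined exactly this way is an assumption, and the code is written for Python 3.

```python
import codecs

# A stateless decoder function: decoder(data) -> (text, number_of_bytes_consumed).
utf8_decode = codecs.getdecoder("utf8")

# The filename from the report as UTF-8 bytes.
raw_path = b"Begr\xc3\xbcndung-Short.odt"

# The full return value is a pair.
text, consumed = utf8_decode(raw_path)

# Taking element [0] keeps only the decoded text, as the patch does.
decoded_path = utf8_decode(raw_path)[0]
```

The first hunk reaches the same result with `current_root.decode('utf8')`, which returns the text directly without the pair.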

Changed in bzr:
assignee: nobody → jameinel
importance: Undecided → High
status: Unconfirmed → Confirmed
John A Meinel (jameinel) wrote :

Actually, it is worse than I thought. There are more code paths that return path sections without decoding them. I'm writing up a fairly complete test for it. But it will take a bit more than just this patch.
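
One way to sketch the kind of check mentioned here: walk every yielded change tuple and report any path component that is still a byte string. The tuple layout (path pair at index 1, name pair at index 5) is inferred from the patch earlier in this report, `find_undecoded_paths` and the sample data are hypothetical, and Python 3's `bytes`/`str` stand in for Python 2's `str`/`unicode`.

```python
def find_undecoded_paths(changes):
    """Return (tuple_index, field_index, value) for every path field that is bytes."""
    path_fields = (1, 5)  # assumed positions of the (old, new) path and name pairs
    problems = []
    for i, change in enumerate(changes):
        for field in path_fields:
            for value in change[field]:
                # None means "no such side"; text is fine; bytes were never decoded.
                if isinstance(value, bytes):
                    problems.append((i, field, value))
    return problems


# Hypothetical sample data: one decoded entry, one still carrying UTF-8 bytes.
good = (None, (None, "Begr\u00fcndung-Short.odt"), True, (False, False),
        (None, None), (None, "Begr\u00fcndung-Short.odt"),
        (None, "file"), (None, False))
bad = (None, (None, b"Begr\xc3\xbcndung-Short.odt"), True, (False, False),
       (None, None), (None, b"Begr\xc3\xbcndung-Short.odt"),
       (None, "file"), (None, False))

problems = find_undecoded_paths([good, bad])
```

A real regression test would feed this helper the output of the actual change iterator on a tree containing a non-ASCII unknown file, rather than hand-built tuples.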

Because this is a strict regression, I'm increasing the priority of this bug.

Changed in bzr:
importance: High → Critical
mafix (mafix) wrote :

Sorry for the late answer.

 * bzr info works without the patch.
 * But I have a problem applying the patch =(

== Error applying the patch ==
Applied in directory: /usr/lib/python2.5/site-packages/bzrlib/

$ patch < bzr.patch
patching file workingtree_4.py
Hunk #1 FAILED at 2033.
Hunk #2 FAILED at 2226.
2 out of 2 hunks FAILED -- saving rejects to file workingtree_4.py.rej

I attached the reject file.

John A Meinel (jameinel) wrote :

Looking at the .rej file, it looks more like you didn't get the patch downloaded properly. Mostly my fault because I pasted it into a text field rather than into an attachment.

However, the fix is also available in the associated branch:
https://code.launchpad.net/~jameinel/bzr/iter_changes_unicode_unknowns

Everything should be working in that branch, though now I'm trying to clean up the fix a bit.

John A Meinel (jameinel)
Changed in bzr:
status: Confirmed → Fix Released