Windows - Linux -- UnicodeDecodeError: 'utf8' codec can't decode byte 0x81 in position 13: unexpected code byte

Bug #363727 reported by Stefan Müller
Affects  Status     Importance  Assigned to  Milestone
Bazaar   Confirmed  Low         Unassigned

Bug Description

I was trying to use Bazaar as a sync tool between Windows and Linux machines. In this case Linux and Windows run on the same machine, and the repository and branches were initially created from Windows. I guess the following problem is somewhat unusual. Here are the steps to reproduce the traceback at the end of this report.

1. Start Windows (I use XP SP2).
2. Create a Bazaar repository. Mine contains a huge number of PDF files (~4 GB).
3. Linux is installed on the same machine. I have mounted the ext3 file system
   so that I can access it from Windows. I created a branch and ran update.
   The files initially committed in step 2 were imported. I used the menu in
   Windows Explorer to do this.
4. I changed a file in the original repository on my NTFS partition.
5. I started Ubuntu 8.04.
6. I ran 'bzr update' from the console in the repository --> Error

I guess Bazaar is not meant to be used for this purpose, or at least I should create the branch from Linux?
Anyway, I wanted to report it here in case it is important.

--------------------------------------------------------------------------------

bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode byte 0x81 in position 13: unexpected code byte

Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 834, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 790, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 800, in ignore_pipe
    result = func(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/builtins.py", line 189, in run
    to_file=self.outf, short=short, versioned=versioned)
  File "/usr/lib/python2.5/site-packages/bzrlib/status.py", line 114, in show_tree_status
    want_unversioned=want_unversioned)
  File "/usr/lib/python2.5/site-packages/bzrlib/tree.py", line 93, in changes_from
    want_unversioned=want_unversioned,
  File "/usr/lib/python2.5/site-packages/bzrlib/decorators.py", line 127, in read_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/tree.py", line 727, in compare
    want_unversioned=want_unversioned)
  File "/usr/lib/python2.5/site-packages/bzrlib/delta.py", line 217, in _compare_trees
    want_unversioned=want_unversioned):
  File "/usr/lib/python2.5/site-packages/bzrlib/workingtree_4.py", line 2418, in iter_changes
    (None, utf8_decode(current_path_info[0])[0]),
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x81 in position 13: unexpected code byte

bzr 1.3.1 on python 2.5.2.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'status']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'de_DE.UTF-8'
plugins:
  launchpad /usr/lib/python2.5/site-packages/bzrlib/plugins/launchpad [unknown]

Tags: win32
Stefan Müller (smu-blackbox) wrote :

OK, I had a look at the log file, and it states that the repository was created with the default OS encoding, which is cp1252 in my case (when running Windows). On Linux it is UTF-8, so I guess this is the cause. If this is correct, is this bug invalid?
Sorry...
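
For illustration, here is a minimal Python 3 sketch of this kind of mismatch. The file name is made up, and cp850 is only an assumption chosen because it maps byte 0x81 to a letter; the report does not establish which code page actually produced the byte. The point is just that a name written as single-byte text can contain 0x81, which is never valid as the first byte of a UTF-8 character, so a strict UTF-8 decode fails at that position.

```python
# Hypothetical file name as raw bytes, as a directory listing might return it.
# The byte 0x81 sits at index 13 and stands in for a non-ASCII letter.
raw_name = b"Unterlagen/Br\x81cke.pdf"


def decode_or_error(data, encoding):
    """Return (text, None) on success or (None, exception) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Strict UTF-8 decoding rejects the lone 0x81 byte.
utf8_text, utf8_error = decode_or_error(raw_name, "utf-8")

# Assumption: cp850 is used here only as an example of a single-byte
# code page in which 0x81 is a printable letter.
cp850_text, cp850_error = decode_or_error(raw_name, "cp850")

if utf8_error is not None:
    print("utf-8 failed at byte index", utf8_error.start)
print("cp850 gives", cp850_text)
```

The same bytes are a perfectly good name under one encoding and undecodable under the other, which matches the shape of the traceback above.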

Robert Collins (lifeless) wrote : Re: [Bug 363727] Re: Windows - Linux -- UnicodeDecodeError: 'utf8' codec can't decode byte 0x81 in position 13: unexpected code byte

On Sun, 2009-04-19 at 17:03 +0000, Stefan Müller wrote:
> Ok I had a look at the log file and it states that the repository was created with the default os encoding - which is cp1252 in my case (when running Windows). In Linux it is UTF-8 so I guess this is the fault. If this is correct this bug is invalid?
> Sorry...

I would certainly expect issues sharing a working tree with two
different filename encodings - those issues should turn up outside just
bzr itself though ;).

You could try overriding your fs encoding on Linux to be the same as it
is on Windows, so that your file paths match.

-Rob
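
Before changing any encoding settings, it may help to find out which paths are affected. The following is a rough Python 3 sketch, not part of bzr: the helper name find_non_utf8_names is hypothetical, and it assumes the tree is walked on the Linux side, where file names are plain byte strings. It walks a directory using byte paths and reports every entry whose name is not valid UTF-8.

```python
import os


def find_non_utf8_names(root):
    """Yield byte paths under root whose last component is not valid UTF-8.

    Hypothetical diagnostic helper; it only reads directory listings and
    never renames or modifies anything.
    """
    root_bytes = os.fsencode(root)
    # Passing bytes to os.walk makes it return names as raw bytes,
    # so no decoding happens before we check them ourselves.
    for dirpath, dirnames, filenames in os.walk(root_bytes):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                yield os.path.join(dirpath, name)
```

Calling list(find_non_utf8_names(".")) inside the working tree would list the offending entries as byte strings, which could then be renamed by hand or used to decide whether overriding the encoding is worthwhile.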

Vincent Ladeuil (vila)
Changed in bzr:
status: New → Confirmed
importance: Undecided → Low
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
tags: removed: check-for-breezy