Bazaar

Parameter is unsupported by the current encoding

Bug #243424 reported by Russ Brown on 2008-06-27

Affects   Status     Importance  Assigned to  Milestone
Bazaar    Confirmed  Medium      Unassigned

Bug Description

While trying to fix bug #56680 for the diffstat plugin, I need to bzr add the file attached to that bug, whose filename contains Unicode characters.

Unfortunately, bzr add crashes with the following backtrace:

Traceback (most recent call last):
  File "/usr/bin/bzr", line 102, in <module>
    exit_val = bzrlib.commands.main(sys.argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 836, in main
    "encoding." % a))
bzrlib.errors.BzrError: Parameter ''Lekt\xfcre'' is unsupported by the current encoding.

It's entirely possible that I just need to tweak something in my setup in order to get this to work, or it might be a bug. Either way, I'll appreciate some help on it. :)

I'm running bash on Ubuntu Hardy. A couple more snippets of information that I was asked to produce in IRC by Verterok:

rbrown@dorian t $ python -c "import sys; print sys.getdefaultencoding()"
ascii
rbrown@dorian t $ python -c "import sys; print sys.getfilesystemencoding()"
UTF-8

John A Meinel (jameinel) wrote on 2008-06-27:

When your encoding is ASCII you can't pass in a non-ascii character, because we have no way of understanding how to interpret the character.

Generally you can just do:

LANG="en_US.UTF-8"
export LANG

And that will set the locale's preferred encoding to UTF-8. (Note that you can use your preferred language instead of en_US; the important part is .UTF-8.)

Note that we don't use sys.getdefaultencoding() for command-line arguments. We use:

import locale
print locale.getpreferredencoding()

The function is "bzrlib.osutils.get_user_encoding()".
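
As a rough illustration of why the locale's encoding matters here, the sketch below decodes one raw byte-string argument and fails with a message modeled on the one in the traceback. It is written for Python 3 and is not bzrlib's actual code; `decode_argument` and `UnsupportedParameter` are hypothetical names used only for this example.

```python
import locale


class UnsupportedParameter(ValueError):
    """Hypothetical error type; bzrlib raises its own BzrError instead."""


def decode_argument(raw, encoding=None):
    """Decode one raw command-line argument (bytes) to text.

    If no encoding is given, fall back to the locale's preferred
    encoding, which is what the comment above says bzr consults.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # The bytes cannot be interpreted in this encoding at all.
        raise UnsupportedParameter(
            "Parameter %r is unsupported by the current encoding (%s)."
            % (raw, encoding))
```

With this sketch, `decode_argument(b"Lekt\xfcre", "utf-8")` raises the error, while the UTF-8 bytes `b"Lekt\xc3\xbcre"` decode cleanly under UTF-8 and fail under ASCII.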

Russ Brown (pickscrape) wrote on 2008-06-27:

Interestingly, my LANG is already set to en_US.UTF-8, so I'm not sure that is the problem...

John A Meinel (jameinel) wrote on 2008-06-27:

Can you test:

python -c "import bzrlib.osutils; print bzrlib.osutils.get_user_encoding()"

And also:

python -c "import locale; print locale.getpreferredencoding()"

They should be the same.

Looking at your error: ''Lekt\xfcre''

The byte \xfc is never valid in UTF-8; it looks more like Latin-1.

So is the file actually in your filesystem encoding? It certainly looks like it isn't.
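
One way to check this guess is to try decoding the raw filename bytes under each candidate encoding. This is a minimal Python 3 sketch; `is_valid` is a hypothetical helper name.

```python
def is_valid(raw, encoding):
    """Return True if the bytes decode cleanly under the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


name = b"Lekt\xfcre"  # the bytes shown in the error message

print(is_valid(name, "utf-8"))    # False: 0xfc never occurs in valid UTF-8
print(is_valid(name, "latin-1"))  # True: Latin-1 defines every byte value
```

Because Latin-1 accepts any byte sequence, a successful Latin-1 decode only shows the guess is consistent with the bytes; the failed UTF-8 decode is the stronger evidence.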

Russ Brown (pickscrape) wrote on 2008-06-27:

$ python -c "import bzrlib.osutils; print bzrlib.osutils.get_user_encoding()"
UTF-8

$ python -c "import locale; print locale.getpreferredencoding()"
UTF-8

Regarding whether the file is in my filesystem encoding: I have no idea. I'm embarrassed to say that I know very little about how encodings and locales work given that I'm an English speaker and (so far) have only ever developed for other English speakers.

As I say, I only need to do this in order to reproduce and fix bug #56680, which is where the file in question comes from (attached to that bug). If it is the case that bzr is behaving properly here and I just need to tweak something locally in order to be able to do what I want to do, I'm fine with that but would appreciate some advice on what it is I need to tweak. :)

Thanks.

John A Meinel (jameinel) wrote on 2008-06-27:

Well, in Latin-1, '\xfc' == ü
So I'm guessing the filename in question is:
Lektüre

So what you need to do is rename the 'latin-1' form 'Lekt\xfcre' to the 'utf-8' form: 'Lekt\xc3\xbcre'

You could probably do that with:

python -c "import os; os.rename('Lekt\xfcre', 'Lekt\xc3\xbcre')"

At that point, it should be in the right filesystem encoding so that we can understand the filename correctly.
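
The same rename can be sketched for a whole directory. The Python 3 version below works on byte-string names so that nothing has to decode them implicitly. `reencode_name` and `rename_latin1_to_utf8` are hypothetical helpers, and treating every non-UTF-8 name as Latin-1 is an assumption that should be verified per file.

```python
import os


def reencode_name(raw, source="latin-1", target="utf-8"):
    """Re-encode a filename's bytes from one encoding to another."""
    return raw.decode(source).encode(target)


def rename_latin1_to_utf8(directory):
    """Rename entries in `directory` whose names are not valid UTF-8.

    Assumes such names are Latin-1; returns the (old, new) byte pairs.
    """
    renamed = []
    dir_bytes = os.fsencode(directory)
    for raw in os.listdir(dir_bytes):  # bytes path in, bytes names out
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
        new = reencode_name(raw)
        os.rename(os.path.join(dir_bytes, raw), os.path.join(dir_bytes, new))
        renamed.append((raw, new))
    return renamed
```

For the file in this report, `reencode_name(b"Lekt\xfcre")` gives `b"Lekt\xc3\xbcre"`, matching the rename above.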

So, out of curiosity, how did you create this file in the first place? Did it come from somewhere else?

Russ Brown (pickscrape) wrote on 2008-06-27:

Thanks for the quick reply John.

I didn't create the file at all. As I said, I got it from the zip attachment (3rd comment down) to bug #56680, which was posted to help reproduce that bug. In trying to use it to reproduce that bug, I encountered this problem. :)

The python line you gave did indeed rename the file and allowed bzr to add it. Trouble is I now can't get diffstat to complain about it. :)

Presumably the reporter had no problem using bzr add on the file, and only encountered a problem when using diffstat on it. At this point I'm starting to wonder if it's going to be possible for me to solve this without having my filesystem in the same encoding as the reporter's...

Jelmer Vernooij (jelmer) on 2009-06-19

tags:	added: encoding
Changed in bzr:
importance:	Undecided → Medium
status:	New → Triaged

Martin Pool (mbp) on 2010-03-18

Changed in bzr:
status:	Triaged → Confirmed

Jelmer Vernooij (jelmer) on 2017-11-08

tags:	added: check-for-breezy

Jelmer Vernooij (jelmer) on 2019-06-08

tags:	removed: check-for-breezy
