Error when inputting UTF8 CJK characters

Bug #264587 reported by ngU khO
Affects | Status | Importance | Assigned to | Milestone
IPython | Confirmed | Undecided | Unassigned |

Bug Description

I use a UTF-8 locale on my system, and it has worked well for years. But when I input CJK characters into the IPython console in gnome-terminal via scim (an input method platform), some of the characters come out corrupted in the console.

An example is the CJK character '选' ('\u9009'), which should be encoded as '\xe9\x80\x89' in UTF-8. However, typing this character into IPython always fails. The character is displayed as several spaces followed by two replacement characters (the � character, a question mark inside a diamond):

In [1]: s = raw_input()
    �� (I was actually inputting '选', however this character could be displayed correctly)

The string that is read back is also not the one I input (the first byte of the original string becomes several spaces):

In [3]: s
Out[3]: ' \x80\x89'
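
For reference, the byte values above can be checked outside IPython. The following is a minimal sketch, written for Python 3 (the sessions in this report are Python 2, hence raw_input), and the variable names are only illustrative. It assumes the corrupted string is exactly what Out[3] shows, with a single space in place of the lead byte. It shows that the expected encoding is three bytes and that the two surviving bytes are continuation bytes with no lead byte, so a UTF-8 decoder can only render each of them as U+FFFD, which matches the two � characters seen in the terminal.

```python
# Python 3 sketch; the original report used Python 2.

# Expected UTF-8 encoding of U+9009.
expected = "\u9009".encode("utf-8")
print(expected)  # b'\xe9\x80\x89'

# Bytes as shown in Out[3] (assumption: one space replaces the lead byte 0xE9).
corrupted = b" \x80\x89"

# 0x80 and 0x89 are continuation bytes; without their lead byte they are
# invalid, so decoding with errors="replace" yields one U+FFFD for each.
shown = corrupted.decode("utf-8", errors="replace")
print(ascii(shown))  # ' \ufffd\ufffd'
```

This only illustrates why the terminal shows two replacement characters; it does not identify where in IPython the lead byte is lost.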

Moreover, assigning a string that contains such characters may cause IPython to exit:

In [1]: s = ' ��'
WARNING:
********
You or a %run:ed script called sys.stdin.close() or sys.stdout.close()!
Exiting IPython!

None of this happens in the standard Python console (/usr/bin/python). It should not be a scim problem either, since the same thing happens when I paste the character from the clipboard instead of typing it.

A screenshot of the problem is attached.

Tags: unicode
ngU khO (ngu-kho) wrote :

Sorry, I missed a 'not' in the line after 'In [1]'. The sentence in the parentheses should read:
(I was actually inputting '选', however this character could *NOT* be displayed correctly)

Changed in ipython:
status: New → Confirmed
tags: added: unicode