[1.1.0] Windows: print u'2byte character' (unicode, utf-8) does not work

Bug #1379722 reported by RaiMan
This bug affects 2 people
Affects  Status       Importance  Assigned to  Milestone
SikuliX  In Progress  High        RaiMan

Bug Description

********* For more information, see comment #1

------------------------------------------

This is the sample code that causes the problem.
--------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
reload(sys)
import codecs

sys.stdin = codecs.getreader('utf-8')(sys.stdin)
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

print unicode("漢字","utf-8")
--------
"漢字" consists of multi-byte characters (3 bytes each in UTF-8).
No error occurs.
However, the text is not displayed correctly.

First, the Sikuli IDE is started.
First execution :
  漢字
  >>In the first execution, it is displayed normally.
Second execution :
  æ¼¢å­—
  >>In the second execution, the display is wrong.
Third execution :
  Ã¦Â¼Â¢Ã¥Â­Â—
  >>In the third execution, the display is garbled even further.
Fourth execution :
  ÃƒÂ¦Ã‚¼Â¢Ã¥Â­Â—
Fifth execution :
  ÃƒÂƒÃ‚¦Ã‚¼Â¢Ã¥Â­Â—
Sixth execution :
  ÃƒÂƒÃ‚ƒÃ‚¦Ã‚¼Â¢Ã¥Â­Â—
Seventh execution :
  ÃƒÂƒÃ‚ƒÃ‚ƒÃ‚¦Ã‚¼Â¢Ã¥Â­Â—

The output becomes more garbled with every execution.
May I know how to resolve this issue? Thanks!
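
Editor's sketch: the second-run output above is consistent with the six UTF-8 bytes of "漢字" (E6 BC A2 E5 AD 97) being shown as single-byte characters. The snippet below simulates one possible mechanism, assuming that each rerun adds one more "encode as UTF-8, then treat the bytes as Latin-1 characters" layer. The helper names garble and ungarble are hypothetical, and the listing above may not match this model character for character.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers; they only simulate the assumed mechanism.

def garble(text, rounds):
    """Apply `rounds` layers of: encode as UTF-8, mis-read the bytes as Latin-1."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode("latin-1")
    return text


def ungarble(text, rounds):
    """Undo `rounds` layers produced by garble()."""
    for _ in range(rounds):
        text = text.encode("latin-1").decode("utf-8")
    return text


original = u"漢字"
# Character count per number of layers: 2 for the clean text, then 6, 12, 24.
lengths = [len(garble(original, n)) for n in range(4)]
```

Under this assumption the text doubles in length with each additional layer once it has been garbled, which fits the steadily growing output reported above.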

RaiMan (raimund-hocke) wrote :

I can confirm this problem, but only on Windows (tested with version 1.1.0, which contains Jython 2.7b2, and with Java 7 and 8).

- Using the codecs module leads to the weird behavior described above.
This seems to be a Jython problem that occurs when the same interpreter instance is used again (as is the case in the Sikuli IDE when rerunning a script in the same IDE session).
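
Editor's sketch: if the interpreter instance really is reused, every rerun of the sample script wraps sys.stdout in one more UTF-8 writer. A possible workaround, untested in the Sikuli IDE, is to wrap only when the stream is not already a codecs writer. The function name wrap_stdout_once is hypothetical.

```python
import codecs
import sys


def wrap_stdout_once(stream=None):
    """Return a UTF-8 writer for `stream`, reusing an existing wrapper.

    Assumption: the earlier wrapper is still installed as sys.stdout
    when the script runs again in the same interpreter instance.
    """
    if stream is None:
        stream = sys.stdout
    if isinstance(stream, codecs.StreamWriter):
        return stream  # already wrapped by a previous run
    return codecs.getwriter("utf-8")(stream)
```

In the sample script this would replace the direct assignment, i.e. sys.stdout = wrap_stdout_once(), so that a second run leaves the existing writer in place instead of stacking another one on top.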

--- Running the test script below on Windows:
- the popups show the expected output
- on the command line, the Java println prints a ? for each Unicode character in the unicode case
- the plain Python print fails on Unicode characters with a decoding error

This is the test script I used:

import codecs
import java.lang.System as JS
#sys.stdin = codecs.getreader('utf-8')(sys.stdin)
#sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
text = "字"
textu = unicd(text) # a wrapper for unicode(text, "utf-8")
# instead one can use: text = u"字"
JS.out.println("plainJ: " + text)
JS.out.println("unicodeJ: " + textu)
popup("plain: " + text)
popup("unicode: " + textu)
print "plain:", text
print "unicode:"
print unicode(text,"utf-8")

--- On Mac, I get this output:
plainJ: 字
unicodeJ: 字
plain: 字
unicode:

with the popups showing the expected output.

Changed in sikuli:
status: New → In Progress
importance: Undecided → High
assignee: nobody → RaiMan (raimund-hocke)
milestone: none → 1.1.0
description: updated