SikuliX

[1.1.0] Windows: print u'2byte character' (unicode, utf-8) does not work

Bug #1379722 reported by RaiMan on 2014-10-10

This bug affects 2 people

Affects    Status         Importance    Assigned to    Milestone
SikuliX    In Progress    High          RaiMan         SikuliX 1.1.0 "SikuliX"

Bug Description

********* For more information, see comment #1

------------------------------------------

This is the sample code I am having trouble with.
--------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
reload(sys)
import codecs

sys.stdin = codecs.getreader('utf-8')(sys.stdin)
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

print unicode("漢字","utf-8")
--------
"漢字" consists of 2-byte (multi-byte) characters.
No error occurs.
However, the text is not displayed correctly.

First, the Sikuli IDE is started.
First execution :
  漢字
  >>In the first execution, the text is displayed correctly.
Second execution :
  æ¼¢å
  >>In the second execution, the display is wrong.
Third execution :
  Ã¦Â¼Â¢Ã¥ÂÂ
  >>In the third execution, the display is garbled even further.
Fourth execution :
  ÃÂ¦ÃÂ¼ÃÂ¢ÃÂ¥ÃÂÃÂ
Fifth execution :
  ÃÂÃÂ¦ÃÂÃÂ¼ÃÂÃÂ¢ÃÂÃÂ¥ÃÂÃÂÃÂÃÂ
Sixth execution :
  ÃÂÃÂÃÂÃÂ¦ÃÂÃÂÃÂÃÂ¼ÃÂÃÂÃÂÃÂ¢ÃÂÃÂÃÂÃÂ¥ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ
Seventh execution :
  ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ¦ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ¼ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ¢ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ¥ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ

The output becomes more garbled with every run.
How can I resolve this issue? Thanks!
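
The way the output grows from run to run is consistent with the text being UTF-8 encoded one more time on each execution, with the resulting bytes then treated as single characters. This is an assumption about the mechanism, not a confirmed diagnosis. A minimal sketch in plain Python (no Sikuli or Jython required; the function names are made up for illustration) that reproduces the same doubling pattern:

```python
# -*- coding: utf-8 -*-
# Illustration only: simulates repeated UTF-8 encoding where the bytes
# are read back as Latin-1 characters. Not taken from SikuliX or Jython.


def misdecode_once(text):
    """Encode text as UTF-8, then wrongly read the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def simulate_runs(text, runs):
    """Return the text as it would appear on each of `runs` executions."""
    outputs = [text]
    for _ in range(runs - 1):
        outputs.append(misdecode_once(outputs[-1]))
    return outputs


outputs = simulate_runs(u"\u6f22\u5b57", 4)  # u"\u6f22\u5b57" is "漢字"
lengths = [len(s) for s in outputs]
print(lengths)  # [2, 6, 12, 24]
```

The character count goes from 2 to 6 and then doubles on each further run, which matches the lengths of the garbled lines reported above.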

RaiMan (raimund-hocke) wrote on 2014-10-10:

I can confirm this problem, but only on Windows (tested with version 1.1.0, which contains Jython 2.7b2, and with Java 7 and 8).

- Using the codecs module leads to the weird behavior described above.
This seems to be a Jython problem that occurs when the same interpreter instance is used again (as is the case in the Sikuli IDE when rerunning a script in the same IDE session).
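
If the cause is indeed that the same interpreter instance is reused, then every rerun of the sample script would wrap the already wrapped sys.stdout in another codecs writer. A possible guard, sketched under that assumption, is to wrap a stream only when it is not already a codecs.StreamWriter. The helper name wrap_utf8_once is hypothetical, and the demonstration uses an in-memory byte stream rather than sys.stdout:

```python
import codecs
import io


def wrap_utf8_once(stream):
    """Return a UTF-8 writer for stream without stacking a second wrapper.

    Hypothetical helper: it assumes an earlier run left its
    codecs.StreamWriter in place in the same interpreter.
    """
    if isinstance(stream, codecs.StreamWriter):
        return stream  # already wrapped by a previous run
    return codecs.getwriter("utf-8")(stream)


# Demonstration on an in-memory byte stream instead of sys.stdout.
raw = io.BytesIO()
first = wrap_utf8_once(raw)
second = wrap_utf8_once(first)  # simulates rerunning the script
second.write(u"\u6f22\u5b57")   # encoded exactly once

# In a script this would read: sys.stdout = wrap_utf8_once(sys.stdout)
```

Whether this avoids the problem in the Sikuli IDE on Windows has not been verified here.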

--- Running the test script below on Windows:
- the popups show the expected output
- on the command line, Java's println prints a ? for each unicode character in the unicode case
- the plain Python print fails on unicode characters with a decoding error

This is the test script I used:

import codecs
import java.lang.System as JS
#sys.stdin = codecs.getreader('utf-8')(sys.stdin)
#sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
text = "字"
textu = unicd(text) # a wrapper for unicode(text, "utf-8")
# instead one can use: text = u"字"
JS.out.println("plainJ: " + text)
JS.out.println("unicodeJ: " + textu)
popup("plain: " + text)
popup("unicode: " + textu)
print "plain:", text
print "unicode:"
print unicode(text,"utf-8")

--- On Mac, I get this output:
plainJ: å
unicodeJ: 字
plain: å
unicode:
字

The popups show the expected output.

Changed in sikuli:
status:	New → In Progress
importance:	Undecided → High
assignee:	nobody → RaiMan (raimund-hocke)
milestone:	none → 1.1.0
description:	updated