SikuliX

Jython scripting: UnicodeEncodeError with OCR --- not a bug, it's a feature

Bug #1891848 reported by Michael Böhm on 2020-08-17

This bug affects 1 person

Affects: SikuliX
Status: Confirmed
Importance: Undecided
Assigned to: RaiMan
Milestone: SikuliX 2.0.5 "SikuliX"

Bug Description

Hi!

I am using SikuliX 2.0.4 on Windows 10 64-bit with Java 11.

Sometimes, when doing OCR with Reg.text(), the following error occurs with the resulting string:

[error] UnicodeEncodeError ( 'ascii' codec can't encode character u'\u201a' in position 0: ordinal not in range(128) )
[error] --- Traceback --- error source first

My workaround is to use this function to correct the string:

def ExtractAlphanumeric(InputString):
    # Keep only ASCII letters, digits, space, parentheses and asterisk.
    from string import ascii_letters, digits
    allowed = ascii_letters + digits + " ()*"
    return "".join([ch for ch in InputString if ch in allowed])

I still have the feeling that the error should not occur in the first place.
Regards
Michael

Revision history for this message

RaiMan (raimund-hocke) wrote on 2020-08-25:

The OCR/text feature indeed returns a unicode string.

This causes no problems with string operations (as you can see from your solution).

The only known problem comes up with the print statement, which throws this error if the string contains non-ASCII characters.

Instead, SikuliX provides a uprint() function that can be used in such cases. It accepts comma-separated parameters.
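
For illustration, here is a minimal sketch of why the error appears and of one way to make a string safe for ASCII-only output. It is plain Python and does not use SikuliX: ascii_safe is a hypothetical helper name, and ocr_result is a made-up stand-in for a text() result containing the character from the report.

```python
# -*- coding: utf-8 -*-
# Sketch only: ascii_safe is a hypothetical helper, not a SikuliX function.

def ascii_safe(text):
    """Replace every non-ASCII character in text with '?'."""
    return text.encode("ascii", "replace").decode("ascii")

# Made-up stand-in for an OCR result containing U+201A, as in the report.
ocr_result = u"\u201aTotal (3)*"

# Strict ASCII encoding raises the same error class shown in the report.
try:
    ocr_result.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(strict_ok)               # False
print(ascii_safe(ocr_result))  # ?Total (3)*
```

Unlike the filtering workaround in the report, this keeps a visible placeholder for each character that cannot be represented. Inside a SikuliX script, uprint() as described above avoids the conversion altogether.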

Changed in sikuli:
status:	New → Confirmed
milestone:	none → 2.0.5
assignee:	nobody → RaiMan (raimund-hocke)

RaiMan (raimund-hocke) on 2020-08-25

summary:

- UnicodeEncodeError with OCR
+ Jython scripting: UnicodeEncodeError with OCR --- not a bug, it's a
+ feature
