Jython scripting: UnicodeEncodeError with OCR --- not a bug, it's a feature

Bug #1891848 reported by Michael Böhm on 2020-08-17
This bug affects 1 person
Affects: Sikuli
Importance: Undecided
Assigned to: RaiMan

Bug Description

Hi!

I am using SikuliX 2.0.4 on Windows 10 64-bit, Java 11.

Sometimes, when doing OCR with Reg.text(), the following error occurs with the resulting string:

[error] UnicodeEncodeError ( 'ascii' codec can't encode character u'\u201a' in position 0: ordinal not in range(128) )
[error] --- Traceback --- error source first

My workaround is to use this function to correct the string:

def ExtractAlphanumeric(InputString):
    # Keep only ASCII letters, digits, space, parentheses and '*'; drop everything else.
    from string import ascii_letters, digits
    allowed = ascii_letters + digits + " ()*"
    return "".join([ch for ch in InputString if ch in allowed])

I still have the feeling that the error should not occur in the first place.
Regards
Michael

RaiMan (raimund-hocke) wrote :

The OCR/text feature indeed returns a unicode string.

This does not cause problems with string operations (as you can see with your solution).

The only known problem comes up with the print statement, which throws this error if the string contains non-ASCII characters.
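
The behaviour can be reproduced in plain Python, without SikuliX. The following is a minimal sketch, assuming an OCR result like the made-up sample string below (the text and variable names are illustrative only): string operations on the unicode value succeed, while an explicit ASCII encode, which is presumably what print attempts in this setup, raises the same error.

```python
# Hypothetical OCR result starting with U+201A (single low-9 quotation mark).
ocr_text = u"\u201aHello (1)*"

# Ordinary string operations work fine on the unicode string.
stripped = ocr_text.replace(u"\u201a", u"")
upper = ocr_text.upper()

# Encoding to ASCII fails, mirroring the error quoted in the report.
try:
    ocr_text.encode("ascii")
    ascii_failed = False
    failure_reason = ""
except UnicodeEncodeError as exc:
    ascii_failed = True
    failure_reason = str(exc)
```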

SikuliX provides a uprint() function that can be used instead in such cases. It accepts comma-separated parameters.
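
Where uprint() is not an option (for example when the text goes to a log that only accepts ASCII), one possible alternative is to escape the non-ASCII characters before printing. This is only a sketch: to_ascii_safe is a hypothetical helper, not part of SikuliX, and it relies solely on the standard "backslashreplace" error handler of str.encode.

```python
def to_ascii_safe(text):
    # Hypothetical helper (not a SikuliX function): replace every
    # non-ASCII character with a backslash escape so the result
    # contains ASCII characters only and can be printed safely.
    return text.encode("ascii", "backslashreplace").decode("ascii")

# Made-up sample standing in for an OCR result.
safe = to_ascii_safe(u"\u201aHello")
print(safe)
```

Unlike the filtering workaround, this keeps a visible trace of the characters that OCR misread instead of silently dropping them.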

Changed in sikuli:
status: New → Confirmed
milestone: none → 2.0.5
assignee: nobody → RaiMan (raimund-hocke)
RaiMan (raimund-hocke) on 2020-08-25
summary: - UnicodeEncodeError with OCR
+ Jython scripting: UnicodeEncodeError with OCR --- not a bug, it's a
+ feature