[1.0.1] improvements for text recognition

Bug #1224811 reported by RaiMan
This bug affects 3 people
Affects: SikuliX
Status: Fix Released
Importance: Medium
Assigned to: RaiMan

Bug Description

------------ small fonts
... more background info in the related question
Improvement in OCR recognition with small images (thanks to Jose Damian)

Anyway, I think it is an error to use interpolation to enlarge these small images. Given that an image height greater than 30 pixels is needed, the best solution for me has been to enlarge the image by adding pixels to the border:
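---------------
if (in_img.rows < MIN_HEIGHT) {
   // smallest integer factor that brings the height up to MIN_HEIGHT
   scale = ceil(MIN_HEIGHT / float(in_img.rows));
   // pad at the bottom and right only, repeating the edge pixels
   copyMakeBorder(in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
}
---------------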

This solution achieves near perfect recognition with my image samples.
I don't know whether this is the right treatment for all kinds of small images, or whether it could be included in a future release, but at least it is a change that people can try if they have problems recognizing small fonts.
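
For readers who want to try the padding idea without OpenCV, here is a minimal, dependency-free C++ sketch of what the snippet above appears to do: compute an integer factor from the minimum height, then grow the image at the bottom and right by repeating the nearest edge pixel. The `GrayImage` struct and the names `scale_factor` and `pad_replicate` are hypothetical and used only for illustration; this is not the SikuliX implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int rows = 0;
    int cols = 0;
    std::vector<std::uint8_t> data;

    std::uint8_t at(int r, int c) const {
        assert(r >= 0 && r < rows && c >= 0 && c < cols);
        return data[static_cast<std::size_t>(r) * cols + c];
    }
};

// Smallest integer factor that brings `rows` up to at least `min_height`.
int scale_factor(int rows, int min_height) {
    return static_cast<int>(std::ceil(min_height / static_cast<float>(rows)));
}

// Pads at the bottom and right only, repeating the nearest edge pixel.
// This mirrors a replicate border with top = 0 and left = 0.
GrayImage pad_replicate(const GrayImage& in, int min_height) {
    if (in.rows <= 0 || in.cols <= 0 || in.rows >= min_height) {
        return in;  // nothing to do for empty or already tall images
    }
    const int scale = scale_factor(in.rows, min_height);
    GrayImage out;
    out.rows = scale * in.rows;  // in.rows + (scale - 1) * in.rows
    out.cols = scale * in.cols;  // in.cols + (scale - 1) * in.cols
    out.data.resize(static_cast<std::size_t>(out.rows) * out.cols);
    for (int r = 0; r < out.rows; ++r) {
        const int src_r = r < in.rows ? r : in.rows - 1;  // clamp to last row
        for (int c = 0; c < out.cols; ++c) {
            const int src_c = c < in.cols ? c : in.cols - 1;  // clamp to last column
            out.data[static_cast<std::size_t>(r) * out.cols + c] = in.at(src_r, src_c);
        }
    }
    return out;
}
```

The original glyph pixels stay untouched in the top-left corner, so no interpolation blur is introduced; only the canvas grows. Note that the width is padded by the same factor as the height, as in the snippet above.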

Tags: fkt-text
RaiMan (raimund-hocke)
description: updated
Changed in sikuli:
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → RaiMan (raimund-hocke)
milestone: none → 1.1.0
niknah (hankin0) wrote :

Tesseract works better if you enlarge the image.
I think it uses black and white for OCR, not even grayscale. So it gets confused if the gaps between the letters are gray and not black or white.

It'd be useful if a scale argument could be added to text() or to Settings.

Right now, I can't see a way to OCR on an enlarged image. text() only works on a Region. Can it be used with an Image?
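
The enlarge-then-clean-up idea in this comment can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: the `Gray` struct, `upscale_nearest` and `binarize` are hypothetical names, the enlargement repeats pixels (nearest neighbour) rather than using the cubic scaling from the tests below, and the fixed threshold of 128 is an arbitrary midpoint. It does not describe what Tesseract or SikuliX does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row-major grayscale buffer, one byte per pixel.
struct Gray {
    int rows = 0;
    int cols = 0;
    std::vector<std::uint8_t> px;
};

// Enlarges by an integer factor by repeating each pixel (nearest neighbour).
Gray upscale_nearest(const Gray& in, int factor) {
    assert(factor >= 1);
    Gray out;
    out.rows = in.rows * factor;
    out.cols = in.cols * factor;
    out.px.resize(static_cast<std::size_t>(out.rows) * out.cols);
    for (int r = 0; r < out.rows; ++r) {
        for (int c = 0; c < out.cols; ++c) {
            // Each output pixel copies the source pixel it was enlarged from.
            out.px[static_cast<std::size_t>(r) * out.cols + c] =
                in.px[static_cast<std::size_t>(r / factor) * in.cols + c / factor];
        }
    }
    return out;
}

// Maps every pixel to pure black (0) or pure white (255), so that gray
// gaps between letters end up on one side or the other.
// The default threshold is an assumption, not a recommended value.
Gray binarize(const Gray& in, std::uint8_t threshold = 128) {
    Gray out = in;
    for (std::uint8_t& p : out.px) {
        p = static_cast<std::uint8_t>(p >= threshold ? 255 : 0);
    }
    return out;
}
```

A caller would chain the two steps, for example `binarize(upscale_nearest(img, 2))`, and pass the result to the OCR engine.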

Here are some tests with the attached image:
the same image at original size, 2x and 3x, scaled using GIMP (cubic), with the results from Tesseract 3.02.

------Results from tesseract 3.02----

mm lnruwzk hmvm raxmmvm D>l’V nu Wm dafl
um mequlci. hmwn fox lunllkduier u. my an;

u..n-umuuu hvrmn miuumu over uu ht} dug
::»,un. quick hmvm fax ilmlped u... nu lazy dug

um-1.. quick hmmu (ax jumped rner Ilue nu, dug
lfipt The quick brown fox jumped over the lazy dog

ispc The quick brown fox jumped over the lazy dog

32pt The quick brown fox jumped over the lazy dog

Same image resized 2x cubic with gimp...
lllpl Thequick hmwn foxjumped ovtrlhel-.I1_v dog

llpl'l1Ir quick bmwn [ox jumped over lhr lazy dog

12])! The quick hmwn foxjumpcd over the hazy dog
l3pt The quick brown fox jumped over the lazy dog

I-lpt The quick brown fox jumped over the lazy dog
l6pt The quick brown fox jumped over the lazy dog

l8pt The quick brown fox jumped over the lazy (log

32pt The quick brown fox jumped over the lazy dog

Same image resized 3x cubic with gimp...

llilpl The quick brmm foxjumpcd over the lazy dog
Hp! 11:‘ quick brown fox jumped over the Buy dog

I211! The qulek brown foxjumped over the lazy dog
l3pl The quick brown fox jumped over the lazy dog

l-lpt The quick brown fox jumped over the lazy dog

RaiMan (raimund-hocke)
Changed in sikuli:
milestone: 1.1.0 → 1.2.0
RaiMan (raimund-hocke)
tags: added: fkt-text
RaiMan (raimund-hocke)
Changed in sikuli:
status: In Progress → Fix Released