SikuliX

Activity log for bug #710586

Date	Who	What changed	Old value	New value	Message
2011-01-31 10:57:47	RaiMan	bug			added bug
2011-02-02 14:29:14	RaiMan	description	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language.	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. 
If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-02-02 14:29:38	RaiMan	sikuli: status	New	In Progress
2011-02-25 07:55:18	RaiMan	summary	X 1.0rc1: Region.text() -- known problems and needed improvements	X 1.0rc2: Region.text() -- known problems and needed improvements
2011-02-25 07:55:32	RaiMan	sikuli: importance	Undecided	High
2011-04-06 07:19:37	RaiMan	description	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! 
bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-05-05 10:19:12	RaiMan	description	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! 
bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-05-16 06:26:59	RaiMan	description	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! 
bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-05-16 06:27:18	RaiMan	description	******* this report is a summary of known problems The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	******* this report is a summary of known problems and feature requests The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. 
This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-05-20 02:07:27	Landy	bug			added subscriber Landy
2011-06-10 06:17:03	RaiMan	description	******* this report is a summary of known problems and feature requests The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	******* this report is a summary of known problems and feature requests The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. 
This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others bug 795391: [request] OCR/tesseract: allow new training sets for other languages and more tesseract features Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-07-15 14:52:51	bevan dequeker	bug			added subscriber bevan dequeker
2011-07-29 20:03:27	Erek0se	bug			added subscriber Erek0se
2011-08-12 20:49:16	oganer@gmail.com	bug			added subscriber oganer@gmail.com
2011-08-25 18:54:35	Jeff Sant	bug			added subscriber Jeff Sant
2011-09-15 07:30:03	RaiMan	summary	X 1.0rc2: Region.text() -- known problems and needed improvements	X 1.0rc3: Region.text() -- known problems and needed improvements
2011-09-27 10:06:50	RaiMan	sikuli: milestone		x1.0
2011-10-05 06:24:08	RaiMan	description	******* this report is a summary of known problems and feature requests The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others bug 795391: [request] OCR/tesseract: allow new training sets for other languages and more tesseract features Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.	***** this report is a summary of known problems and feature requests * recent status information after release of rc3 see comment #6 The text recognition feature (OCR - Region.text()) together with the possibility to find text in an image is still experimental and under developement. 
This are currently reported bugs: bug 777660: text recognition errors with some fonts bug 783082: [request] want font parameters for text recognition bug 735434: Text extraction from Images fails in some cases on colored backgrounds bug 695616: Inconsistency in text recognition and matching, especially with integers-as-text! bug 695650: find(text).text() does not return same text bug 701005: text() always returns text with trailing x'200A20' bug 701012: text() does not return all intervening blanks, add's others bug 795391: [request] OCR/tesseract: allow new training sets for other languages and more tesseract features Other experienced oddities -- there are problems with text, that is not in english language -- very small and very large fonts may not work -- multiline text makes problems -- intervening/preceding/trailing grafics and symbols are tried to be interpreted as text Tip when using Region.text(): Currently you get the best results, when the region represents only one line of text and only contains text (no graphics/symbols) in english language. If you can influence it: make the text as large as possible. -- additional information: Internally the tesseract OCR engine (http://code.google.com/p/tesseract-ocr/) is used. So their restrictions apply (e.g. minimum size of font, ...). Information can be found on their Wiki.
2011-11-25 11:49:34	RaiMan	bug			added subscriber yogesh joshi
2011-11-25 11:50:23	RaiMan	bug			added subscriber Sebastien Pinel
2012-01-05 07:22:26	Pattama w.	bug			added subscriber Pattama w.
2012-01-18 10:40:27	cheravuth	bug			added subscriber cheravuth
2012-08-02 05:25:07	Mark Weisler	bug			added subscriber Mark Weisler
2012-08-08 07:28:48	Sumit Bisht	bug			added subscriber Sumit Bisht
2012-11-02 11:13:16	RaiMan	sikuli: milestone	x1.0
2012-11-02 11:13:26	RaiMan	sikuli: assignee		RaiMan (raimund-hocke)
2012-11-02 15:13:33	RaiMan	tags		ocr
2013-01-24 18:47:22	cen liu	bug			added subscriber cen liu
2013-02-21 12:13:45	RaiMan	tags	ocr	fkt-text
2013-02-21 14:39:51	RaiMan	sikuli: importance	High	Low
2013-05-06 06:31:24	Bunnings	bug			added subscriber Bunnings
2014-04-15 09:17:24	amr lotfy	bug			added subscriber amr lotfy
2018-03-18 09:02:15	Alexander Pangilinan	bug			added subscriber Alexander Pangilinan