crash with certain tif inputs: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tesseract (Ubuntu) | Confirmed | Undecided | Unassigned | | |
Bug Description
Summary:
wget -O test.tif https:/
Expected results: Run to completion. Actual results: Aborts with an assertion error.
-------
tesseract consistently crashes with the following assertion error:
tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET:
Aborted
...when passed certain files generated by ocrfeeder. Attached is a sample file captured from an ocrfeeder run.
To reproduce, run: tesseract <attached sample tif file> outputfilename
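The reproduction step above can be scripted so the abort is detected automatically instead of read off the terminal. What follows is only a minimal sketch in Python, assuming a POSIX system where a failed assertion ends the process with SIGABRT. The helper name `run_and_classify` and the file names `test.tif` and `outputfilename` are illustrative placeholders, not anything shipped with tesseract or attached to this report.

```python
import signal
import subprocess


def run_and_classify(cmd):
    """Run cmd and return (status, stderr).

    status is one of:
      "ok"        - the process exited with code 0
      "aborted"   - the process was killed by SIGABRT (e.g. a failed assert)
      "failed"    - any other non-zero exit or signal
      "not found" - the executable could not be started
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return "not found", ""

    # On POSIX, subprocess reports death by signal N as returncode == -N.
    if proc.returncode == -signal.SIGABRT:
        return "aborted", proc.stderr
    if proc.returncode != 0:
        return "failed", proc.stderr
    return "ok", proc.stderr


# Example call (placeholder file names, assumed for illustration):
# status, stderr = run_and_classify(["tesseract", "test.tif", "outputfilename"])
# print(status)
# print(stderr, end="")
```

With the sample file from this report, the expectation would be a status of "aborted" and the assertion message in the captured stderr, though that depends on the installed tesseract version.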
ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: tesseract-ocr 2.04-2.1ubuntu1
ProcVersionSign
Uname: Linux 3.0.0-14-generic x86_64
NonfreeKernelMo
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Thu Jan 5 22:32:11 2012
InstallationMedia: Xubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
ProcEnviron:
PATH=(custom, user)
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: tesseract
UpgradeStatus: No upgrade log present (probably fresh install)
Status changed to 'Confirmed' because the bug affects multiple users.