crash with certain tif inputs: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.

Bug #912648 reported by jimav
This bug affects 2 people
Affects: tesseract (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Summary:
wget -O test.tif https://bugs.launchpad.net/ubuntu/+source/tesseract/+bug/912648/+attachment/2659608/+files/test.tif && tesseract test.tif testout

Expected results: Run to completion. Actual results: Aborts with an assertion error.

--------------------------------

tesseract consistently crashes with the following assertion error:

tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
Aborted

...when passed certain files generated by ocrfeeder. Attached is a sample file captured from an ocrfeeder run.

To reproduce, run tesseract <attached sample tif file> outputfilename
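
For scripting the reproduction, here is a minimal sketch that tells the assertion abort apart from other failures. It assumes a shell such as bash or dash, where a process killed by SIGABRT (which a failed assert() raises) reports exit status 134 (128 + 6). The helper names classify_exit and check_tif, and the "_out" output base name, are made up for illustration and are not part of tesseract.

```shell
#!/bin/sh
# Map an exit status to a short label.
# 134 = 128 + 6 (SIGABRT) in bash/dash; other shells may report it differently.
classify_exit() {
    case "$1" in
        0)   echo "ok" ;;
        134) echo "aborted" ;;
        *)   echo "failed" ;;
    esac
}

# Hypothetical helper: run tesseract on one image and print the outcome.
# The output base name (<input without extension>_out) is an arbitrary choice.
check_tif() {
    tesseract "$1" "${1%.*}_out" >/dev/null 2>&1
    classify_exit "$?"
}

# Usage (not run here): check_tif test.tif
```

With the attached sample, check_tif test.tif would be expected to print "aborted" on an affected install; "failed" would point to some other problem, such as tesseract not being installed.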

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: tesseract-ocr 2.04-2.1ubuntu1
ProcVersionSignature: Ubuntu 3.0.0-14.23-generic 3.0.9
Uname: Linux 3.0.0-14-generic x86_64
NonfreeKernelModules: fglrx
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Thu Jan 5 22:32:11 2012
InstallationMedia: Xubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: tesseract
UpgradeStatus: No upgrade log present (probably fresh install)

jimav (james-avera) wrote :
description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in tesseract (Ubuntu):
status: New → Confirmed