tesseract assert failure: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.

Bug #565688 reported by Crashbit on 2010-04-18
This bug affects 34 people
Affects              Status    Importance   Assigned to   Milestone
Tesseract            Unknown   Unknown
tesseract (Ubuntu)             Undecided    Unassigned

Bug Description

Tesseract aborts with an assertion failure when run with the Spanish language data (-l spa).

ignasi@ignasi-desktop:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu lucid (development branch)
Release: 10.04
Codename: lucid
ignasi@ignasi-desktop:~$

ProblemType: Crash
DistroRelease: Ubuntu 10.04
Package: tesseract-ocr 2.04-2
ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
AssertionMessage: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
CheckboxSubmission: be855d426122c5a11956fef117ded5b1
CheckboxSystem: edda5d4f616ca792bf437989cb597002
CrashCounter: 1
Date: Sun Apr 18 02:36:10 2010
ExecutablePath: /usr/bin/tesseract
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta amd64 (20100318)
ProcCmdline: tesseract /tmp/UZBC9Lf9jw/jvkFElmSob.tif /tmp/UZBC9Lf9jw/BE5MYeneZ6 -l spa
ProcEnviron:
 LANG=ca_ES.utf8
 SHELL=/bin/bash
Signal: 6
SourcePackage: tesseract
StacktraceTop:
 raise () from /lib/libc.so.6
 abort () from /lib/libc.so.6
 __assert_fail () from /lib/libc.so.6
 ?? ()
 ?? ()
Title: tesseract assert failure: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
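
For readers unfamiliar with the message: the assertion text suggests that unichar_to_id was asked to look up a character string that is not present in the loaded character set. The following is a toy Python sketch of that pattern under that assumption. It is not Tesseract's code; ToyUnicharset and unichar_to_id_or_none are hypothetical names used only for illustration.

```python
# Hypothetical illustration only: this is not Tesseract's source code.
class ToyUnicharset:
    """Maps character strings to integer ids, like a recognizer's character set."""

    def __init__(self, unichars):
        self._ids = {u: i for i, u in enumerate(unichars)}

    def contains(self, unichar):
        return unichar in self._ids

    def unichar_to_id(self, unichar):
        # Mirrors the shape of the failed assertion: the lookup assumes the
        # character is already known and aborts otherwise.
        assert self.contains(unichar), "unknown unichar: %r" % unichar
        return self._ids[unichar]

    def unichar_to_id_or_none(self, unichar):
        # A guarded variant: report a miss instead of aborting.
        return self._ids.get(unichar)


charset = ToyUnicharset(["a", "b", "c"])
print(charset.unichar_to_id("a"))          # known character: lookup succeeds
print(charset.unichar_to_id_or_none("ñ"))  # unknown character: no abort
```

Calling charset.unichar_to_id("ñ") on this toy set raises AssertionError, which is the Python analogue of the abort (signal 6) recorded above.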

garrison (jim-garrison) wrote :

This bug has also broken ocropus for me. I am running amd64 as well.

$ ocroscript recognize image.jpg
ocroscript: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
Aborted

Nemo157 (ghostunderscore) wrote :

This may be the same bug as the one reported here: http://code.google.com/p/tesseract-ocr/issues/detail?id=265#c0
If so, it is supposed to be fixed in the 3.0 release.

arndtc (arndtc) wrote :

I hope the 3.0 release comes soon. GOCR is not nearly as good an alternative as tesseract.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in tesseract (Ubuntu):
status: New → Confirmed

Luzius Thöny (lucius-antonius) wrote :

Deb packages of 3.0 are available here: http://notesalexp.net/oneiric/main/t/tesseract/

(works for me)

Jeff Breidenbach (jeff-jab) wrote :

Obsolete; Tesseract 3 is shipping with Ubuntu. Please close.

Jeff Breidenbach (jeff-jab) wrote :

Also, Ocropus is no longer shipping.

Changed in tesseract (Ubuntu):
status: Confirmed → Fix Released