tesseract Segmentation fault
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tesseract (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
1) Description: Ubuntu 10.04.1 LTS
Release: 10.04
2) tesseract-ocr:
Installed: 2.04-2
Candidate: 2.04-2
3, 4) I ran tesseract on a TIFF file and got the following on the console:
Tesseract Open Source OCR Engine
Segmentation fault
and no output file was produced.
Running gdb revealed the following:
"
(gdb) run
Starting program: /usr/bin/tesseract 1-59.\ Bond\ Interest.tif out -l eng
[Thread debugging using libthread_db enabled]
Tesseract Open Source OCR Engine
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b9cf18 in JPEGDecodeRaw (tif=<value optimised out>, buf=0xffff0001 <Address 0xffff0001 out of bounds>, cc=<value optimised out>, s=255) at tif_jpeg.c:1027
1027 JSAMPLE *inptr = sp->ds_
(gdb) up
#1 0x00007ffff7bafc06 in TIFFReadScanline (tif=0xa0a5b0, buf=0xa0d0d0, row=<value optimised out>, sample=255) at tif_read.c:106
106 e = (*tif->
(gdb) up
#2 0x000000000044d919 in read_tiff_image (tif=0xa0a5b0, image=0x7ffffff
227 TIFFReadScanlin
(gdb) down
#1 0x00007ffff7bafc06 in TIFFReadScanline (tif=0xa0a5b0, buf=0xa0d0d0, row=<value optimised out>, sample=255) at tif_read.c:106
106 e = (*tif->
(gdb) down
#0 0x00007ffff7b9cf18 in JPEGDecodeRaw (tif=<value optimised out>, buf=0xffff0001 <Address 0xffff0001 out of bounds>, cc=<value optimised out>, s=255) at tif_jpeg.c:1027
1027 JSAMPLE *inptr = sp->ds_
(gdb) print sp
$1 = (JPEGState *) 0xa0af30
(gdb) print sp->scancount
$2 = 0
(gdb) print vsamp
$3 = 65535
(gdb) print ypos
$4 = 13478
(gdb) print ci
$5 = <value optimised out>
(gdb) print ds_buffer
No symbol "ds_buffer" in current context.
(gdb) print sp->ds_buffer
$6 = {0xa10a50, 0xa10ad0, 0xa10b10, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) print sp->scancount*vsamp + ypos
$7 = 13478
(gdb) print sp->ds_buffer[ci]
$8 = (JSAMPARRAY) 0xa10a50
(gdb) print vsamp
$9 = 65535
(gdb) print sp->ds_
$10 = (JSAMPROW) 0x0
(gdb) print nrows
$11 = 16
(gdb) up
#1 0x00007ffff7bafc06 in TIFFReadScanline (tif=0xa0a5b0, buf=0xa0d0d0, row=<value optimised out>, sample=255) at tif_read.c:106
106 e = (*tif->
(gdb) down
#0 0x00007ffff7b9cf18 in JPEGDecodeRaw (tif=<value optimised out>, buf=0xffff0001 <Address 0xffff0001 out of bounds>, cc=<value optimised out>, s=255) at tif_jpeg.c:1027
1027 JSAMPLE *inptr = sp->ds_
(gdb) up
#1 0x00007ffff7bafc06 in TIFFReadScanline (tif=0xa0a5b0, buf=0xa0d0d0, row=<value optimised out>, sample=255) at tif_read.c:106
106 e = (*tif->
(gdb) print buf
$12 = (tdata_t) 0xa0d0d0
(gdb) down
#0 0x00007ffff7b9cf18 in JPEGDecodeRaw (tif=<value optimised out>, buf=0xffff0001 <Address 0xffff0001 out of bounds>, cc=<value optimised out>, s=255) at tif_jpeg.c:1027
1027 JSAMPLE *inptr = sp->ds_
(gdb) print buf
$13 = (tidata_t) 0xffff0001 <Address 0xffff0001 out of bounds>
(gdb) print sp->cinfo.
$14 = 16
"
Could this be related to the comment
/* increment/decrement of buf and cc is still incorrect, but should not matter
* TODO: resolve this */
on line 1086 of tif_jpeg.c?
Running ImageMagick's identify in verbose mode on the same image gives the following:
Format: TIFF (Tagged Image File Format)
Class: DirectClass
Geometry: 2386x3489+0+0
Resolution: 300x300
Print size: 7.95333x11.63
Units: PixelsPerInch
Type: TrueColor
Base type: TrueColor
Endianess: MSB
Colorspace: RGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
red:
min: 0 (0)
max: 255 (1)
mean: 237.852 (0.932754)
standard deviation: 59.4293 (0.233056)
kurtosis: 10.1826
skewness: -3.43737
green:
min: 0 (0)
max: 255 (1)
mean: 237.781 (0.932475)
standard deviation: 59.6937 (0.234093)
kurtosis: 10.1691
skewness: -3.43553
blue:
min: 0 (0)
max: 255 (1)
mean: 237.729 (0.93227)
standard deviation: 59.5565 (0.233555)
kurtosis: 10.1626
skewness: -3.43429
Image statistics:
Overall:
min: 0 (0)
max: 255 (1)
mean: 178.341 (0.699375)
standard deviation: 115.162 (0.451616)
kurtosis: -1.21747
skewness: -0.871093
Rendering intent: Undefined
Interlace: None
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black
Compose: Over
Page geometry: 2386x3489+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Orientation: TopLeft
Properties:
date:create: 2010-08-
date:modify: 2010-08-
jpeg:
signature: 47b586a7ef05b56
tiff:
tiff:
tiff:software: xsane
tiff:timestamp: 2010:08:20 23:02:46
Artifacts:
verbose: true
Tainted: False
Filesize: 707KiB
Number pixels: 7.939MiB
Pixels per second: 39.7MiB
User time: 0.200u
Elapsed time: 0:01.200
Version: ImageMagick 6.5.7-8 2009-11-26 Q16 http://
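The identify output above reports JPEG compression inside a big-endian (MSB) TIFF container, and the crash is in libtiff's JPEG decoder. For triage of similar files, the compression scheme can be read straight from the TIFF header without going through libtiff. The following is a minimal sketch, assuming a classic (non-BigTIFF) file whose tags of interest sit in the first IFD. The helper names `read_tiff_shorts` and `is_jpeg_in_tiff` are hypothetical and not part of libtiff or tesseract. Whether JPEG-in-TIFF is what actually triggers the segfault is not established in this report.

```python
import struct

# Tag and type numbers from the TIFF 6.0 specification.
COMPRESSION_TAG = 259
YCBCR_SUBSAMPLING_TAG = 530
SHORT = 3
# Compression 6 is old-style JPEG, 7 is JPEG as defined in TIFF TechNote 2.
JPEG_COMPRESSIONS = (6, 7)


def read_tiff_shorts(data, wanted_tags):
    """Return {tag: tuple of SHORT values} from the first IFD of a classic TIFF.

    Only SHORT entries with one or two values are handled, because those
    are stored inline in the 4-byte value field of the IFD entry.
    """
    if data[:2] == b"II":
        endian = "<"   # little-endian
    elif data[:2] == b"MM":
        endian = ">"   # big-endian, reported as "MSB" by identify
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    found = {}
    for i in range(n_entries):
        # Each IFD entry is 12 bytes: tag, type, count, value-or-offset.
        entry = ifd_offset + 2 + 12 * i
        tag, typ, count = struct.unpack_from(endian + "HHI", data, entry)
        if tag in wanted_tags and typ == SHORT and count in (1, 2):
            found[tag] = struct.unpack_from(endian + "%dH" % count, data, entry + 8)
    return found


def is_jpeg_in_tiff(data):
    """True if the first IFD declares a JPEG compression scheme."""
    tags = read_tiff_shorts(data, {COMPRESSION_TAG})
    # Compression defaults to 1 (uncompressed) when the tag is absent.
    compression = tags.get(COMPRESSION_TAG, (1,))
    return compression[0] in JPEG_COMPRESSIONS
```

To use it, read the file in binary mode and pass the bytes to `is_jpeg_in_tiff`. Passing `{COMPRESSION_TAG, YCBCR_SUBSAMPLING_TAG}` to `read_tiff_shorts` also returns the chroma subsampling factors when the file declares them, which may be worth comparing against the `vsamp` value printed in the gdb session.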
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: tesseract-ocr 2.04-2
ProcVersionSign
Uname: Linux 2.6.32-24-generic x86_64
Architecture: amd64
Date: Sun Aug 22 00:39:59 2010
ProcEnviron:
PATH=(custom, user)
LANG=en_GB.UTF-8
SHELL=/bin/bash
SourcePackage: tesseract
Changed in tesseract (Ubuntu):
status: Confirmed → Fix Released
Status changed to 'Confirmed' because the bug affects multiple users.