Hocr has corrupted bounding boxes for images

Bug #548801 reported by warpitaly on 2010-03-26
This bug affects 4 people
Affects: Cuneiform for Linux

Bug Description

When using the hOCR output format, the resulting file includes images as ocr_line elements but always sets their bounding boxes to zeros, e.g.:

<span class='ocr_line' id='line_5' title="bbox 0 0 0 0"><img src=bug_files/1.bmp width=756 height=552 alt="bug_files/1.bmp">

Hence, it's not possible to correctly place images on the resulting page.

Command used (image file attached):
cuneiform -l ita -f hocr -o bug.htm Bug.png
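
To check whether a given output file is affected, a small script can scan the hOCR for image lines with an all-zero bounding box. The following is only a minimal sketch in Python and is not part of Cuneiform: the function name find_zero_bbox_images is hypothetical, and the regular expression assumes the markup looks exactly like the line quoted above (an ocr_line span immediately followed by an img tag).

```python
import re

# Assumption: an affected line is an ocr_line span with a "bbox x0 y0 x1 y1"
# title, immediately followed by an <img> tag, as in the snippet in this report.
LINE_WITH_IMG = re.compile(
    r"""<span\s+class=['"]ocr_line['"][^>]*"""
    r"""title=['"]bbox (\d+) (\d+) (\d+) (\d+)['"][^>]*>"""
    r"""\s*<img\s+src=([^\s>]+)""",
    re.IGNORECASE,
)


def find_zero_bbox_images(hocr_text):
    """Return the src of every image whose enclosing ocr_line has bbox 0 0 0 0."""
    hits = []
    for match in LINE_WITH_IMG.finditer(hocr_text):
        bbox = tuple(int(value) for value in match.groups()[:4])
        if bbox == (0, 0, 0, 0):
            # The src attribute may or may not be quoted in the output.
            hits.append(match.group(5).strip("'\""))
    return hits


# The line quoted in this report, used as sample input.
sample = (
    "<span class='ocr_line' id='line_5' title=\"bbox 0 0 0 0\">"
    "<img src=bug_files/1.bmp width=756 height=552 alt=\"bug_files/1.bmp\">"
)
print(find_zero_bbox_images(sample))
```

For real use, read bug.htm from disk and pass its contents to the function. A non-empty result means the file contains images that cannot be positioned on the page.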

warpitaly (giorgio-davanzo) wrote :
Juergen Weigert (jw-cs) wrote :

Also seen on SUSE 11.2 with cuneiform 0.9.0, whereas cuneiform 0.8.0 produces correct bounding boxes.
