Hocr has corrupted bounding boxes for images

Bug #548801 reported by warpitaly
This bug affects 4 people
Affects: Cuneiform for Linux
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

When using the hOCR output format (-f hocr), the resulting file includes images as ocr_line elements, but their bounding boxes are always set to zeros, e.g.:

<p>
<span class='ocr_line' id='line_5' title="bbox 0 0 0 0"><img src=bug_files/1.bmp width=756 height=552 alt="bug_files/1.bmp">
</span>
</p>

Hence, it's not possible to correctly place images on the resulting page.

Command used (image file attached):
cuneiform -l ita -f hocr -o bug.htm Bug.png
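
To check an output file for this symptom, something like the following could be used. It is only a minimal sketch using Python's standard html.parser module; the class and function names (ZeroBboxImageFinder, find_zero_bbox_images) are made up for illustration and are not part of Cuneiform. It assumes the hOCR layout shown above, with the <img> nested inside a span of class ocr_line.

```python
from html.parser import HTMLParser


def _is_zero_bbox(title):
    """Return True if an hOCR title attribute carries 'bbox 0 0 0 0'."""
    # hOCR separates properties in the title attribute with ';'.
    for prop in title.split(";"):
        fields = prop.split()
        if fields[:1] == ["bbox"]:
            return fields[1:] == ["0", "0", "0", "0"]
    return False


class ZeroBboxImageFinder(HTMLParser):
    """Collect <img> tags nested in ocr_line spans whose bbox is all zeros.

    Hypothetical helper for illustration; not part of Cuneiform.
    """

    def __init__(self):
        super().__init__()
        # One entry per open <span>: (line id, zero-bbox flag) for
        # ocr_line spans, None for any other span.
        self._spans = []
        self.hits = []  # (line id, image src) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            entry = None
            if "ocr_line" in (attrs.get("class") or "").split():
                entry = (attrs.get("id"), _is_zero_bbox(attrs.get("title") or ""))
            self._spans.append(entry)
        elif tag == "img":
            # Look for the innermost enclosing ocr_line span.
            for entry in reversed(self._spans):
                if entry is not None:
                    if entry[1]:
                        self.hits.append((entry[0], attrs.get("src")))
                    break

    def handle_endtag(self, tag):
        if tag == "span" and self._spans:
            self._spans.pop()


def find_zero_bbox_images(hocr_text):
    """Return (line id, image src) pairs for images with an all-zero bbox."""
    finder = ZeroBboxImageFinder()
    finder.feed(hocr_text)
    finder.close()
    return finder.hits


# The fragment quoted in this report, used as sample input.
sample = """<p>
<span class='ocr_line' id='line_5' title="bbox 0 0 0 0"><img src=bug_files/1.bmp width=756 height=552 alt="bug_files/1.bmp">
</span>
</p>"""

print(find_zero_bbox_images(sample))  # [('line_5', 'bug_files/1.bmp')]
```

Passing the contents of bug.htm to find_zero_bbox_images instead of the sample string should list every affected image, assuming the file is read with the right encoding.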

Juergen Weigert (jw-cs) wrote :

Also seen on SUSE 11.2 with Cuneiform 0.9.0, whereas Cuneiform 0.8.0 produces correct bounding boxes.
