Comment 13 for bug 623438

MAndree (hapm) wrote : Re: Font size not correct in merged sandvich PDF

Same problem here on Ubuntu 10.04 with cuneiform 1.0.0 and hocr2pdf 0.7.4. I compared the information in the hocr file with the position of the text in the PDF, and whatever hocr2pdf does, the text in the PDF doesn't match the bounding boxes defined in the .hocr file. So I think this is a problem in hocr2pdf. I'm not sure whether it is related to how cuneiform formats its hocr output, as I have read that there are several ways to attach the bounding boxes to the text (using separate tags, using attributes on the tag that directly encloses the text, ...). It would be nice to know whether hocr2pdf can currently handle hocr output from other OCR engines, and if so, how their hocr files differ from the cuneiform output.
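For anyone who wants to repeat the comparison, here is a rough sketch of how the bounding boxes could be dumped from a .hocr file with Python's standard library. It assumes the boxes are stored as `bbox x0 y0 x1 y1` inside the `title` attribute, which is the convention in the hOCR spec; I have not checked that cuneiform always writes them this way. The `SAMPLE` markup and the names `BBoxCollector` and `collect_bboxes` are made up for illustration and are not taken from real cuneiform output.

```python
import re
from html.parser import HTMLParser

# hOCR convention (assumed here): title="bbox x0 y0 x1 y1", in image pixels.
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class BBoxCollector(HTMLParser):
    """Collects (tag, class, (x0, y0, x1, y1)) for every element with a bbox."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A title attribute without a value arrives as None, hence the "or".
        match = BBOX_RE.search(attrs.get("title") or "")
        if match:
            coords = tuple(int(v) for v in match.groups())
            self.boxes.append((tag, attrs.get("class"), coords))


def collect_bboxes(hocr_text):
    """Return all bounding boxes found in an hOCR document given as a string."""
    parser = BBoxCollector()
    parser.feed(hocr_text)
    return parser.boxes


# Hypothetical hOCR fragment, only for demonstration.
SAMPLE = (
    '<div class="ocr_page" title="bbox 0 0 800 600">'
    '<span class="ocr_line" title="bbox 10 20 300 45">Hello world</span>'
    "</div>"
)

if __name__ == "__main__":
    for tag, css_class, box in collect_bboxes(SAMPLE):
        print(tag, css_class, box)
```

Keep in mind that hOCR coordinates are image pixels measured from the top-left corner, while PDF positions are in points measured from the bottom-left, so the numbers have to be scaled and flipped before they can be compared with the text positions in the PDF.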