custom text encodings
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
pyPdf | New | Undecided | Unassigned |
Bug Description
In the attached hello_world.pdf file, the text is being drawn by the following TJ command (operand, operator):
[['\x01', -13.8818, '\x02', 17.0689, '\x03', -5.16561, '\x03', -5.16561, '\x04', 2.15722, '\x05', 3.67949, '\x06', 2.5903, '\x07', -0.30418, '\x04', 2.15654, '\x08', 2.23293, '\x03', -5.16561, '\t', 11.3788, '\n', 6.60875, '\x06']] 'TJ'
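To make the operand easier to read, here is a minimal sketch that separates the character codes from the positioning adjustments. It is written for Python 3 and assumes the operand is represented as a plain list of one-character strings and numbers, as printed above; `split_tj` is a hypothetical helper, not part of pyPdf.

```python
# Operand of the TJ command, copied from the report above.
tj_operand = ['\x01', -13.8818, '\x02', 17.0689, '\x03', -5.16561,
              '\x03', -5.16561, '\x04', 2.15722, '\x05', 3.67949,
              '\x06', 2.5903, '\x07', -0.30418, '\x04', 2.15654,
              '\x08', 2.23293, '\x03', -5.16561, '\t', 11.3788,
              '\n', 6.60875, '\x06']


def split_tj(operand):
    """Split a TJ operand into character codes and numeric adjustments.

    Hypothetical helper for illustration only. Strings contribute one
    code per character; numbers are collected separately.
    """
    codes = []
    adjustments = []
    for item in operand:
        if isinstance(item, str):
            codes.extend(ord(ch) for ch in item)
        else:
            adjustments.append(item)
    return codes, adjustments


codes, adjustments = split_tj(tj_operand)
print(codes)
```

Note that '\t' and '\n' are simply codes 9 and 10 here, not whitespace. The repetition pattern of the codes would fit "Hello, world!" followed by a space, which the file name suggests, but that is only a guess from the pattern and not something the encoding shown in this report confirms.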
The bytes are drawn with the font /R7 (['/R7', 10.7452] 'Tf').
The font /R7:
7 0 obj
<</BaseFont/
/FirstChar 1/LastChar 10/Widths[ 623 498 229 527 250 226 715 349 525 252]
/Encoding 12 0 R/Subtype/
endobj
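As a side note on the font dictionary above, the /Widths array is indexed relative to /FirstChar, so every code used in the TJ operand (1 to 10) has a width entry. A minimal sketch of that lookup follows; `glyph_width` is a hypothetical name, and the values are copied from the object as shown.

```python
# Values copied from the font object in the report.
first_char = 1
last_char = 10
widths = [623, 498, 229, 527, 250, 226, 715, 349, 525, 252]


def glyph_width(code):
    """Return the /Widths entry for a character code.

    Hypothetical helper: the array starts at /FirstChar, so the index
    is the code minus first_char.
    """
    if not first_char <= code <= last_char:
        raise KeyError("code %d is outside FirstChar..LastChar" % code)
    return widths[code - first_char]


print(glyph_width(3))
```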
Encoding:
12 0 obj
<</Type/
1/g44/g286/
endobj
pyPdf does not support reading a custom encoding from the document when it processes text-drawing operators, so the extractText method returns no text for this file.
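One possible first step toward supporting this is to turn the encoding's differences list into a code-to-glyph-name table. The sketch below assumes the truncated fragment "1/g44/g286/" in object 12 is the start of a /Differences array, where an integer sets the current code and each following name is assigned the next consecutive code. Only the visible part of the array is used; `build_differences_map` is a hypothetical helper, not an existing pyPdf function.

```python
def build_differences_map(differences):
    """Map character codes to glyph names from a /Differences-style list.

    Hypothetical helper. An int sets the current code; each name that
    follows is assigned that code, and the code is then incremented.
    """
    mapping = {}
    code = None
    for entry in differences:
        if isinstance(entry, int):
            code = entry
        else:
            if code is None:
                raise ValueError("glyph name appears before any code")
            mapping[code] = entry.lstrip('/')
            code += 1
    return mapping


# Only the part of the array that is visible in the report (assumption:
# it belongs to /Differences); the rest is truncated and left out.
visible_differences = [1, '/g44', '/g286']
mapping = build_differences_map(visible_differences)
print(mapping)
```

Names such as g44 and g286 are not standard glyph names, so this table alone would probably not be enough to produce readable text. Some further source, for example a /ToUnicode CMap if the file has one, would be needed to map the glyphs to characters; the report does not show whether one is present.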