PDF import forgets math sum symbols
Bug #742364 reported by
Mario Valle
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Inkscape | Triaged | Medium | Unassigned |
Bug Description
0.48.0 r9654 on Windows 7
Imported a PDF page with math formulas containing sum symbols (uppercase Greek sigma). Everything was imported except this symbol.
To reproduce, download http://
Thanks for looking!
mario
tags: added: importing pdf
Changed in inkscape:
status: Confirmed → Triaged
Reproduced with Inkscape 0.47 and 0.48.1 (official packages) on OS X 10.5.8 (i386)
as well as with Inkscape 0.48+devel r10178 (with poppler 0.16.4 and cairo 1.10.2)
Notes:
1) Evince 2.30.2 displays the file correctly, but when selecting the text, it omits the sum signs (possibly indicating that those glyphs are created differently than other parts of the formulas)
2) GIMP on OS X (2.6.11) does read the uppercase sigma characters on import (using poppler as well, AFAIU)
3) the preview in the PDF import dialog correctly displays the sum signs
4) saving page 5 as SVG with Inkscape 0.47 or 0.48.0 creates invalid SVG files with improperly encoded content:
(inkscape:43131): Gtk-WARNING **: Unable to find default local directory monitor type
/Volumes/blue/img/Inkscape/test/bug/742364-1103.4807v1-p5-0480.svg:3192: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0x80 0x3C 0x2F 0x74
id="tspan4752">?</tspan></text>
^
5) saving page 5 as SVG with Inkscape 0.48.1 or current trunk creates a valid SVG file, but omits the glyph used for the sum symbol. The text objects for the sum symbols exist, but are empty.
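The parser error in note 4 can be checked independently of any XML parser by looking for the first byte sequence that is not valid UTF-8. The following is a minimal sketch in Python, not part of Inkscape; the helper name `first_invalid_utf8` and the sample bytes are assumptions made for illustration, with the sample modelled on the `tspan4752` line from the error message above.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, next four bytes as hex strings) for the first
    invalid UTF-8 sequence in data, or None if data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that failed to decode
        window = data[err.start:err.start + 4]
        return err.start, [f"0x{b:02X}" for b in window]
    return None


# Hypothetical sample: a stray 0x80 byte where the sum glyph should be
sample = b'id="tspan4752">\x80</tspan></text>'
print(first_invalid_utf8(sample))
```

For a real file, the same helper could be applied to the result of `open(path, "rb").read()`; the byte window it reports should correspond to the `Bytes:` line printed by the parser.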
Seems related to Bug #605872 “pdf to svg fails with characters from Unicode Plane 1 (SMP)” and its fix, discussed in Bug #369861 “Unable to open previously imported pdf file” (see all comments by Khaled Hosny, who provided a patch to ensure that no invalid UTF-8 code is returned «with caveat that glyphs with no proper Unicode (unencoded glyphs) will be just omitted»).
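The caveat quoted above would explain why newer builds produce text objects that exist but are empty. The sketch below is a Python stand-in for that behaviour, not the actual patch: it assumes that a glyph without a proper Unicode mapping arrives as bytes that are not valid UTF-8, and the function name `drop_unencoded` is hypothetical.

```python
def drop_unencoded(raw: bytes) -> str:
    """Decode raw as UTF-8, silently discarding bytes that are not valid.

    Assumption: this only mimics the described effect (valid output,
    unencoded glyphs omitted); it is not the code used in Inkscape.
    """
    return raw.decode("utf-8", errors="ignore")


encoded_sum = "\u2211".encode("utf-8")  # U+2211 N-ARY SUMMATION, properly encoded
unencoded_glyph = b"\x80"               # stray byte, as in the parser error

print(repr(drop_unencoded(encoded_sum)))      # the sum sign survives
print(repr(drop_unencoded(unencoded_glyph)))  # empty string: the glyph is omitted
```

Under this assumption the output is always valid UTF-8, but any glyph without a usable encoding disappears, which matches the empty text objects described in note 5.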