Inkscape: A Vector Drawing Tool

Wrong characters in imported PDF

Reported by dopelover on 2009-07-01
This bug affects 1 person
Affects: Inkscape
Importance: Medium
Assigned to: Unassigned

Bug Description

There is a problem with character encoding after importing the attached PDF into Inkscape.

Inkscape: r21714
Operating system: Ubuntu 9.04 32-bit

dopelover (dopelover) wrote :
dopelover (dopelover) wrote :
dopelover (dopelover) wrote :
Alvin Penner (apenner) wrote :

confirmed on Windows XP, Inkscape rev 21627

Changed in inkscape:
status: New → Confirmed
~suv (suv-lp) wrote :

duplicate of bug #369861: Unable to open previously imported pdf file?

steps:
1) open attached 'problematic_file.pdf'
2) save page as svg file
3) reopen 'problematic_file.svg' with Inkscape
4) error message: 'Failed to load the requested file /Volumes/blue/img/Inkscape/test/bug/394472-problematic_file-p1-LeWitt.svg'

console msgs:

/Volumes/blue/img/Inkscape/test/bug/394472-problematic_file-p1-LeWitt.svg:299: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xA0 0xA2 0xA0 0x50
             id="tspan2977">@BDFH���P</tspan><tspan
                                 ^

and tons of 'parser error: PCDATA invalid Char value…' like:
/Volumes/blue/img/Inkscape/test/bug/394472-problematic_file-p1-LeWitt.svg:3612: parser error : PCDATA invalid Char value 30
             id="tspan4515">

 &quot;

 </tspan><tspan
                                                  ^
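
The two kinds of parser error quoted above (bytes that are not valid UTF-8, and control characters that XML 1.0 does not allow, such as char value 30) can be located in a saved SVG without opening it in Inkscape. The following is only a minimal sketch in Python; the function name `find_encoding_problems` and the report format are assumptions made for illustration and are not part of Inkscape or libxml2.

```python
import re

# XML 1.0 allows only tab (9), LF (10) and CR (13) among the C0 control
# characters; everything else below 0x20 triggers "PCDATA invalid Char value".
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_encoding_problems(data: bytes):
    """Return (line_number, description) tuples for each problem found.

    Hypothetical helper: checks each line for invalid UTF-8 first, and only
    if the line decodes cleanly looks for illegal XML control characters.
    """
    problems = []
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # Show up to four bytes starting at the first offending byte,
            # similar to the "Bytes: ..." line in the parser output.
            bad = raw[err.start:err.start + 4]
            hexbytes = " ".join(f"0x{b:02X}" for b in bad)
            problems.append(
                (lineno, f"invalid UTF-8 at byte {err.start}: {hexbytes}")
            )
            continue
        for match in ILLEGAL_XML_CHARS.finditer(text):
            problems.append(
                (lineno, f"illegal XML char value {ord(match.group())}")
            )
    return problems
```

To check a real file, read it in binary mode and pass the bytes in, for example `find_encoding_problems(open("problematic_file.svg", "rb").read())`. Reading as bytes matters here, because opening the file in text mode would fail on the first invalid sequence instead of reporting all of them.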

~suv (suv-lp) wrote :

sorry, white space gets lost in the html-formatting of the comments, see attached logfile for the console output.

~suv (suv-lp) wrote :

another related bug #305741: inkscape crashes during pdf import?

steps:
1) open 2nd test file from bug #305741 <http://launchpadlibrarian.net/22546261/Sch%C3%A9ma%20%C3%A9olien%20-%20Avifaune.pdf> with inkscape
   (I tried both 'Replace PDF Fonts' options - made no difference re encoding)
2) save as svg
3) reopen in inkscape fails (no crash)

console msg:

/Volumes/blue/img/Inkscape/test/bug/305741-Avifaune-LeWitt.svg:32491: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xBF 0xE9 0x0D 0xAC
               id="tspan22342">��
                               ^

comments #5, #7: tested with Inkscape 0.46+devel r21714, OS X 10.5.7

tags: added: importing pdf
removed: pdf-import

workaround found!

I have also run into the same problem with Inkscape 0.47pre4 r22446 (built Oct 15 2009), which is currently shipped in Ubuntu 9.10 (AMD64). Interestingly, I have found a workaround: open the PDF in Evince and print it to SVG. When you then open that file in Inkscape, the fonts are OK (see attached file).

Maybe our beloved developers should take a look at how Evince does the trick, and implement it upon import of pdf... (I'm not skilled enough in programming to do this myself, unfortunately)

jazzynico (jazzynico) wrote :

Looks like the encoding issue reported in Bug #499257 (saves svg file, that it can't read afterwards).

Changed in inkscape:
importance: Undecided → Medium
dopelover (dopelover) wrote :

The bug is still present in Inkscape 0.47+devel r9404

dopelover (dopelover) wrote :

I opened another PDF in Inkscape and noticed that only the diacritical marks are wrongly encoded. There were no errors or warnings when opening it from the terminal.

Inkscape 0.47+devel r9441 on Ubuntu 10.04

~suv (suv-lp) on 2010-06-09
tags: added: encoding