not all objects shown in PDF file

Bug #519139 reported by jc
This bug affects 1 person
Affects    Status     Importance  Assigned to  Milestone
Inkscape   Confirmed  Undecided   Unassigned

Bug Description

I have opened a PDF file created in Adobe InDesign CS3 (5.0.4) using Adobe PDF Library 8.0. When it is opened with Inkscape 0.47, the PDF import settings window shows the entire image in the thumbnail; however, after clicking OK, not all of the drawing objects are imported. They are not hidden, they simply are not there. I assume they are not bitmaps (I don't have the original InDesign file).

Tags: importing pdf
su_v (suv-lp) wrote :

Please add information about your OS and Inkscape version. Can you attach the file that exposes this problem so that we can try to reproduce the error?

tags: added: importing pdf
Changed in inkscape:
status: New → Incomplete
Adrien Cordonnier (adrien-cordonnier) wrote :

OS: Windows 7 Pro
Inkscape: 0.48.2

I experienced a similar bug, with objects disappearing between the import preview and the resulting SVG. However, the objects that are not shown are in fact imported. It seems that they are hidden behind another object in a group, perhaps because of a wrong z-index.

Here are the instructions to reproduce the bug:

1) Import page 16 from https://www.globalreporting.org/resourcelibrary/ConferenceProgram_2008.pdf.
2) In the page import preview, the title "GRI 2008" appears.
3) Once imported, the title does not show anymore. No object can be selected at the expected location.
4) Select all objects [Ctrl-A]. Ungroup several times (here 5 times). The title appears as expected.

Alvin Penner (apenner) wrote :

This has been partially fixed in trunk. Attached is the result obtained for page 16 using Inkscape rev 11112.

Changed in inkscape:
status: Incomplete → Confirmed
Alvin Penner (apenner) wrote :

There is still some text missing: "Rabobank is among the most ..."

Alvin Penner (apenner) wrote :

In the file ConferenceProgram_2008, some text is missing on page 16. The text begins with "Rabobank is among the most sustainable" and continues with the text 'www.rabobank.com'.
This text is rendered correctly in Gimp and also in Evince, but it is not interpreted correctly by the program pdf2txt.py, which is part of the parser called pdfminer: http://pypi.python.org/pypi/pdfminer/
For example, the text 'www.rabobank.com' gets interpreted by pdfminer as:
(cid:88)(cid:88)(cid:88)(cid:15)(cid:83)(cid:66)(cid:67)(cid:80)(cid:67)(cid:66)(cid:79)(cid:76)(cid:15)(cid:68)(cid:80)(cid:78)
where the letter 'a' is represented by '66'. There appears to be something unusual about this line, since the preceding text is interpreted correctly by pdfminer.
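
As a rough check of the CID string above: 'a' is 97 in ASCII and 66 here, which suggests a constant offset of 31. The sketch below assumes that offset holds for every character in this one string. It is only an observation about this sample, not a general rule for CID fonts, and the names decode_cids and OFFSET are made up for illustration.

```python
import re

# CID string copied from the pdfminer output quoted in this comment.
CID_TEXT = (
    "(cid:88)(cid:88)(cid:88)(cid:15)(cid:83)(cid:66)(cid:67)(cid:80)"
    "(cid:67)(cid:66)(cid:79)(cid:76)(cid:15)(cid:68)(cid:80)(cid:78)"
)

# Assumption: the offset is inferred only from this sample
# ('a' is ASCII 97 and shows up as CID 66, so 97 - 66 = 31).
OFFSET = 31


def decode_cids(text, offset=OFFSET):
    """Turn '(cid:N)' tokens back into characters by adding a fixed offset."""
    cids = [int(n) for n in re.findall(r"\(cid:(\d+)\)", text)]
    return "".join(chr(cid + offset) for cid in cids)


print(decode_cids(CID_TEXT))  # prints: www.rabobank.com
```

So the character codes themselves look recoverable in this case. What seems to be missing is the mapping from CIDs back to characters.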

Alvin Penner (apenner) wrote :

The missing text "Rabobank is among the most ..." uses a font type known as Type2 CID Font, while the remaining, visible text uses a more conventional 8-bit Type1C font. The Type2 CID Font is supported by Poppler under the name fontCIDType2, but is apparently not supported by the input routine pdf-parser.cpp, which is used to do this import.
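
For anyone who wants to check the font types on that page themselves, here is a minimal sketch that calls Poppler's pdffonts utility from Python. It assumes pdffonts is installed and on the PATH. The helper names pdffonts_command and list_fonts are made up for illustration, and the local file name in the usage note is an assumption.

```python
import subprocess


def pdffonts_command(pdf_path, page):
    """Build the pdffonts call for one page (-f = first page, -l = last page)."""
    return ["pdffonts", "-f", str(page), "-l", str(page), pdf_path]


def list_fonts(pdf_path, page):
    """Run pdffonts and return its table of fonts as plain text.

    Assumes the pdffonts binary from Poppler is available on the PATH.
    """
    result = subprocess.run(
        pdffonts_command(pdf_path, page),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling print(list_fonts("ConferenceProgram_2008.pdf", 16)) on a local copy of the file should list the fonts used on page 16. As far as I know, pdffonts reports a Type2 CID font as "CID TrueType" in its type column.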

Adrien Cordonnier (adrien-cordonnier) wrote :

Tested on Windows 8 with Inkscape 0.91 r13725. Importing without the Poppler option gives scrambled text (wrong characters in "Rabobank..." and wrong character spacing in the bottom half), but "GRI2008" is shown correctly. With the Poppler option, everything is shown correctly.

I suggest closing the bug as fixed, since all objects are now shown. Another bug report can be opened for the scrambled text.
