Comment 2 for bug 1440304

iostrym (armandooooo) wrote : RE: [Bug 1440304] Re: calibre bug 1440304

Thanks a lot for answering this report so quickly. I don't even know if this reply will be logged anywhere...

I found what causes the strange behavior. When I save the PDF with PDF XChange, the tags come out wrong when I import it into calibre. When I open that "wrong" PDF in Adobe Reader and save it again, the import handles the tags correctly.

Both PDFs are version 1.6.

After extracting the metadata with exiftool, I noticed the following:

PDF that imports badly: Linearized = No, XMP Toolkit = XMP Core 4.1.1
PDF that imports fine: Linearized = Yes, XMP Toolkit = Adobe XMP Core 5.4

I don't know whether the XMP Toolkit version matters to calibre. But there is clearly something in the metadata written by PDF XChange that calibre doesn't like.

By the way, do you know which XMP properties calibre reads when importing a PDF (when no calibre-specific XMP metadata is available)?

http://ns.adobe.com/* ones ?
or
http://purl.org/dc/elements/1.1
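
To compare the two files on this point, a small script can list what each PDF stores under the Dublin Core and Adobe PDF namespaces. This is only a sketch under some assumptions: the XMP packet is stored uncompressed and unencrypted in the file (otherwise the scan finds nothing), and the helper names (find_xmp_packet, summarize_xmp, summarize_pdf) are made up for illustration. It does not reproduce calibre's own mapping, which lives in metadata_from_xmp_packet().

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
PDF = "http://ns.adobe.com/pdf/1.3/"


def find_xmp_packet(data):
    """Return the first <x:xmpmeta> element found in raw bytes, or None.

    Assumption: the packet is stored uncompressed in the PDF.
    """
    match = re.search(rb"<x:xmpmeta\b.*?</x:xmpmeta>", data, re.DOTALL)
    return match.group(0) if match else None


def list_items(root, namespace, name):
    """Collect the rdf:li values found under every namespace:name property."""
    items = []
    for prop in root.iter("{%s}%s" % (namespace, name)):
        for li in prop.iter("{%s}li" % RDF):
            if li.text and li.text.strip():
                items.append(li.text.strip())
    return items


def summarize_xmp(packet):
    """Return title, creator, subject and pdf:Keywords from an XMP packet."""
    root = ET.fromstring(packet)
    keywords = None
    # pdf:Keywords may be written as an attribute of rdf:Description ...
    for desc in root.iter("{%s}Description" % RDF):
        value = desc.get("{%s}Keywords" % PDF)
        if value is not None:
            keywords = value
    # ... or as a child element.
    elem = root.find(".//{%s}Keywords" % PDF)
    if elem is not None and elem.text:
        keywords = elem.text
    return {
        "dc:title": list_items(root, DC, "title"),
        "dc:creator": list_items(root, DC, "creator"),
        "dc:subject": list_items(root, DC, "subject"),
        "pdf:Keywords": keywords,
    }


def summarize_pdf(path):
    """Read a PDF from disk and summarize its XMP packet, if one is found."""
    with open(path, "rb") as handle:
        packet = find_xmp_packet(handle.read())
    return summarize_xmp(packet) if packet else None
```

Running summarize_pdf() on the PDF XChange file and on the Adobe Reader file should show whether the tags end up in dc:subject, in pdf:Keywords, or in both, and whether they are stored as separate items or as one string.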

As I understand it, after an export calibre uses its own XMP namespace, http://calibre-ebook.com/xmp-namespace, for "custom" metadata, but standard XMP properties for the common metadata (Adobe's or Dublin Core, I don't know which). I say this because I don't see any title, author, or tags in the calibre-namespace metadata.

There is something strange in any case:
- add tags in calibre: toto, titi
- export the PDF from calibre
- re-import the PDF into calibre => the tags are concatenated with "_": toto_titi (one tag)
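
For reference, here is the round trip I would expect, written as a minimal sketch. The functions split_tags and join_tags are hypothetical names for illustration, not calibre code, and the separator set is an assumption taken from the comma and ";" cases in the bug description.

```python
import re


def join_tags(tags, separator=", "):
    """Serialize a list of tags into one string, as a PDF keywords field would hold it."""
    return separator.join(tags)


def split_tags(raw, separators=",;"):
    """Split a raw keywords string into tags on any of the given separator characters.

    Whitespace around each tag is stripped; empty entries and duplicates are dropped.
    """
    if not separators:
        stripped = raw.strip()
        return [stripped] if stripped else []
    pattern = "[" + re.escape(separators) + "]"
    tags = []
    for part in re.split(pattern, raw):
        tag = part.strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags
```

With this, split_tags(join_tags(["toto", "titi"])) gives back two tags, whereas a string like "toto_titi" stays a single tag because "_" is not treated as a separator. That matches what I see after re-import, so the merge seems to happen before the split, somewhere in the export or in the read step.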

Best regards,

Armandooooo

> Date: Sat, 4 Apr 2015 13:39:03 +0000
> From: <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1440304] Re: calibre bug 1440304
>
> I don't see much point in this. PDF supports the XMP metadata standard.
> Simply use a PDF metadata editor that supports XMP, such as calibre
> itself (the ebook-meta command line tool from calibre). calibre prefers
> XMP metadata over the Info dict, unless the latter has a newer mod date.
>
> See the metadata_from_xmp_packet() function in the calibre source code
> for how exactly XMP metadata is mapped to calibre metadata.
>
> status wontfix
>
> ** Changed in: calibre
> Status: New => Won't Fix
>
> Title:
> [Enhancement] configure metadata import when importing pdf file in
> calibre
>
> Status in calibre: e-book management:
> Won't Fix
>
> Bug description:
> In calibre 2.23 on Windows 7 64-bit, when importing a PDF, calibre reads some common metadata from the PDF file into its own metadata.
> For example, title, author, and tags are imported. The subject metadata is also put in the comments.
>
> From testing, I saw that:
>
> - the first line of the subject is put in the calibre tags (the PDF subject can span several lines in some PDF editors)
> - the full subject (including the other lines) is put in the calibre comments
> - tags must be separated by commas.
>
> But maybe this import behavior is already described somewhere?
>
> Something great would be, for example:
> - to configure the separator used between tags, because some PDF editors don't support commas and use ";" instead
> - to be able to disable the import of the first line of the subject as tags
> - to be able to customize which calibre metadata field is filled from which PDF metadata field,
> e.g.: published date is the first line of the subject
> ISBN is the second line of the subject
> the other lines of the subject are the comments
>
> I would be happy to help if I were shown where this is done in the code...