Imported metadata is truncated in conversion

Bug #1848327 reported by John Dalstead on 2019-10-16
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

After importing metadata during a conversion, the data may be truncated in the output file.
Looking in the HTML view of the metadata (I am no expert), the truncation occurs where the HTML <br> occurs.
I have been simply changing the <br> to </p>. The output then may not be pretty, but it can be edited in the text view and the metadata is intact after conversion. No <br>, no problem. Any metadata containing a <br> will fail at those points.
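
The manual workaround described above could be sketched in Python as follows. This is only an illustration, not part of calibre: the function name split_paragraphs_at_br is hypothetical, and it assumes every <br> sits inside a <p> element. It writes </p><p> rather than a bare </p> so that the result stays well-formed.

```python
import re

# Matches <br>, <br/> and <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def split_paragraphs_at_br(comments_html: str) -> str:
    """Replace each <br> with a paragraph boundary (</p><p>).

    Hypothetical helper: assumes the <br> tags are inside <p> elements,
    as in the sample metadata quoted in this report.
    """
    return BR_TAG.sub("</p><p>", comments_html)


if __name__ == "__main__":
    sample = "<div><p>First sentence.<br>Second sentence.</p></div>"
    print(split_paragraphs_at_br(sample))
```

Text without any <br> passes through unchanged, so the helper is safe to run over metadata that does not have the problem.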

Calibre 4.1
O/S Windows 10
Browser Chrome or Firefox.

Thanks for a great product,
John Dalstead.

Conversion to what? And can you provide a minimal example of a comment
that gets truncated?

 status incomplete

Changed in calibre:
status: New → Incomplete
John Dalstead (johndal36) wrote :

Hi David,
Not having much joy replying to your message.
Will this work?

As requested - a demo.

Loaded a random ePub file, Aesop's Fables. No metadata, so downloaded this: "
<div>
<p>9781434001467 </p>

<p>Aesop's Fables is a collection of fables credited to Aesop, a slave and story-teller believed to have lived in ancient Greece between 620 and 560 BCE. Of diverse origins, the stories associated with Aesop's name have descended to modern times through a number of sources. They continue to be reinterpreted in different verbal registers and in popular as well as artistic mediums. In Classical times there were various theorists who tried to differentiate these fables from other kinds of narration. They had to be short and unaffected; in addition, they are fictitious, useful to life and true to nature. In them could be found talking animals and plants, although humans interacting only with humans figure in a few. Typically they might begin with a contextual introduction, followed by the story, often with the moral underlined at the end. Setting the context was often necessary as a guide to the story's interpretation, as in the case of the political meaning of The Frogs Who Desired a King and The Frogs and the Sun. Sometimes the titles given later to the fables have become proverbial, as in the case of 'killing the Goose that Laid the Golden Eggs or the Town Mouse and the Country Mouse. In fact some fables, such as The Young Man and the Swallow, appear to have been invented as illustrations of already existing proverbs. One theorist, indeed, went so far as to define fables as extended proverbs. In this they have an aetiological function, the explaining of origins such as, in another context, why the ant is a mean, thieving creature. Other fables, also verging on this function, are outright jokes, as in the case of The Old Woman and the Doctor, aimed at greedy practitioners of medicine.</p></div>

Modified html source: " In this they have an aetiological function, the explaining of origins such as, in another context, why the ant is a mean, thieving creature.<br>Other fables, also verging on this function, are outright jokes, as in the case of The Old Woman and the Doctor, aimed at greedy practitioners of medicine.</p></div>."
Note the added <br>. In normal view there is a paragraph break at this point.

Convert the ePub back to ePub (my default is to format for Kobo reader with most heuristic options on if I am not sure about the source). The metadata in the output file ends like this:
"Sometimes the titles given later to the fables have become proverbial, as in the case of 'killing the Goose that Laid the Golden Eggs or the Town Mouse and the Country Mouse. In fact some fables, such as The Young Man and the Swallow, appear to have been invented as illustrations of already existing proverbs. One theorist, indeed, went so far as to define fables as extended proverbs. In this they have an aetiological function, the explaining of origins such as, in another context, why the ant is a mean, thieving creature."
Everything after <br> has been truncated.

Most metadata probably does not need or include <br>. Last week I was collecting so...


Kovid Goyal (kovid) wrote :

I cannot replicate this. How are you checking the metadata in the output
file? If I open the output file using the calibre viewer and click Go
to->Metadata I see no truncation and the line break is preserved.

John Dalstead (johndal36) wrote :

This may be a problem mainly with the Calibre viewer and some other viewers.

The text in the metadata appears to be complete after conversion but is not displayed in the Calibre viewer. That is where I got the truncated text copied into the earlier message.
I use Freda sometimes but have not found a way to display metadata, nor with Fly Reader.
The Chrome ebook viewer displays the full text.
The Kobo app truncates the text.
I have a Kobo reader.
John

Kovid Goyal (kovid) wrote :

Can you attach the epub file where you see truncated text?

John Dalstead (johndal36) wrote :

This file displays truncated in Calibre viewer and Kobo app.
All metadata text is displayed in Chrome ebook reader.

Kovid Goyal (kovid) wrote :

Works for me, see attached screenshot.

John Dalstead (johndal36) wrote :

It seems I can only attach one file at a time.
I have 3 snips. Calibre attached - truncated
Kobo I can send - truncated
Chrome - not truncated. I expect that you will believe me but they are waiting if you want them.

I am a fan of, among other things, Golden Age SF. I was a young fellow then and the habit has stayed with me. I will be 83 in a few days and a few grey cells are still working. For my own satisfaction I am hunting for as many of the less common stories as I can, so the files I am working with are of variable quality.

Most have no metadata via Calibre, some have heaps. Ones from collections with a list of titles for instance will likely have <br> scattered in the text and truncate at the first instance. And Kobo is my personal reader.

Not the end of the world but annoying.
John,

Kovid Goyal (kovid) wrote :

Ah you mean in the jacket page, not Goto->Metadata, yes that is indeed a bug.

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

 status fixreleased

Changed in calibre:
status: Incomplete → Fix Released
John Dalstead (johndal36) wrote :

Great David. I still talk 1960s computer and other stuff. What are folders? Give me a directory any day. As for using the current words for a Google search, I fail every time. Sorry if I confused you. Blame it on age.

Thanks for the update.
John.

John Dalstead (johndal36) wrote :

Looking back I see that I owe an apology. It seems I had autocorrect turned on, I mostly have it off.
It apparently decided that your name should be David and I never noticed. Sorry about that Kovid.
John

No worries.
