calibre: e-book management

conversion to mobi ignores vertical-align:top and superscript font size

Reported by Lee on 2011-04-12
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

Doing some testing for Alexbell from this thread:
http://www.mobileread.com/forums/showthread.php?p=1488856

I think I've discovered a bug in the conversion of various types of superscript. vertical-align:top combined with a font size/line height reduction is not included in the list of candidates to be considered a superscript. A lot of epub authors use this combination to keep superscripts from making the line height variable.

On top of that, I was under the impression that if you used <sup>/<sub> tags, Calibre would just trust that the author/publisher actually intended the tag to contain superscript or subscript. Instead it seems that the name of the tag is ignored and only css values are checked, as the resultant mobi file loses its <sup> tags.

Beyond that, it seems that font size is ignored for superscript tags: they always seem to get <font size="2"> no matter what the user has actually specified for the font size.

I've got a candidate fix for the line height/vertical-align problem, but my branch is a mess so it will take me a bit to post. Haven't dug into the problem with the font size being ignored.

File is attached showing examples. All superscripts are lost except the ones which DO NOT have vertical align css.

Lee (ldolse) wrote :
John Schember (user-none) wrote :

> I've got a candidate fix for the line height/vertical-align problem, but my branch is a mess so it will take me a bit to post.

You can always create another branch for this change and leave your normal branch alone. Check out trunk to another directory on your computer. Make the changes. Push them up to another branch.

Does the Kindle renderer even respect font size inside sup/sub tags? You
should be able to easily check this by hardcoding a font size tag inside <sup>
in mobiml.py.

For instance, <sup>x<sub>y</p> could mean that y should be a subscript of x and
that the author forgot to close the tags. I'd rather not play these kinds
of guessing games.

Lee (ldolse) wrote :

I'll check that - I hadn't seen where to hardcode it before.

Regarding your example of '<sup>x<sub>y</p>': I fully agree about not playing guessing games. I'm not sure how I implied otherwise; it wasn't my intention to say that Calibre should do anything special with ambiguous markup like that. If it was the sentence where I wrote '<sup>/<sub>', I meant that if the author uses <sup>x</sup> or <sub>y</sub>, the css styles take precedence over the actual tag names. In the attached epub all the <sup> tags which use the vertical-align:top style are eliminated during conversion. Based on the code I think <sub> gets eliminated in the same way.

@John, creating a new branch was pretty much my plan.

Kovid Goyal (kovid) wrote :

CSS taking precedence over tag names is the way CSS is supposed to work. The
way modern browsers render HTML is that they have a default stylesheet that is
applied to all tags, then any CSS specified in the document overrides that
default stylesheet. Then they render the result based on the CSS, ignoring tag
names. That is the way calibre works as well.
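
A minimal sketch of the cascade described above, assuming a toy default stylesheet. The names DEFAULT_STYLES and computed_style are hypothetical and the values are illustrative; this is not calibre's or any browser's actual code.

```python
# Hypothetical default stylesheet; the values are illustrative only,
# not the defaults that calibre or any browser actually ships.
DEFAULT_STYLES = {
    "sup": {"vertical-align": "super", "font-size": "smaller"},
    "sub": {"vertical-align": "sub", "font-size": "smaller"},
}


def computed_style(tag, document_css):
    """Return the style a renderer would act on for one element.

    Start from the defaults for the tag, then let declarations from the
    document override them. After this step the tag name is no longer
    consulted.
    """
    style = dict(DEFAULT_STYLES.get(tag, {}))
    style.update(document_css)
    return style


# A <sup> with no author CSS keeps its default superscript styling.
plain = computed_style("sup", {})

# The pattern from this report: the author overrides vertical-align,
# so the computed style no longer contains "super" at all.
overridden = computed_style(
    "sup",
    {"vertical-align": "top", "font-size": "0.75em", "line-height": "0"},
)
```

In the second case a renderer that only looks at the computed style has no way of telling that the element started out as a <sup>, which matches the behaviour reported in this bug.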

Oh and if you need to generate a localized patch you can simply do:

bzr diff path/to/changed file > mypatch.diff

Lee (ldolse) wrote :

True, and I'm not arguing that in general, except that we're moving from a format where that's true to a format that doesn't support css. The way the code is written, Calibre is in essence trying to guess from various css settings what is meant to be <sub> or <sup> in mobiml. However, in case Calibre guesses wrong, I should think that the tag name itself is a valid fallback, since these tags exist in modern html and have special meanings which mirror mobiml. Right now if Calibre guesses wrong the information is lost, which is what's happening in this example doc.

Note that for the fix I'm trying to include the parameters for this style of subscript, but I was also thinking of using the tag names as a fallback. I've already implemented the fallback based on tag names and am still working on the parameters for this type of style, as my first cut was a bit naive.

Kovid Goyal (kovid) wrote :

The calibre conversion pipeline will be setting default CSS for sup/sub tags.
Only if that is overridden by the input document/user to something mobiml.py
does not recognize will there be a problem. The fix for that is to have
mobiml.py interpret a wider range of CSS, not have it fall back to using tag
names. The idea is that the MOBI output, when rendered by the Kindle, should
look as much as possible like the input OEB document would look if rendered by a
standards-compliant browser. Obviously, this is often impossible, in which
case, we try to degrade as "gracefully" as we can.
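
A rough sketch of what "interpreting a wider range of CSS" could look like, assuming the computed style is available as a plain dict. The function names, the 0.8 ratio for "smaller", and the treatment of bottom/text-bottom as subscript are all assumptions made for illustration; this is not the code in mobiml.py.

```python
import re


def _font_scale(font_size):
    """Return font-size as a fraction of the parent size, or None if unknown.

    Only 'em', '%' and the keyword 'smaller' are handled in this sketch.
    """
    value = font_size.strip().lower()
    if value == "smaller":
        return 0.8  # assumed ratio, for illustration only
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)(em|%)", value)
    if match is None:
        return None
    number = float(match.group(1))
    return number / 100.0 if match.group(2) == "%" else number


def classify_vertical_align(style):
    """Map a computed style (CSS property -> value) to 'sup', 'sub' or None."""
    valign = style.get("vertical-align", "baseline").strip().lower()
    if valign == "super":
        return "sup"
    if valign == "sub":
        return "sub"
    scale = _font_scale(style.get("font-size", "1em"))
    reduced = scale is not None and scale < 1.0
    # Wider rule (assumption): top/text-top plus a reduced font size is
    # treated as a superscript, bottom/text-bottom as a subscript.
    if reduced and valign in ("top", "text-top"):
        return "sup"
    if reduced and valign in ("bottom", "text-bottom"):
        return "sub"
    return None
```

For example, classify_vertical_align({"vertical-align": "top", "font-size": "0.75em"}) yields "sup", while vertical-align:top at the normal font size is left alone. The decision is made from CSS values only, never from the tag name.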

Lee (ldolse) wrote :

No problem, will go that route. Regarding the font size, I was unable to figure out exactly how you meant to hard-code the font size, but I took an alternate route and used Kindlegen to see what code it created.

You're correct that the Kindle renderer seems to ignore standard font size settings for sup/sub (at least using the Kindle previewer app).

This is the code Calibre creates:
<sup><font size="2">[1]</font></sup>

And this is the code KindleGen creates:
<sup><font size="-1">[1]</font></sup>

Both render the same.

However, I had also attempted to use the <small> tag in the source epub, and it turns out the Kindle renderer supports this:
<sup><font size="-1"><small>[1]</small></font></sup>
Note the above is from Kindlegen, as Calibre presently deletes the 'small' tag during conversion in its pipeline.

The <small> tag seems to reduce the font size by roughly 15-20%. Nesting multiple <small> tags didn't seem to reduce things any further. Adding <small> should probably be the default unless the user explicitly configured sub/super to be the same font size as the rest of the text. I'm not familiar enough with the mobiml code to figure out how to insert this tag myself though.
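
As an illustration only, the markup variants compared above can be generated with the standard library's ElementTree. build_superscript is a hypothetical helper, not a function from mobiml.py, and this says nothing about how calibre builds its output internally.

```python
import xml.etree.ElementTree as ET


def build_superscript(text, font_size="-1", use_small=True):
    """Serialize <sup><font size=...>text</font></sup>, optionally wrapping
    the text in <small> as in the Kindlegen output quoted above."""
    sup = ET.Element("sup")
    font = ET.SubElement(sup, "font", {"size": font_size})
    # With use_small, the text goes inside an extra <small> element.
    target = ET.SubElement(font, "small") if use_small else font
    target.text = text
    return ET.tostring(sup, encoding="unicode")


with_small = build_superscript("[1]")
without_small = build_superscript("[1]", font_size="2", use_small=False)
```

Here with_small has the same structure as the Kindlegen output with <small>, and without_small the same structure as the markup Calibre currently creates.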

Fixed in branch lp:calibre. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released