too many ids in input causes zoomed out converted pdf

Bug #1854345 reported by jimbojw on 2019-11-28
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

calibre version 4.4 (git head at 8db03fd0c8 using CALIBRE_DEVELOP_FROM)
OS: ubuntu

Scenario: using ebook-convert to convert an HTML or epub document to PDF.

Problem: When there are too many id attributes on elements in the input epub or HTML file, the rendered PDF appears 'zoomed out'. The most obvious effect of this is that the font size appears to be smaller than intended, but other elements are reduced as well.

The example files attached demonstrate the problem. Each file has 100 paragraphs of content. The only difference between working.html and broken.html is that the latter has an id attribute on each paragraph element.
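The attachments are not reproduced here. As a rough sketch of how comparable inputs could be generated (the filler text, the `p1`, `p2`, ... id values and the function names are assumptions, not the contents of the attached files):

```python
from pathlib import Path


def build_html(paragraphs: int = 100, with_ids: bool = False) -> str:
    """Return an HTML document with `paragraphs` <p> elements.

    When `with_ids` is True, every paragraph gets a unique id attribute;
    otherwise the markup is identical but carries no ids.
    """
    body = []
    for i in range(1, paragraphs + 1):
        id_attr = f' id="p{i}"' if with_ids else ""
        body.append(f"<p{id_attr}>Paragraph {i} of filler text.</p>")
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>id test</title></head>\n'
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )


def write_samples(directory: str = ".") -> None:
    """Write working.html (no ids) and broken.html (one id per paragraph)."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    (out / "working.html").write_text(build_html(with_ids=False), encoding="utf-8")
    (out / "broken.html").write_text(build_html(with_ids=True), encoding="utf-8")
```

Calling `write_samples()` produces two files that differ only in the presence of the id attributes, which is the property the report relies on.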

Commands:

  ebook-convert working.html working.pdf --pdf-default-font-size=48
  ebook-convert broken.html broken.pdf --pdf-default-font-size=48

Additional context: Changing the types of elements that have the ids does not make a difference. I discovered this problem initially because of id attributes on list-item elements (<li>) but it happens on every kind of element that I've tried. I also tried switching to anchor tags (<a>) and using the deprecated 'name' attribute instead, but that exhibits the same problem.

I tried forcing the body of the document to a fixed width in CSS, but this did not prevent the problem. Capping the width of elements does stop them from growing, but the capped width is itself scaled by the same zoom factor.

The effect seems to be progressive within certain bounds. For example, if there are a small number of elements with ids, then the problem does not occur. As you add more ids, after some threshold (not exactly sure how many) the problem starts to manifest. The zoom factor gets incrementally worse as you add more ids up to a point. After some second threshold, additional elements with ids do not seem to make the zoom factor any worse.
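One way to look for those thresholds would be to generate a series of inputs that differ only in how many paragraphs carry an id, then convert each one and compare the PDFs. The following is a minimal sketch under that assumption; the file naming scheme and the helper names `html_with_ids` and `plan_sweep` are hypothetical, and the commands are only built here, not run:

```python
from pathlib import Path


def html_with_ids(total: int, with_id: int) -> str:
    """Build `total` paragraphs; only the first `with_id` of them carry an id."""
    rows = []
    for i in range(total):
        attr = f' id="p{i}"' if i < with_id else ""
        rows.append(f"<p{attr}>Paragraph {i + 1}</p>")
    return "<html><body>\n" + "\n".join(rows) + "\n</body></html>\n"


def plan_sweep(out_dir, id_counts, total=100, extra_args=()):
    """Write one HTML file per id count and return the ebook-convert commands.

    `extra_args` is passed through unchanged, so the same conversion
    options can be used for every file in the sweep.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    commands = []
    for n in id_counts:
        src = out / f"ids_{n:03d}.html"
        src.write_text(html_with_ids(total, n), encoding="utf-8")
        pdf = src.with_suffix(".pdf")
        commands.append(["ebook-convert", str(src), str(pdf), *extra_args])
    return commands
```

For example, `for cmd in plan_sweep("sweep", range(0, 101, 10)): subprocess.run(cmd, check=True)` would convert eleven files with 0, 10, ..., 100 ids, provided `subprocess` is imported and calibre's `ebook-convert` is on the PATH.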

jimbojw (jimbojw) wrote :

In the example input files, I've added a 1em border around the paragraph elements to show that the border is also scaled, not just the font size of the text.

jimbojw (jimbojw) wrote :

When converting an epub to PDF, the problem is limited to those files within the epub that have many ids; files with few ids are unaffected. For example, if the epub has one XHTML file per chapter, the generated PDF may have a different scale factor for each chapter, depending on how many ids that chapter's file contains.
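To see which files inside an epub are likely to be affected, the ids can be counted per file. This is a rough sketch that treats the epub as a zip archive and uses a regular expression as a heuristic rather than a real XML parse; the function name `count_ids_per_file` is hypothetical and not part of calibre:

```python
import re
import zipfile

# Heuristic match for id="..." or id='...' attributes. The leading \s
# keeps attributes such as data-id from being counted.
ID_ATTR = re.compile(r"""\sid\s*=\s*["'][^"']*["']""", re.IGNORECASE)


def count_ids_per_file(epub_path):
    """Map each HTML/XHTML member of the epub to its number of id attributes."""
    counts = {}
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                text = zf.read(name).decode("utf-8", errors="replace")
                counts[name] = len(ID_ATTR.findall(text))
    return counts
```

Sorting the result by count should show which chapters are candidates for the scaled-down rendering, although the exact threshold is not known from this report.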

Additionally, the scale factor does not appear to affect the PDF headers and footers when using those options.

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released

Was fairly obvious since you recognized it had to do with the number of
links. It's surprising that despite white-space: pre-wrap it does not
wrap, but the fix does not cost much so...
