Generating a PDF splits the last line of text between two pages

Bug #822537 reported by David Wilson on 2011-08-08
This bug affects 3 people
Affects: RedNotebook | Importance: Undecided | Assigned to: Unassigned
Affects: Webkit | Status: Confirmed | Importance: Medium

Bug Description

Generating a PDF splits the last line of text between two pages. That is, the top half of the line of text (character ascenders) is on one page whilst the bottom half (character descenders) is on the second page.

Safari does not seem to support the CSS2 print styles page-break-before,
page-break-after and page-break-inside.

There seems to be partial support for page-break-before and -after, but no
support at all for -inside. Support for these properties is crucial if you want
to make a web page print in a predictable way. It is especially annoying that
page-break-inside:avoid isn't supported, because that rule alone would go a long
way towards making nice printouts when you don't know what font size the user
will have set, or when you have dynamic content.
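
As an illustration of the three properties discussed above, a minimal print stylesheet might look like the following. This is only a sketch: the class names 'chapter', 'keep-together' and 'last-on-page' are made up for the example, and whether each rule is honoured depends on the rendering engine, which is what this report is about.

```html
<!DOCTYPE html>
<html>
<head>
<title>page-break-* sketch</title>
<style type="text/css" media="print">
  /* Start each chapter on a new page. */
  .chapter { page-break-before: always; }
  /* Ask the engine not to split this block across two pages. */
  .keep-together { page-break-inside: avoid; }
  /* Force a page break after this block. */
  .last-on-page { page-break-after: always; }
</style>
</head>
<body>
  <div class="chapter">
    <h1>Chapter 1</h1>
    <div class="keep-together">
      <p>These lines should stay on the same page.</p>
      <p>Even if the user has chosen a large font size.</p>
    </div>
    <p class="last-on-page">Nothing else should print on this page.</p>
  </div>
  <div class="chapter">
    <h1>Chapter 2</h1>
  </div>
</body>
</html>
```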

Created attachment 4045
3 files to demonstrate this bug

This has now been implemented in KHTML 3.5; the patch, however, is quite
intrusive and would probably be a lot of work to port.

Reassigning to webkit-unassigned, to make sure more people see this.

Confirmed.

Here's a testcase for page-break-inside: avoid:

http://www.gtalbot.org/BugzillaSection/Bug132035_Page_Break_Inside_Avoid.html

Safari 2.02 (416.13) fails to render that testcase correctly in Print preview.

I'd like to see this bug fixed. Many sites that print boarding passes online use page-break-* between multiple boarding passes. NWA uses page-break-after, for example.

> Many of sites that print boarding passes
> online use page-break-* between multiple boarding passes. NWA uses
> page-break-after for example.

"
Step 3:
Print your boarding pass on an 8.5" X 11" sheet of paper ...and go Straight to the Gate!
"
http://www.nwa.com/checkin/

We support before/after mostly, but not inside. I'll retitle the bug since I assume -inside is what you're referring to. I didn't think IE supported page-break-inside though, so are sites really using it?

Created attachment 18747
page-break-after fails in this example

Actually I was referring to page-break-after which does not work in this somewhat reduced example from the NWA site. It prints on two pages in Camino 1.5.4, but prints on one page with WebKit TOT as of Jan 26, 2008.

Retitling per Comment #9.

WebKit on Windows/Linux: page-break-after: always cuts off the very first text on the next page. Here is some example code:

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>TEST page-break.</title>
</head>
<body>

<p>Page 1 of 2</p>

<p style="page-break-after: always"></p>

<p>Page 2 of 2</p>

</body>
</html>

but if you add &nbsp; to the <p> like this:

<p style="page-break-after: always">&nbsp;</p>

it works OK.

This problem doesn't exist in FF, so I gather it is a WebKit issue.

Regards

Created attachment 39635
Simple patch to handle 'page-break-inside: avoid' in a similar manner to page-break-before and page-break-after.

This is a simple patch to handle the CSS property 'page-break-inside: avoid' in a similar manner to page-break-before and page-break-after.
It's clear that the page-break handling requires some major rework, but this might be sufficient in the meantime.

I've just been using the tool wkhtmltopdf to generate documents from a project database. wkhtmltopdf uses WebKit to produce PDF files from URLs.
I ran into similar issues. After a little digging, I discovered the problem is in WebKit itself. WebKit recognises page-break-inside but does not do anything with it.
In fact the whole page break handling in WebKit is very limited, as is evidenced by the following comment in RenderFlow.cpp:

// FIXME: This is a feeble effort to avoid splitting a line across two pages.
// It is utterly inadequate, and this should not be done at paint time at all.
// The whole way objects break across pages needs to be redone.
// Try to avoid splitting a line vertically, but only if it's less than the height
// of the entire page.

Looking at how page-break-before and page-break-after are handled in RenderBlock.cpp, I put together a patch to handle page-break-inside: avoid in a similar manner. As with -before and -after, -inside works only on non-floating blocks and does not deal with multiple columns correctly. But it should be sufficient until page breaks are properly dealt with.

wkhtmltopdf uses the WebKit version included in the Qt source distribution. I also looked at the latest stable version of WebKit and there appears to have been no change in the page-break handling as yet. wkhtmltopdf also includes other WebKit and Qt patches that may be of interest.

<rdar://problem/3491400>

As for page-break-after failure,

page-break-after is applicable to block level elements
http://www.w3.org/TR/css3-page/

I confirmed that WebKit
- works properly for <p>, <div>, and <hr>, which are all block level elements,
- but doesn't work for <span> and <br>, which are inline.

Firefox 3.5.7 breaks pages for all of the above.

Internet Explorer 7.0 breaks pages for all except <span>.

What should WebKit do?
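
The two cases compared above could be reduced to a testcase along these lines. It is a sketch rather than one of the attached files: the text content is made up, and the break-carrying <div> is given &nbsp; content because an earlier comment in this report found that an empty element with page-break-after: always misbehaves in WebKit.

```html
<!DOCTYPE html>
<html>
<head>
<title>page-break-after: block-level vs inline</title>
</head>
<body>
  <p>Page 1: before the block-level break.</p>

  <!-- Block-level element: reported above to break the page in WebKit. -->
  <div style="page-break-after: always">&nbsp;</div>

  <p>Page 2: after the block-level break.</p>

  <!-- Inline element: reported above NOT to break the page in WebKit,
       although Firefox 3.5.7 was reported to break here. -->
  <p>Text <span style="page-break-after: always">inside a span</span> and more text.</p>

  <p>Same page as the span if the inline break is ignored.</p>
</body>
</html>
```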

(In reply to comment #12)
> I've just been using the tool wkhtmltopdf

Me too and loving it.

However, this page break issue is a real problem for using it seriously (which I would very much like to do), especially since, as far as I can tell, it is the reason why CSS3 columns (aka multiple columns) can't yet be printed (bug 15546).

I wish I had the chops to actually dive into this and help, but hopefully my comment might help persuade someone that fixing this would enable WebKit to be a very cool and serious print rendering engine.

I tried using this on headers (which are also block elements), and it does not seem as if it works there either.

My layout is:
<h2>
<p>

And I regularly get page breaks between the h2 element and the elements following it. If this is fixed, then it could greatly improve printing of Wikipedia pages.
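
The comment does not show the CSS that was tried. A plausible sketch, assuming the intent was to keep each heading on the same page as the paragraph that follows it, would be a page-break-after: avoid rule on the heading:

```html
<!DOCTYPE html>
<html>
<head>
<title>Keep a heading with its first paragraph</title>
<style type="text/css" media="print">
  /* Assumed rule: avoid a page break immediately after a heading. */
  h2 { page-break-after: avoid; }
</style>
</head>
<body>
  <h2>Section title</h2>
  <p>First paragraph of the section, which should start on the same page as its heading.</p>
</body>
</html>
```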

"Trivial" to fix in the new printing code:

--- a/Source/WebCore/rendering/RenderBlock.cpp
+++ b/Source/WebCore/rendering/RenderBlock.cpp
@@ -5952,7 +5952,8 @@ int RenderBlock::applyAfterBreak(RenderBox* child, int logicalOffset, MarginInfo

 int RenderBlock::adjustForUnsplittableChild(RenderBox* child, int logicalOffset, bool includeMargins)
 {
-    bool isUnsplittable = child->isReplaced() || child->scrollsOverflow();
+    bool isUnsplittable = child->isReplaced() || child->scrollsOverflow() ||
+        child->style()->pageBreakInside() == PBAVOID;
     if (!isUnsplittable)
         return logicalOffset;
     int childLogicalHeight = logicalHeightForChild(child) + (includeMargins ? marginBeforeForChild(child) + marginAfterForChild(child) : 0);

(at least that works for me, might need some more sanity checks)

*** Bug 36822 has been marked as a duplicate of this bug. ***

Jendrik Seipp (jendrikseipp) wrote :

Thanks for the report! Can you provide an example PDF? Maybe even the input that generated it? If you can only replicate it with private data not meant for launchpad, you can also send it to me via e-mail.

Changed in rednotebook:
status: New → Incomplete
Jendrik Seipp (jendrikseipp) wrote :

Thanks for the example files. Apparently this is a webkit bug and we will just have to wait for it to be fixed by the webkit devs.

Changed in rednotebook:
status: Incomplete → Confirmed
Changed in webkit:
importance: Unknown → Medium
status: Unknown → Confirmed

Attachment 39635 did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files']" exit_code: 1

Total errors found: 0 in 0 files

If any of these errors are false positives, please file a bug against check-webkit-style.

Jendrik Seipp (jendrikseipp) wrote :

Could you try and export something (that gets split in the PDF export) to HTML, please? Then open the HTML file in Firefox and try printing it (maybe to PDF) there. Does Firefox also cut the text?

Changed in rednotebook:
status: Confirmed → Incomplete
Changed in rednotebook:
status: Incomplete → Confirmed

The patch in comment #17 does seem to do the trick for me when it comes to "page-break-inside: avoid". It seems to resolve bug 35217, no? What would be required to get this upstream?

Are there unit tests for printing? If so, I can try to write one for this, if someone can point me at some existing test code to emulate.

Cheers

summary: - Generating a PDF splits the last line of text between twp pages
+ Generating a PDF splits the last line of text between two pages
Jendrik Seipp (jendrikseipp) wrote :

This bug is still present on Ubuntu 12.10 with RedNotebook 1.6.4.

Jendrik Seipp (jendrikseipp) wrote :

Curiously, the bug report here says the bug has been fixed: https://bugs.webkit.org/show_bug.cgi?id=65005

Nick (nick222-yandex) wrote :

I see it too: Xubuntu 12.10, RNB 1.6.6

When I load

http://www.gtalbot.org/BugzillaSection/Bug132035_Page_Break_Inside_Avoid.html

in Chrome 25.0.1364.172 (under Linux KDE 4.10.1, i686, 32 bits), the test is PASSED in print preview for both Portrait and Landscape page layouts.

(In reply to comment #14)
> As for page-break-after failure,
>
> page-break-after is applicable to block level elements
> http://www.w3.org/TR/css3-page/

page-break-after is now break-after:

"
UAs that conform to [CSS21] must alias the ‘page-break-before’, ‘page-break-after’, and ‘page-break-inside’ properties to ‘break-before’, ‘break-after’, and ‘break-inside’ by treating the ‘page-break-*’ properties as shorthands for the ‘break-*’ properties (...)
"
3.3. Page Break Aliases: the ‘page-break-before’, ‘page-break-after’, and ‘page-break-inside’ properties
http://www.w3.org/TR/2012/WD-css3-break-20120823/#page-break

and break-after still does *not* apply to inline-level elements:
break-after
http://www.w3.org/TR/2012/WD-css3-break-20120823/#break-after

> What should WebKit do?

break-after (or page-break-after) should not apply to inline-level elements.
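
A stylesheet that wants to work with both the old and the new property names could declare both on a block-level element. The following is only a sketch: the class name 'page-end' is made up for illustration, and 'break-after: page' is assumed here to be the counterpart of 'page-break-after: always'.

```html
<!DOCTYPE html>
<html>
<head>
<title>page-break-after and break-after together</title>
<style type="text/css" media="print">
  .page-end {
    page-break-after: always; /* CSS 2.1 name */
    break-after: page;        /* css3-break name, assumed equivalent */
  }
</style>
</head>
<body>
  <!-- The break is requested on block-level elements only. -->
  <p class="page-end">Page 1 of 2</p>
  <p>Page 2 of 2</p>
</body>
</html>
```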

Gérard

BUG: "page-break-inside: avoid"

There is a bug for Chrome when div elements are in a "table".
Try the code below, and you will see :
   - at pages 3,9 and 15: some unexpected space between first and second div;
   - at pages 4,10 and 16: some unexpected space between second and third div;
   - at pages 5,11 and 17: only 2 divs are visible instead of 3;

There is a cyclic behaviour every 6 pages.

Thank you.

**********************************************************
<html>
 <head>
  <style>
   table {
    width: 100%;
   }
   div {
    height: 270px;
    border: solid 2px;
    margin-bottom: 10px;
    page-break-inside: avoid;
   }
  </style>
 </head>
 <body>
  <table><tbody>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
  </tbody></table>

 </body>
</html>

(In reply to comment #23)
> BUG: "page-break-inside: avoid"

hus.buy,

You should most probably post such a comment in
Bug 35217 - CSS attribute page-break-inside: avoid; is not implemented.
and not in here.

>
> There is a bug for Chrome

Which version of Chrome? Under which operating system?

Bug Reporting Guidelines
https://www.webkit.org/quality/bugwriting.html

> when div elements are in a "table".
> Try the code below, and you will see :
> - at pages 3,9 and 15: some unexpected space between first and second div;

Can you measure that vertical gap? Or can you provide a screenshot?

> - at pages 4,10 and 16: some unexpected space between second and third div;
> - at pages 5,11 and 17: only 2 divs are visible instead of 3;
>
> There is a cyclic behavour every 6 pages.
>
> Thank you.
>
> **********************************************************
> <html>

Best is to provide a test and make it accessible (attachment or make it available somewhere on the web); your test should trigger web standards compliant rendering mode and not backward-compatible "quirks" rendering mode. You can do this by starting your test like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<html>

 <head>

> <head>
> <style>

I suggest using this instead:

  <style type="text/css" media="print">

Other points to consider:

Identify
- page scale (it should be 100%),
- paper size (like US letter [216mm wide by 279mm tall; 8½inches by 11inches] or A4 [210mm wide by 297mm tall])
- page orientation (like Portrait)

You may have (or be asked) to provide further details if the problem is not reproducible, such as your vertical page box margins in millimeters, where the page header and page footer are usually printed. Vertical (and horizontal) page box margins are entirely user-settable in Chrome 27 print preview mode.
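
Putting the suggestions above together, a test skeleton might look like this. It is a sketch, not the test that was actually run: the @page rule is an addition that is not mentioned in this comment, is meant only to pin the paper size and margins inside the test itself, and may not be honoured by every browser.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
 <head>
  <title>page-break-inside: avoid inside a table</title>
  <style type="text/css" media="print">
   /* Assumption: fix paper size, orientation and margins in the test. */
   @page {
    size: A4 portrait;
    margin: 20mm;
   }
   table {
    width: 100%;
   }
   div {
    height: 270px;
    border: solid 2px;
    margin-bottom: 10px;
    page-break-inside: avoid;
   }
  </style>
 </head>
 <body>
  <table><tbody>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <tr><td><div>Hello</div></td></tr>
   <!-- Repeat the row as often as the original test does. -->
  </tbody></table>
 </body>
</html>
```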

> table {
> width: 100%;
> }
> div {
> height: 270px;
> border: solid 2px;
> margin-bottom: 10px;
> page-break-inside: avoid;
> }
> </style>
> </head>
> <body>
> <table><tbody>
> <tr><td><div>Hello</div></td></tr>

I created 2 quick versions of your test and I could not reproduce the problem you experienced with Chrome 27.0.1453.110 under Linux KDE 4.10.4, kernel version 3.8.0-25-generic, i686 (32bits). By this, I am *_not_* suggesting you are not experiencing the problem you see or that you should not report the problem you see.

Gérard

The problem can be reproduced in latest Google Chrome (32.0.1700.107 m)

Created attachment 223130
"page-break-inside: avoid" is ignored in Chrome

Created attachment 223131
"page-break-inside: avoid" - works in Firefox 27

Created attachment 223133
"page-break-inside: avoid" - works in IE11

Created attachment 223134
reproduce "page-break-inside: avoid" problem (html+css)
