Parsing the RML with zope.interface takes longer than rendering the PDF with ReportLab

Bug #1142635 reported by Kyle MacFarlane
Affects   Status         Importance  Assigned to  Milestone
z3c.rml   Fix Committed  Undecided   Unassigned

Bug Description

I was looking into why some RML documents were significantly slower to render than others and the slowness seems to come from zope.interface and zope.schema.

Generating a ~100 page PDF with a small table on each page takes the following:

Reading the XML and building the flowables: 13.933 seconds
Rendering the flowables with ReportLab: 7.015 seconds

The problem is that a new class is created for every element and every possible attribute of that element.

Some of the big time wasters:

1.491 seconds for 2.1 million calls to zope.interface.interface.py.getDescriptionFor
2.125 seconds for 2.3m calls to _interface_coptimizations.SpecificationBase.providedBy
1.647 seconds for 2.1m calls to zope.interface.interface.get
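
For anyone wanting to collect comparable per-function figures, here is a minimal sketch using the standard library's cProfile. The `build_flowables` function is a hypothetical stand-in for the parsing step, not z3c.rml code, and `profile_call` is just an illustrative helper name.

```python
import cProfile
import io
import pstats


def build_flowables(n):
    # Hypothetical stand-in for the RML parsing step; not z3c.rml code.
    return [str(i) for i in range(n)]


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by time spent inside each function itself and show the top 10.
    stats.sort_stats("tottime").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(build_flowables, 100000)
    print(report)
```

Sorting by `tottime` puts functions that are cheap per call but called millions of times near the top, which is the pattern in the numbers above.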

A big offender was table cells and once I'd disabled the processStyle method on the TableCell flowable the times dropped to the following:

Reading the XML and building the flowables: 3.128 seconds
Rendering the flowables with ReportLab: 3.104 seconds

I wasn't expecting a change from ReportLab but apparently having no style instead of a copy of the default style is much quicker. I then changed processStyle to only run getAttributes and apply the style if the cell actually has the attribute in the first place.
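
A rough sketch of that guard, assuming a plain ElementTree element and a plain dict for the style. The attribute names and the `process_style` function here are illustrative only and are not the actual z3c.rml implementation.

```python
import xml.etree.ElementTree as ET

# Illustrative attribute names (an assumption); the real list is defined
# by z3c.rml's table cell schema.
STYLE_ATTRIBUTES = frozenset({"fontName", "fontSize", "fontColor", "background"})


def process_style(cell, default_style):
    """Return a per-cell style dict, or None if the cell sets no style attribute."""
    present = STYLE_ATTRIBUTES.intersection(cell.attrib)
    if not present:
        # Fast path: no copy of the default style and no attribute parsing.
        return None
    style = dict(default_style)
    for name in present:
        style[name] = cell.attrib[name]
    return style


row = ET.fromstring('<tr><td>plain</td><td fontSize="12">styled</td></tr>')
styles = [process_style(td, {"fontSize": "10"}) for td in row]
```

Only the second cell gets a style object; the first one skips the work entirely.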

I wonder if improvements can be made to every element by changing RMLDirective.getAttributeValues?

Stephan Richter (srichter-o) wrote :

Hey Kyle, z3c.rml is now on GitHub, so I would love to get Pull Requests for all of your fixes. This way, we can review and discuss them. Also, if you would like to contribute to z3c.rml, would you consider becoming a ZF committer? Then you could make the changes directly to the code base. My only requirements are that all the tests pass and that we keep test coverage close to 100% or wherever it is now.

https://github.com/zopefoundation/z3c.rml

Kyle MacFarlane (kyle-deletethetrees) wrote :

Hi Stephan. I pushed my changes to a fork on GitHub but I'm not sure how to bundle the pull requests together.

Buildout was suffering from dependency hell so I had to change some versions to get it to even install everything. Then quite a few tests were already failing (mostly due to just missing expected PDFs). I fixed all that and got all the tests to pass before adding any new features or other bug fixes, but if I split it up into too many pull requests I think the order in which they are applied will affect the tests. The following 3 pull requests would probably be safest:

1) A huge one with all the buildout and test fixes, font fix, namedString, evalString, table borders, table cell performance, etc.

2) One for SVGs (quite different to the patch I made here).

3) One for printScaling.

I tried to create a test for everything I changed and currently everything is passing.

Stephan Richter (srichter-o) wrote :

Hi Kyle, that sounds like a plan. I understand that it would be hard to split the first pull request. No problem!

Stephan Richter (srichter-o) wrote :

Kyle, could you send me your really large RML file, so I can profile it? There are several caching opportunities in getAttributeValues(), but I need to profile the code to see what the best approach would be.

Regards,
Stephan
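
One possible shape for such caching, as a sketch only: the `Element` class, its `signature` dict and both functions below are hypothetical and do not reflect how getAttributeValues() is actually written. The idea is to compute the per-class attribute list once and then convert only the attributes present on each element.

```python
from functools import lru_cache


class Element:
    """Hypothetical directive class; `signature` stands in for a schema."""
    signature = {"fontSize": int, "fontName": str}


@lru_cache(maxsize=None)
def attribute_converters(cls):
    # Computed once per class instead of once per parsed element.
    return tuple(sorted(cls.signature.items()))


def get_attribute_values(cls, raw):
    """Convert only the attributes actually present on the element."""
    return {
        name: convert(raw[name])
        for name, convert in attribute_converters(cls)
        if name in raw
    }


values = get_attribute_values(Element, {"fontSize": "12"})
```

Whether this is the best approach would depend on what the profile of a real document shows.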

Changed in z3c.rml:
status: New → Fix Committed