handling of Specification result sets with 3000 rows is slow

Bug #632145 reported by Robert Collins
This bug affects 1 person
Affects  Status  Importance  Assigned to  Milestone
Storm    New     Undecided   Unassigned

Bug Description

A query of ours in Launchpad returns 3000 Specification objects.

They are defined like this:
class Specification(SQLBase, BugLinkTargetMixin):
    """See ISpecification."""

    implements(ISpecification, IBugLinkTarget)

    _defaultOrder = ['-priority', 'definition_status', 'name', 'id']

    # db field names
    name = StringCol(unique=True, notNull=True)
    title = StringCol(notNull=True)
    summary = StringCol(notNull=True)
    definition_status = EnumCol(
        schema=SpecificationDefinitionStatus, notNull=True,
        default=SpecificationDefinitionStatus.NEW)
    priority = EnumCol(schema=SpecificationPriority, notNull=True,
        default=SpecificationPriority.UNDEFINED)
    assignee = ForeignKey(dbName='assignee', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    drafter = ForeignKey(dbName='drafter', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    approver = ForeignKey(dbName='approver', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    owner = ForeignKey(
        dbName='owner', foreignKey='Person',
        storm_validator=validate_public_person, notNull=True)
    datecreated = UtcDateTimeCol(notNull=True, default=DEFAULT)
    private = BoolCol(notNull=True, default=False)
    product = ForeignKey(dbName='product', foreignKey='Product',
        notNull=False, default=None)
    productseries = ForeignKey(dbName='productseries',
        foreignKey='ProductSeries', notNull=False, default=None)
    distribution = ForeignKey(dbName='distribution',
        foreignKey='Distribution', notNull=False, default=None)
    distroseries = ForeignKey(dbName='distroseries',
        foreignKey='DistroSeries', notNull=False, default=None)
    goalstatus = EnumCol(schema=SpecificationGoalStatus, notNull=True,
        default=SpecificationGoalStatus.PROPOSED)
    goal_proposer = ForeignKey(dbName='goal_proposer', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    date_goal_proposed = UtcDateTimeCol(notNull=False, default=None)
    goal_decider = ForeignKey(dbName='goal_decider', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    date_goal_decided = UtcDateTimeCol(notNull=False, default=None)
    milestone = ForeignKey(dbName='milestone',
        foreignKey='Milestone', notNull=False, default=None)
    specurl = StringCol(notNull=False, default=None)
    whiteboard = StringCol(notNull=False, default=None)
    direction_approved = BoolCol(notNull=True, default=False)
    man_days = IntCol(notNull=False, default=None)
    implementation_status = EnumCol(
        schema=SpecificationImplementationStatus, notNull=True,
        default=SpecificationImplementationStatus.UNKNOWN)
    superseded_by = ForeignKey(dbName='superseded_by',
        foreignKey='Specification', notNull=False, default=None)
    completer = ForeignKey(dbName='completer', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    date_completed = UtcDateTimeCol(notNull=False, default=None)
    starter = ForeignKey(dbName='starter', notNull=False,
        foreignKey='Person',
        storm_validator=validate_public_person, default=None)
    date_started = UtcDateTimeCol(notNull=False, default=None)

    # useful joins
    mentoring_offers = SQLMultipleJoin(
            'MentoringOffer', joinColumn='specification', orderBy='id')
    subscriptions = SQLMultipleJoin('SpecificationSubscription',
        joinColumn='specification', orderBy='id')
    subscribers = SQLRelatedJoin('Person',
        joinColumn='specification', otherColumn='person',
        intermediateTable='SpecificationSubscription',
        orderBy=['displayname', 'name'])
    feedbackrequests = SQLMultipleJoin('SpecificationFeedback',
        joinColumn='specification', orderBy='id')
    sprint_links = SQLMultipleJoin('SprintSpecification', orderBy='id',
        joinColumn='specification')
    sprints = SQLRelatedJoin('Sprint', orderBy='name',
        joinColumn='specification', otherColumn='sprint',
        intermediateTable='SprintSpecification')
    bug_links = SQLMultipleJoin(
        'SpecificationBug', joinColumn='specification', orderBy='id')
    bugs = SQLRelatedJoin('Bug',
        joinColumn='specification', otherColumn='bug',
        intermediateTable='SpecificationBug', orderBy='id')
    linked_branches = SQLMultipleJoin('SpecificationBranch',
        joinColumn='specification',
        orderBy='id')
    spec_dependency_links = SQLMultipleJoin('SpecificationDependency',
        joinColumn='specification', orderBy='id')

    dependencies = SQLRelatedJoin('Specification', joinColumn='specification',
        otherColumn='dependency', orderBy='title',
        intermediateTable='SpecificationDependency')
    blocked_specs = SQLRelatedJoin('Specification', joinColumn='dependency',
        otherColumn='specification', orderBy='title',
        intermediateTable='SpecificationDependency')

The query takes 500ms to run on the server (OK, that's a little slow) but takes between 10000ms and 12000ms to iterate in Python.
kcachegrind says the following (% relative to parent):
calls  method         time-%-parent
3076   _load_object   87%
3072   get_obj_info   43%
3072   _add_to_alive  4%
3072   _set_values    33%
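
(A rough sketch of how a profile like the above can be gathered - iterate the result set under cProfile, the stdlib wrapper around lsprof, so that _load_object and friends show up. The imports mirror the timeit snippets further down; the top-20 cutoff is arbitrary.)

import cProfile
import pstats

from canonical.launchpad.interfaces.lpstorm import IMasterStore
from lp.blueprints.model.specification import Specification

store = IMasterStore(Specification)
resultset = store.find(Specification)

profiler = cProfile.Profile()
# list() forces Storm to hydrate every row; that is where the time goes.
profiler.runcall(list, resultset)
pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)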

Bug 618367 is the Launchpad-side bug report with some details about this (but it's about the page performance per se, so I may be closing that bug if I find a workaround / make the page better some other way).

Revision history for this message
Robert Collins (lifeless) wrote :

I've been doing some investigation and I'm going to narrow this bug right down - getting rid of all the joins still gives us a 10-12 second (under lsprof) parse time for a 3000 row result set. I'm attaching the kcachegrind which shows this pretty clearly.

To reproduce, grab lp db-stable/devel and do a query for Specifications which returns 3000 of them; for some reason that's really quite slow - 4ms each or thereabouts.

summary: - handling of result sets with duplicate objects is slow
+ handling of Specification result sets with 3000 rows is slow
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

So one factor - 3000 calls to _set_values turn into 50K calls to set_variable.
And 100000K calls to checkpoint.

Now, profiles in Python do penalise method calls a bit, but even so an OOPS without profiling shows the same time gap, so I think the profile is likely reasonable in this case.

Revision history for this message
Robert Collins (lifeless) wrote :

Another data point - one I speculated about in my perf-tuesday mail to the lp dev list - is that the overhead we are seeing has to do with cache updates and other infrastructure for supporting and generating *writes* to the DB. If that's the case, we'd like to bypass that completely: these are read-only requests, and when we are instantiating all these objects that overhead is (in this context) waste.
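
(A minimal sketch of the bypass idea - not something we do today, and it assumes the SQLObject-compat *ID columns can be used as Storm expressions here: for a read-only page, ask the result set for plain column values instead of hydrating Specification objects, so no per-row obj_info / variable / checkpoint bookkeeping gets created.)

from canonical.launchpad.interfaces.lpstorm import IMasterStore
from lp.blueprints.model.specification import Specification

store = IMasterStore(Specification)
resultset = store.find(Specification)
# values() yields raw tuples straight from the query, skipping object
# construction entirely.
people_ids = list(resultset.values(
    Specification.approverID, Specification.assigneeID,
    Specification.drafterID))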

Revision history for this message
Robert Collins (lifeless) wrote :

Looking through the profile some more:
1297 of the calls to get_obj_info trigger Launchpad's enumcol.__init__;
these 1297 calls represent 10% of the runtime.
From those 1297 calls, 12333 calls were made to enumcol.__init__ (roughly 10 each time).
All but two of those calls made it into parse_set; that resulted in 7559 calls to zope's security proxy framework - which means (here is the code):
    def parse_set(self, value, from_db):
        if from_db:
            return self._enum.items[value]
        else:
            if not zope_isinstance(value, DBItem):
                raise TypeError("Not a DBItem: %r" % (value,))
            if self._enum != value.enum:
                raise TypeError("DBItem from wrong type, %r != %r" % (
                        self._enum.name, value.enum.name))
            return value

that 7559 of those calls were not for from_db values - they took the zope_isinstance path.

Now, I suspect that this is because many rows will have the same enums; I wonder if perhaps we'd save non-trivially by actually using ints there? Or perhaps we should make a C version of enums?
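
(One way to chase the "many rows share the same enum values" angle without dropping to raw ints - purely a sketch, not Launchpad code, and it assumes DBItem values are long-lived singletons and the same surrounding names (zope_isinstance, DBItem, self._enum) as the snippet above - would be to memoise the non-from_db validation so an identical DBItem only pays for zope_isinstance once:)

    def parse_set(self, value, from_db):
        if from_db:
            return self._enum.items[value]
        # Cache of DBItems this column has already validated; safe only
        # because DBItems are shared class-level singletons.
        cache = getattr(self, '_validated_items', None)
        if cache is None:
            cache = self._validated_items = set()
        if id(value) not in cache:
            if not zope_isinstance(value, DBItem):
                raise TypeError("Not a DBItem: %r" % (value,))
            if self._enum != value.enum:
                raise TypeError("DBItem from wrong type, %r != %r" % (
                        self._enum.name, value.enum.name))
            cache.add(id(value))
        return value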

Revision history for this message
Robert Collins (lifeless) wrote :

FWIW there are four EnumCol objects on specification:

    definition_status = EnumCol(
        schema=SpecificationDefinitionStatus, notNull=True,
        default=SpecificationDefinitionStatus.NEW)

    priority = EnumCol(schema=SpecificationPriority, notNull=True,
        default=SpecificationPriority.UNDEFINED)

    goalstatus = EnumCol(schema=SpecificationGoalStatus, notNull=True,
        default=SpecificationGoalStatus.PROPOSED)

    implementation_status = EnumCol(
        schema=SpecificationImplementationStatus, notNull=True,
        default=SpecificationImplementationStatus.UNKNOWN)

Revision history for this message
Robert Collins (lifeless) wrote :

So, the trace strongly suggests areas of inefficiency, but on my laptop (which is latest-gen hardware) 3000 specs only take a second - my machine /might/ be 10 times faster than staging, which would make this rather moot; but even at a second it's still slower than desirable. I will dig deeper.

>>> timer = timeit.Timer('[(row.approverID, row.assigneeID, row.drafterID) for row in resultset]', 'from canonical.launchpad.interfaces.lpstorm import IMasterStore; from lp.blueprints.model.specification import Specification; store=IMasterStore(Specification); store.reset(); resultset = store.find(Specification)')
>>> timer.repeat(3,1)
[1.2493259906768799, 1.3308241367340088, 1.197606086730957]

from my machine (3000 specs).

Revision history for this message
Robert Collins (lifeless) wrote :

For use on a lp dataset:

timer = timeit.Timer('[(row.approverID, row.assigneeID, row.drafterID) for row in resultset]', 'from canonical.launchpad.interfaces.lpstorm import IMasterStore; from lp.blueprints.model.specification import Specification; store=IMasterStore(Specification); store.reset(); resultset = store.find(Specification, Specification.id < 3000)')

Revision history for this message
Robert Collins (lifeless) wrote :

12:19 < spm> lifeless: [43.454903841018677, 5.9508621692657471, 5.5013749599456787] & just again: [4.1124029159545898, 3.8747930526733398, 4.352078914642334]

from staging; the 43 is probably the DB cache being cold again (check-teamparticipation is a db-scanning hog :P). We're seeing 4-5 seconds to process 2800 rows, which is a lot better than 10 seconds, but still well short of what we need.

Revision history for this message
Robert Collins (lifeless) wrote :

So, we're having load issues on staging - bug 633713.

This is distorting the reporting and profiling; I suspect we'll just end up closing this bug, though I'm sure there is room to improve here. Some stats chasing the Python defaults case James suggested:
with default= on our column definitions:

14:44 < spm> [1.9420769214630127, 1.9923639297485352, 1.8874919414520264] [1.9391319751739502, 1.9590089321136475, 1.9437789916992188] [1.9612140655517578, 2.0064890384674072, 2.0046439170837402]

Without default=

14:47 < spm> [1.8994817733764648, 2.0047039985656738, 2.034653902053833, 2.0163819789886475, 1.8003358840942383, 1.923363208770752, 1.8516139984130859, 1.9805231094360352, 1.8612921237945557]

i.e. very fast. Am now going to get profiles with no load and see how it looks.

Revision history for this message
Robert Collins (lifeless) wrote :

So, with almost no load, in the appserver we're seeing 6 seconds between receiving the raw result and issuing the next query - and the next query is issued immediately, directly after a listification of the former.
https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1713S93
We know the base component can do it in 2 seconds, but something is adding another 4000ms of overhead, or about 1.3ms per object, when under the profiler.

However,
https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1713S98
shows that without the profilers, it's still 3.5 seconds to do:
list(specs)
for spec in specs:
   pull out the person IDs
compile the next query

And I trust the profiler to give us relative costs here.
So I think we definitely need to improve; the problem is going to be what, exactly.

It's also possible that the multithreaded environment we're in in the appservers is punishing what Storm does in some fashion, causing unnecessary delays. Am going to get per-thread CPU times into OOPS reports so we can do some simple subtraction to see.
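
(For that subtraction, something along these lines shows the shape of it - a modern sketch using time.thread_time() rather than whatever ends up in the OOPS machinery: compare wall-clock time with the thread's own CPU time around the result-set iteration; a large gap means the thread was waiting rather than working.)

import time

def timed(fn, *args, **kwargs):
    # Sketch: wall-clock vs per-thread CPU time for one call,
    # e.g. timed(list, store.find(Specification)).
    wall0 = time.monotonic()
    cpu0 = time.thread_time()  # CPU time consumed by this thread only
    result = fn(*args, **kwargs)
    wall = time.monotonic() - wall0
    cpu = time.thread_time() - cpu0
    print("wall=%.3fs cpu=%.3fs waiting=%.3fs" % (wall, cpu, wall - cpu))
    return result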
