DB unfriendly implementation of RevisionAuthor email linker

Bug #391033 reported by Stuart Bishop
This bug affects 1 person
Affects: Launchpad itself
Status: Triaged
Importance: High
Assigned to: Stuart Bishop
Milestone: none

Bug Description

The current implementation of RevisionAuthorEmailLinker in garbo.py does not distribute work evenly, causing some transactions to run for 80 or 90 seconds. It is also very inefficient.

A better implementation would be to calculate the set of linkable RevisionAuthor records, storing them in a temporary table. We can then perform the updates in chunks.

In __init__:

CREATE TEMPORARY TABLE LinkableRevisionAuthor (
    id serial PRIMARY KEY,
    revision_author integer,
    person integer);
-- Collect every unlinked RevisionAuthor whose email matches a known address.
INSERT INTO LinkableRevisionAuthor (revision_author, person)
SELECT RevisionAuthor.id, EmailAddress.person
FROM RevisionAuthor, EmailAddress
WHERE RevisionAuthor.person IS NULL
AND lower(RevisionAuthor.email) = lower(EmailAddress.email)

In __call__:

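-- Link one chunk of authors; BETWEEN is inclusive, hence the "- 1".
UPDATE RevisionAuthor
SET person = LinkableRevisionAuthor.person
FROM (
    SELECT revision_author, person
    FROM LinkableRevisionAuthor
    WHERE id BETWEEN self.offset AND self.offset + chunk_size - 1
    ) AS LinkableRevisionAuthor
WHERE RevisionAuthor.id = LinkableRevisionAuthor.revision_author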
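
The two steps above could be wired together roughly as follows. This is only a sketch under stated assumptions, not the garbo.py code: the class name ChunkedAuthorLinker, the is_done() method and the chunk_size default are hypothetical, and Python's sqlite3 module stands in for PostgreSQL so the example runs on its own (which is why the UPDATE uses a correlated subquery instead of UPDATE ... FROM).

```python
import sqlite3


class ChunkedAuthorLinker:
    """Hypothetical stand-in for the proposed linker; not Launchpad code."""

    def __init__(self, conn, chunk_size=2):
        self.conn = conn
        self.chunk_size = chunk_size
        self.offset = 1  # AUTOINCREMENT ids start at 1
        # Compute the full set of linkable authors once, up front.
        conn.execute("""
            CREATE TEMPORARY TABLE LinkableRevisionAuthor (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                revision_author INTEGER,
                person INTEGER)""")
        conn.execute("""
            INSERT INTO LinkableRevisionAuthor (revision_author, person)
            SELECT RevisionAuthor.id, EmailAddress.person
            FROM RevisionAuthor
            JOIN EmailAddress
              ON lower(RevisionAuthor.email) = lower(EmailAddress.email)
            WHERE RevisionAuthor.person IS NULL
            ORDER BY RevisionAuthor.id""")
        (self.max_id,) = conn.execute(
            "SELECT coalesce(max(id), 0) FROM LinkableRevisionAuthor"
        ).fetchone()
        conn.commit()

    def is_done(self):
        # Finished once the offset has moved past the last temp-table row.
        return self.offset > self.max_id

    def __call__(self):
        # Each call links at most chunk_size authors in its own transaction.
        low, high = self.offset, self.offset + self.chunk_size - 1
        self.conn.execute("""
            UPDATE RevisionAuthor
            SET person = (
                SELECT l.person FROM LinkableRevisionAuthor AS l
                WHERE l.revision_author = RevisionAuthor.id
                  AND l.id BETWEEN ? AND ?)
            WHERE id IN (
                SELECT revision_author FROM LinkableRevisionAuthor
                WHERE id BETWEEN ? AND ?)""", (low, high, low, high))
        self.conn.commit()
        self.offset = high + 1
```

A caller would invoke the object repeatedly until is_done() returns true. Because the temporary table's ids are dense, every transaction touches a bounded number of rows, which is the even distribution of work the description asks for.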

Tags: lp-code
Stuart Bishop (stub)
Changed in launchpad-code:
assignee: nobody → Stuart Bishop (stub)
milestone: none → 2.2.7
status: New → Triaged
Tim Penhey (thumper)
Changed in launchpad-code:
milestone: 2.2.7 → 2.2.8
Tim Penhey (thumper) wrote :

Hi Stuart,

If this hasn't yet been done, or won't be done for 3.0, can you kick it to 3.1.10 please?

Changed in launchpad-code:
milestone: 2.2.8 → 3.0
Stuart Bishop (stub)
Changed in launchpad-code:
importance: Undecided → Medium
milestone: 3.0 → 3.1.10
Diogo Matsubara (matsubara) wrote :

Moved to 3.1.11 so I can close 3.1.10 milestone

Changed in launchpad-code:
milestone: 3.1.10 → 3.1.11
Tim Penhey (thumper)
Changed in launchpad-code:
milestone: 3.1.11 → 3.1.13
Curtis Hovey (sinzui)
Changed in launchpad-code:
milestone: 3.1.13 → 10.01
Tim Penhey (thumper) wrote :

Stuart, is this actually on your radar?

Changed in launchpad-code:
milestone: 10.01 → 10.02
Stuart Bishop (stub) wrote :

I'll get there eventually. I don't think this has been linked to any appserver timeouts, so it isn't a high priority.

Changed in launchpad-code:
milestone: 10.02 → none
Robert Collins (lifeless) wrote :

We have regular timeouts on branch/revision insertion, which provides the link to timeouts I think.

Changed in launchpad:
importance: Medium → High
Robert Collins (lifeless) wrote :

The other possible cause of the timeouts is the scanner; it is unclear which one is at fault.
