BranchMergeProposal:+index timeouts - slow query plan

Bug #742916 reported by Robert Collins on 2011-03-26
26
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Critical
Unassigned

Bug Description

OOPS-1910H1993
https://code.launchpad.net/%7Edanilo/launchpad/bug-474874/+merge/14450/+index

55. 1171 9986ms SQL-launchpad-main-slave SELECT BranchRevision.branch,
       BranchRevision.revision,
       BranchRevision.sequence,
       Revision.date_created,
       Revision.gpgkey,
       Revision.id,
       Revision.karma_allocated,
       Revision.log_body,
       Revision.revision_author,
       Revision.revision_date,
       Revision.revision_id
FROM BranchRevision,
     Revision
WHERE BranchRevision.branch = 273968
  AND BranchRevision.sequence IS NOT NULL
  AND BranchRevision.revision = Revision.id
  AND Revision.revision_date >= '2009-11-05 00:21:45.372993+00:00'
  AND Revision.revision_date <= '2009-11-06 15:52:25.310401+00:00'
ORDER BY BranchRevision.sequence;
                                                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort (cost=52988.39..52997.98 rows=3837 width=216) (actual time=23147.438..23147.438 rows=0 loops=1)
   Sort Key: branchrevision.sequence
   Sort Method: quicksort Memory: 25kB
   -> Hash Join (cost=25279.66..52759.98 rows=3837 width=216) (actual time=23147.425..23147.425 rows=0 loops=1)
         Hash Cond: (branchrevision.revision = revision.id)
         -> Index Scan using revision__branch__revision__key on branchrevision (cost=0.00..27384.39 rows=3837 width=12) (actual time=93.622..549.485 rows=9833 loops=1)
               Index Cond: (branch = 273968)
               Filter: (sequence IS NOT NULL)
         -> Hash (cost=25155.69..25155.69 rows=9918 width=204) (actual time=22583.927..22583.927 rows=9722 loops=1)
               -> Index Scan using revision__revision_date__idx on revision (cost=0.00..25155.69 rows=9918 width=204) (actual time=53.930..22557.397 rows=9722 loops=1)
                     Index Cond: ((revision_date >= '2009-11-05 00:21:45.372993'::timestamp without time zone) AND (revision_date <= '2009-11-06 15:52:25.310401'::timestamp without time zone))
 Total runtime: 23147.594 ms
(12 rows)

Time: 23283.795 ms

Robert Collins (lifeless) wrote :

This depends on hot disk cache to perform well - we're bringing back 10K rows from both tables to do the date filtering.

It may be possible to change the query, or we could pull revision_date into branchrevision

tags: added: dba
Aaron Bentley (abentley) wrote :

We could find the revno of the newest revision older than $DATE and then retrieve all BranchRevisions with a higher revno. Would that be more efficient?

Robert Collins (lifeless) wrote :

That sounds plausible yes.

Robert Collins (lifeless) wrote :

Another possibility would be a recursive query. I've no intuition about what would be more efficient.

tags: removed: dba
Stuart Bishop (stub) wrote :

I suspect the query is broken.

Revision.revision_date comes from the system timestamp where the revision was committed. With clock problems and merging revisions from other branches, the revision_date is not always in order. To retrieve a set of revisions that landed on a branch during a particular timeframe, we need to remember the min and max of BranchRevision.sequence during that timeframe. Or store BranchRevision.date_created as the time the branch scanner loaded the BranchRevision record into the database.

Gary Poster (gary) wrote :
Download full text (3.9 KiB)

I looked into Aaron's idea for comment #2. Here's the implementation I tried.

EXPLAIN ANALYZE WITH first AS (
  SELECT min(Revision.id) AS id
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 152968 AND Revision.revision_date >= '2009-08-10 21:56:30.623339+00:00'
), last AS (
  SELECT max(Revision.id) AS id
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 152968 AND Revision.revision_date <= '2009-08-11 23:51:08.021106+00:00'
)
SELECT
       BranchRevision.branch,
       BranchRevision.revision,
       BranchRevision.sequence,
       Revision.date_created,
       Revision.gpgkey,
       Revision.id,
       Revision.karma_allocated,
       Revision.log_body,
       Revision.revision_author,
       Revision.revision_date,
       Revision.revision_id
FROM BranchRevision,
     Revision
WHERE BranchRevision.branch = 152968
  AND BranchRevision.sequence IS NOT NULL
  AND BranchRevision.revision = Revision.id
  AND BranchRevision.revision >= (SELECT id FROM first)
  AND BranchRevision.revision <= (SELECT id FROM last);

 Nested Loop (cost=363833.91..394623.20 rows=22 width=215) (actual time=1.974..1.980 rows=1 loops=1)
   CTE first
     -> Aggregate (cost=181916.93..181916.94 rows=1 width=4) (actual time=0.928..0.928 rows=1 loops=1)
           -> Nested Loop (cost=0.00..181862.47 rows=21781 width=4) (actual time=0.883..0.922 rows=1 loops=1)
                 -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..30513.07 rows=21781 width=4) (actual time=0.012..0.102 rows=103 loops=1)
                       Index Cond: (branch = 152968)
                 -> Index Scan using changeset_pkey on revision (cost=0.00..6.94 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=103)
                       Index Cond: (public.revision.id = public.branchrevision.revision)
                       Filter: (public.revision.revision_date >= '2009-08-10 21:56:30.623339'::timestamp without time zone)
   CTE last
     -> Aggregate (cost=181916.93..181916.94 rows=1 width=4) (actual time=0.910..0.911 rows=1 loops=1)
           -> Nested Loop (cost=0.00..181862.47 rows=21781 width=4) (actual time=0.027..0.866 rows=103 loops=1)
                 -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..30513.07 rows=21781 width=4) (actual time=0.012..0.072 rows=103 loops=1)
                       Index Cond: (branch = 152968)
                 -> Index Scan using changeset_pkey on revision (cost=0.00..6.94 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=103)
                       Index Cond: (public.revision.id = public.branchrevision.revision)
                       Filter: (public.revision.revision_date <= '2009-08-11 23:51:08.021106'::timestamp without time zone)
   InitPlan 3 (returns $2)
     -> CTE Scan on first (cost=0.00..0.02 rows=1 width=4) (actual time=0.932..0.933 rows=1 loops=1)
   InitPlan 4 (returns $3)
     -> CTE Scan on last (cost=0.00..0.02 rows=1 width=4) (actual time=0.911..0.912 rows=1 loops=1)
   -> Index Scan using revisionnumber_branch_seq...

Read more...

Robert Collins (lifeless) wrote :

So, how to read:

Nested Loop (cost=0.00..181862.47 rows=21781 width=4) (actual time=0.883..0.922 rows=1 loops=1)

The first bracketed expression is the estimated work to happen: in this case '20K rows to examine, each row 4 units wide, starting at the beginning of the query (0.00) and going for 181K units [costs on the same machine are comparable, but they are not in units of time unless some -very- careful calibration is done].

The second bracketed expression is the actual sampled data. In this case, the work started 0.8ms into the query and finished at 0.9ms. 1 row was returned and one loop executed.

Robert Collins (lifeless) wrote :

I don't know enough about our schema here to know if we can assume monotonically increasing revision ids mapping to graph range queries. I think we can't.

Aaron was suggesting a revno - aka. 'BranchRevision.sequence' constraint, not a revision constraint.

That said, its worth looking at the performance of this query.

It has three parts - two CTE's and the main body.

 CTE first - estimated to find one row but examine up to 20K. cold cache we seem to spend about 2ms per cold row, so thats up to 40 seconds cold (based on planner stats).
Its doing an index scan on revisionnumber_branch_sequence_unique which is (branch, sequence).
This considered 103 rows.
For each row it looked up the revision id directly to check the date. - it was fast because only 103 rows existed in the branch: its a very small branch :)

CTE last is very similar.

So the costs for both first and last in your formulation are potentially very high.

Robert Collins (lifeless) wrote :
Download full text (3.2 KiB)

Branch 492361 is a recent branch of lp available on qastaging;

Running just the first CTE against it (with an old date - but this is what long running merge proposals will do, so its not unreasonable to assess its performance):

explain analyze SELECT min(Revision.id) AS id
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 492361 AND Revision.revision_date >= '2009-08-10 21:56:30.623339+00:00';

                                                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=176644.63..176644.64 rows=1 width=4) (actual time=109560.941..109560.942 rows=1 loops=1)
   -> Nested Loop (cost=0.00..176591.63 rows=21199 width=4) (actual time=4732.780..109534.701 rows=40649 loops=1)
         -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..29518.34 rows=21199 width=4) (actual time=60.252..995.287 rows=105756 loops=1)
               Index Cond: (branch = 492361)
         -> Index Scan using changeset_pkey on revision (cost=0.00..6.93 rows=1 width=4) (actual time=1.024..1.025 rows=0 loops=105756)
               Index Cond: (revision.id = branchrevision.revision)
               Filter: (revision.revision_date >= '2009-08-10 21:56:30.623339'::timestamp without time zone)
 Total runtime: 109561.079 ms

-> nearly 2 minutes.

Branch revision cannot filter on date; and revision cannot be selective on branch, which is why the db is (sensibly) picking the most selective thing it has (branchrevision.branch=branch) and working from there.

Hot, this is much faster:

explain analyze SELECT min(Revision.id) AS id
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 492361 AND Revision.revision_date >= '2009-08-10 21:56:30.623339+00:00';
                                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=176644.63..176644.64 rows=1 width=4) (actual time=849.188..849.189 rows=1 loops=1)
   -> Nested Loop (cost=0.00..176591.63 rows=21199 width=4) (actual time=100.800..840.951 rows=40649 loops=1)
         -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..29518.34 rows=21199 width=4) (actual time=0.147..72.145 rows=105756 loops=1)
               Index Cond: (branch = 492361)
         -> Index Scan using changeset_pkey on revision (cost=0.00..6.93 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=105756)
               Index Cond: (revision.id = branchrevision.revision)
               Filter: (revision.revision_date >= '2009-08-10 21:56:30.623339'::timestamp without time zone)...

Read more...

Robert Collins (lifeless) wrote :

Aaron's suggestion was:
' We could find the revno of the newest revision older than $DATE and then retrieve all BranchRevisions with a higher revno.'

So, that revno from first principles is:
SELECT max(branchrevision.sequence) as revno
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 492361 AND branchrevision.sequence is not null and Revision.revision_date < '2009-08-10 21:56:30.623339+00:00';

This is still expensive - note the estimated cost (and slow hot performance - nearly 1 second).
 explain analyze SELECT max(branchrevision.sequence) as revno
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 492361 AND Revision.revision_date < '2009-08-10 21:56:30.623339+00:00';
                                                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=176644.63..176644.64 rows=1 width=4) (actual time=906.960..906.961 rows=1 loops=1)
   -> Nested Loop (cost=0.00..176591.63 rows=21199 width=4) (actual time=0.068..891.013 rows=65107 loops=1)
         -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..29518.34 rows=21199 width=8) (actual time=0.042..66.305 rows=105756 loops=1)
               Index Cond: (branch = 492361)
         -> Index Scan using changeset_pkey on revision (cost=0.00..6.93 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=105756)
               Index Cond: (revision.id = branchrevision.revision)
               Filter: (revision.revision_date < '2009-08-10 21:56:30.623339'::timestamp without time zone)
 Total runtime: 907.041 ms

you can see the plan is nearly identical - all

Robert Collins (lifeless) wrote :

bah, wrong explain in my prior comment - here is the right one:

explain analyze SELECT max(branchrevision.sequence) as revno
  FROM Revision JOIN BranchRevision ON BranchRevision.revision = Revision.id
  WHERE BranchRevision.branch = 492361 AND branchrevision.sequence is not null and Revision.revision_date < '2009-08-10 21:56:30.623339+00:00';
                                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=61101.68..61101.69 rows=1 width=4) (actual time=197.653..197.654 rows=1 loops=1)
   -> Nested Loop (cost=0.00..61091.07 rows=4245 width=4) (actual time=0.068..194.178 rows=9084 loops=1)
         -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..29518.34 rows=4245 width=8) (actual time=0.043..70.559 rows=13262 loops=1)
               Index Cond: (branch = 492361)
               Filter: (sequence IS NOT NULL)
         -> Index Scan using changeset_pkey on revision (cost=0.00..7.43 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=13262)
               Index Cond: (revision.id = branchrevision.revision)
               Filter: (revision.revision_date < '2009-08-10 21:56:30.623339'::timestamp without time zone)
 Total runtime: 197.746 ms

This is half the estimated cost, which is an improvement - and 20% of the measured hot cost, which suggests quite some scattered data.

Robert Collins (lifeless) wrote :

Anyhow, tying that back into Aaron's question - no, its not cheaper than what we do today (because the IS NOT NULL optimisation can be applied today) - the logic to determine 'highest sequence number with a date older than $proposal state' itself involves loading -all- revisions in the mainline.

Robert Collins (lifeless) wrote :
Download full text (3.2 KiB)

this is a recursive implementation - loads as many revs as are needed - and could be trivially tweaked to return the referenced revisions inline..
 explain analyze WITH RECURSIVE mp_revs AS (
    select sequence as sequence, revision as revision from branchrevision where branch=492361 and sequence = (select max(sequence) from branchrevision where branch=492361)
  UNION
   select branchrevision.sequence, branchrevision.revision from mp_revs, branchrevision, revision where branchrevision.branch=492361 and branchrevision.revision=revision.id and branchrevision.sequence=mp_revs.sequence-1 and Revision.revision_date >= '2009-08-10 21:56:30.623339+00:00'
)
SELECT sequence, revision FROM mp_revs;
                                                                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on mp_revs (cost=12367.71..12368.13 rows=21 width=8) (actual time=46.900..161.402 rows=4178 loops=1)
   CTE mp_revs
     -> Recursive Union (cost=1.40..12367.71 rows=21 width=8) (actual time=46.896..156.775 rows=4178 loops=1)
           -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=1.40..122.42 rows=1 width=8) (actual time=46.887..46.889 rows=1 loops=1)
                 Index Cond: ((branch = 492361) AND (sequence = $2))
                 InitPlan 2 (returns $2)
                   -> Result (cost=1.39..1.40 rows=1 width=0) (actual time=46.854..46.854 rows=1 loops=1)
                         InitPlan 1 (returns $1)
                           -> Limit (cost=0.00..1.39 rows=1 width=4) (actual time=46.845..46.846 rows=1 loops=1)
                                 -> Index Scan Backward using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..29518.34 rows=21199 width=4) (actual time=46.843..46.843 rows=1 loops=1)
                                       Index Cond: (branch = 492361)
                                       Filter: (sequence IS NOT NULL)
           -> Nested Loop (cost=0.00..1224.49 rows=2 width=8) (actual time=0.021..0.024 rows=1 loops=4178)
                 -> Nested Loop (cost=0.00..1210.61 rows=2 width=8) (actual time=0.011..0.013 rows=1 loops=4178)
                       -> WorkTable Scan on mp_revs (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.001 rows=1 loops=4178)
                       -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..121.03 rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=4178)
                             Index Cond: ((public.branchrevision.branch = 492361) AND (public.branchrevision.sequence = (mp_revs.sequence - 1)))
                 -> Index Scan using changeset_pkey on revision (cost=0.00..6.93 rows=1 width=4) (actual time=0.008..0.009 rows=1 loops=4178)
                       Index Cond: (revision.id = public.branchrevision.revision)
                       Filter: (revision.revisi...

Read more...

Robert Collins (lifeless) wrote :
Download full text (3.4 KiB)

Switching to staging, for cold cache performance - because recursive estimates are probably not trustworthy...

explain analyze WITH RECURSIVE mp_revs AS (
    select sequence as sequence, revision as revision from branchrevision where branch=492361 and sequence = (select max(sequence) from branchrevision where branch=492361)
  UNION
   select branchrevision.sequence, branchrevision.revision from mp_revs, branchrevision, revision where branchrevision.branch=492361 and branchrevision.revision=revision.id and branchrevision.sequence=mp_revs.sequence-1 and Revision.revision_date >= '2009-08-10 21:56:30.623339+00:00'
)
SELECT sequence, revision FROM mp_revs;
                                                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on mp_revs (cost=13155.48..13155.90 rows=21 width=8) (actual time=1979.814..25504.341 rows=4180 loops=1)
   CTE mp_revs
     -> Recursive Union (cost=1.41..13155.48 rows=21 width=8) (actual time=1979.810..25497.654 rows=4180 loops=1)
           -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=1.41..130.23 rows=1 width=8) (actual time=1979.791..1979.793 rows=1 loops=1)
                 Index Cond: ((branch = 492361) AND (sequence = $2))
                 InitPlan 2 (returns $2)
                   -> Result (cost=1.40..1.41 rows=1 width=0) (actual time=1976.047..1976.048 rows=1 loops=1)
                         InitPlan 1 (returns $1)
                           -> Limit (cost=0.00..1.40 rows=1 width=4) (actual time=1976.037..1976.038 rows=1 loops=1)
                                 -> Index Scan Backward using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..30513.07 rows=21781 width=4) (actual time=1976.034..1976.034 rows=1 loops=1)
                                       Index Cond: (branch = 492361)
                                       Filter: (sequence IS NOT NULL)
           -> Nested Loop (cost=0.00..1302.48 rows=2 width=8) (actual time=5.606..5.621 rows=1 loops=4180)
                 -> Nested Loop (cost=0.00..1288.59 rows=2 width=8) (actual time=0.023..0.025 rows=1 loops=4180)
                       -> WorkTable Scan on mp_revs (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.001 rows=1 loops=4180)
                       -> Index Scan using revisionnumber_branch_sequence_unique on branchrevision (cost=0.00..128.82 rows=1 width=8) (actual time=0.020..0.021 rows=1 loops=4180)
                             Index Cond: ((public.branchrevision.branch = 492361) AND (public.branchrevision.sequence = (mp_revs.sequence - 1)))
                 -> Index Scan using changeset_pkey on revision (cost=0.00..6.94 rows=1 width=4) (actual time=5.581..5.593 rows=1 loops=4180)
                       Index Cond: (revision.id = public.branchrevision.revision)
                       Filter: (revision.revision_...

Read more...

Gary Poster (gary) wrote :

Thank you, Robert. That was more than I was hoping for!

The recursive solution is cool, but AFAICT it does not meaningfully solve the bug. It is 25.5 seconds from a cold cache on staging. The original bug description is concerned about a 23.2 second query on production, also from a cold cache.

This really is an interesting bug. For other timeouts, we are usually more interested in warm cache problems. This bug points out that we will not have zero OOPS until cold cache lookups are fast too. That's a much higher standard than we've had to deal with so far (and in fact a higher standard than I've ever used before, in any project, perhaps sadly). Another bug I just looked at, bug 787294, is "fixed" because a warm cache lookup is now under our thresholds. If we had applied a cold cache constraint to that query, there's no way it would have met the desired characteristics--and solving it would have needed much more thought and work, I suspect.

For this particular bug, it does look like we need a schema change to fix it.

I'll keep this bug in my back pocket for now, and maybe return to it later if no-one else has tackled it--and meanwhile I'll raise the point about hot cache and cold cache OOPSes on the list.

Martin Pool (mbp) wrote :

Robert suggested this might be the same bug: trying to link a related branch to https://bugs.launchpad.net/launchpad-dev-moin-theme/+bug/865977 times out repeatedly on lpnet.

The ajax time pulldown says

Time: 7.53 seconds
ID: 3, status: 200 (Ok)
Time: 1.43 seconds
ID: 2, status: 200 (Ok)
Time: 1.00 seconds
ID: 1, status: 200 (Ok)

(it's a bit surprising to me it's a timeout if it's a 200 code.)

Martin Pool (mbp) wrote :
Curtis Hovey (sinzui) on 2012-11-29
tags: added: branches
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers