Distribution:+bugs-index time outs

Bug #618406 reported by Robert Collins
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Robert Collins

Bug Description

OOPS-1839G1380
https://bugs.launchpad.net/ubuntu/+bugs-index

11. 207 14060ms SQL-launchpad-main-slave
SELECT BugTask.assignee,
       BugTask.bug,
       BugTask.bugwatch,
       BugTask.date_assigned,
       BugTask.date_closed,
       BugTask.date_confirmed,
       BugTask.date_fix_committed,
       BugTask.date_fix_released,
       BugTask.date_incomplete,
       BugTask.date_inprogress,
       BugTask.date_left_closed,
       BugTask.date_left_new,
       BugTask.date_triaged,
       BugTask.datecreated,
       BugTask.distribution,
       BugTask.distroseries,
       BugTask.id,
       BugTask.importance,
       BugTask.milestone,
       BugTask.OWNER, BugTask.product,
                      BugTask.productseries,
                      BugTask.sourcepackagename,
                      BugTask.status,
                      BugTask.statusexplanation,
                      BugTask.targetnamecache,
                      Bug.date_last_message,
                      Bug.date_last_updated,
                      Bug.date_made_private,
                      Bug.datecreated,
                      Bug.description,
                      Bug.duplicateof,
                      Bug.heat,
                      Bug.heat_last_updated,
                      Bug.id,
                      Bug.latest_patch_uploaded,
                      Bug.message_count,
                      Bug.name,
                      Bug.number_of_duplicates,
                      Bug.OWNER, Bug.private,
                                 Bug.security_related,
                                 Bug.title,
                                 Bug.users_affected_count,
                                 Bug.users_unaffected_count,
                                 Bug.who_made_private
FROM BugTask
JOIN Bug ON BugTask.bug = Bug.id
WHERE Bug.id = BugTask.bug
  AND BugTask.distribution = 1
  AND ((BugTask.status = 10)
       OR (BugTask.status = 15)
       OR (BugTask.status = 20)
       OR (BugTask.status = 21)
       OR (BugTask.status = 22)
       OR (BugTask.status = 25))
  AND Bug.duplicateof IS NULL
  AND Bug.private = FALSE
ORDER BY Bug.heat DESC, BugTask.id LIMIT 40
OFFSET 0

Related branches

tags: added: pg83
Changed in launchpad:
importance: High → Critical
Revision history for this message
Micah Gersten (micahg) wrote :

Just got a timeout on https://bugs.launchpad.net/ubuntu
OOPS-1839G1380

description: updated
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (5.6 KiB)

https://bugs.launchpad.net/ubuntu is down to 7.21 seconds first hit, then 8, 9, 12, 8. The 12 is GIL contention on the server I think, given its capable of rendering much more quickly. The page id is still in my timeout-candidate PPR report - https://devpad.canonical.com/~lpqateam/ppr/lpnet/latest-daily-timeout-candidates.html - but I fear we'll need to wait and capture a fresher oops still since we've improved bug search substantially over the last week.

The bug heat search query is slowwww though:
Time: 52081.905 ms

explain analyze SELECT BugTask.assignee, BugTask.bug, BugTask.bugwatch, BugTask.date_assigned, BugTask.date_closed, BugTask.date_confirmed, BugTask.date_fix_committed, BugTask.date_fix_released, BugTask.date_incomplete, BugTask.date_inprogress, BugTask.date_left_closed, BugTask.date_left_new, BugTask.date_triaged, BugTask.datecreated, BugTask.distribution, BugTask.distroseries, BugTask.id, BugTask.importance, BugTask.milestone, BugTask.owner, BugTask.product, BugTask.productseries, BugTask.sourcepackagename, BugTask.status, BugTask.statusexplanation, BugTask.targetnamecache, Bug.date_last_message, Bug.date_last_updated, Bug.date_made_private, Bug.datecreated, Bug.description, Bug.duplicateof, Bug.heat, Bug.heat_last_updated, Bug.id, Bug.latest_patch_uploaded, Bug.message_count, Bug.name, Bug.number_of_duplicates, Bug.owner, Bug.private, Bug.security_related, Bug.title, Bug.users_affected_count, Bug.users_unaffected_count, Bug.who_made_private FROM BugTask JOIN Bug ON BugTask.bug = Bug.id WHERE Bug.id = BugTask.bug AND BugTask.distribution = 1 AND ((BugTask.status = 10) OR (BugTask.status = 15) OR (BugTask.status = 20) OR (BugTask.status = 21) OR (BugTask.status = 22) OR (BugTask.status = 25)) AND Bug.duplicateof is NULL AND (Bug.private = FALSE OR EXISTS ( SELECT BugSubscription.bug FROM BugSubscription, TeamParticipation WHERE TeamParticipation.person = 2 AND TeamParticipation.team = BugSubscription.person AND BugSubscription.bug = Bug.id UNION SELECT BugTask.bug FROM BugTask, TeamParticipation WHERE TeamParticipation.person = 2 AND TeamParticipation.team = BugTask.assignee AND BugTask.bug = Bug.id )) ORDER BY Bug.heat DESC, BugTask.id LIMIT 40 OFFSET 0;
                                                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=6957590.92..6957591.02 rows=40 width=1017) (actual time=4311.330..4311.343 rows=40 loops=1)
   -> Sort (cost=6957590.92..6958004.71 rows=165515 width=1017) (actual time=4311.328..4311.335 rows=40 loops=1)
         Sort Key: bug.heat, public.bugtask.id
         Sort Method: top-N heapsort Memory: 92kB
         -> Nested Loop (cost=0.00..6952359.05 rows=165515 width=1017) (actual time=0.076..4067.521 rows=88655 loops=1)
               -> Seq Scan on bugtask (cost=0.00..54723.29 rows=165515 width=276) (actual time=0.046..847.575 rows=190392 loops=1)
                 ...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

Dropping bugtask.id from the sort order makes this a 3ms query:

 Limit (cost=0.00..6939.56 rows=40 width=1017) (actual time=0.144..2.664 rows=40 loops=1)
   -> Nested Loop (cost=0.00..28715046.16 rows=165515 width=1017) (actual time=0.142..2.647 rows=40 loops=1)
         -> Index Scan Backward using bug__heat__idx on bug (cost=0.00..28409422.84 rows=555261 width=741) (actual time=0.075..0.473 rows=64 loops=1)
               Filter: ((duplicateof IS NULL) AND ((NOT private) OR (SubPlan 1)))
               SubPlan 1
                 -> Unique (cost=39.25..39.26 rows=2 width=4) (never executed)
                       -> Sort (cost=39.25..39.25 rows=2 width=4) (never executed)
                             Sort Key: bugsubscription.bug
                             -> Append (cost=0.00..39.24 rows=2 width=4) (never executed)
                                   -> Nested Loop (cost=0.00..26.48 rows=1 width=4) (never executed)
                                         -> Index Scan using bugsubscription_bug_idx on bugsubscription (cost=0.00..7.26 rows=3 width=8) (never executed)
                                               Index Cond: (bug = $0)
                                         -> Index Scan using teamparticipation_team_key on teamparticipation (cost=0.00..6.39 rows=1 width=4) (never executed)
                                               Index Cond: ((public.teamparticipation.team = bugsubscription.person) AND (public.teamparticipation.person = 2))
                                   -> Nested Loop (cost=0.00..12.74 rows=1 width=4) (never executed)
                                         -> Index Scan using bugtask__bug__idx on bugtask (cost=0.00..6.34 rows=1 width=8) (never executed)
                                               Index Cond: (bug = $0)
                                         -> Index Scan using teamparticipation_team_key on teamparticipation (cost=0.00..6.39 rows=1 width=4) (never executed)
                                               Index Cond: ((public.teamparticipation.team = public.bugtask.assignee) AND (public.teamparticipation.person = 2))
         -> Index Scan using bugtask__bug__idx on bugtask (cost=0.00..0.54 rows=1 width=276) (actual time=0.026..0.031 rows=1 loops=64)
               Index Cond: (public.bugtask.bug = bug.id)
               Filter: ((public.bugtask.distribution = 1) AND ((public.bugtask.status = 10) OR (public.bugtask.status = 15) OR (public.bugtask.status = 20) OR (public.bugtask.status = 21) OR (public.bugtask.status = 22) OR (public.bugtask.status = 25)))
 Total runtime: 3.002 ms

Anything sorting by heat shouldn't care about bugtask ids, so this is a no brainer - though we /may/ have some test fallout.

Revision history for this message
Robert Collins (lifeless) wrote :

It should be totally safe to use this for Ubuntu as a whole, and perhaps distroseries queries; I don't know if per package, per project searches have enough heat variation to behave nicely. However, I suggest we try just dropping the .id field in the order for heat ordered searches based on a feature flag. If it looks poor on real things, we can easily reinstate it, and then do a more complex index (e.g. a partial one on heat desc, importance desc, bug.id where status is an open status).

description: updated
description: updated
summary: - Distribution:+bugs-index timing out in ~2% of requests
+ Distribution:+bugs-index time outs
Stuart Bishop (stub)
description: updated
tags: removed: pg83
description: updated
Revision history for this message
Stuart Bishop (stub) wrote :

Indexes on Bug(id, heat) or Bug(heat, id) do not help.

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (6.3 KiB)

Ugh ;( - thats surprising - this is the plan I see with (heat, id desc) + a modified query (ORDER BY HEAT DESC, Bug.ID)

                                                                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..6948.98 rows=40 width=1017) (actual time=0.068..1.751 rows=40 loops=1)
   -> Nested Loop (cost=0.00..28763892.59 rows=165572 width=1017) (actual time=0.067..1.731 rows=40 loops=1)
         -> Index Scan Backward using bug_heat_test on bug (cost=0.00..28458137.93 rows=555476 width=741) (actual time=0.030..0.244 rows=67 loops=1)
               Filter: ((duplicateof IS NULL) AND ((NOT private) OR (SubPlan 1)))
               SubPlan 1
                 -> Unique (cost=39.25..39.26 rows=2 width=4) (never executed)
                       -> Sort (cost=39.25..39.25 rows=2 width=4) (never executed)
                             Sort Key: bugsubscription.bug
                             -> Append (cost=0.00..39.24 rows=2 width=4) (never executed)
                                   -> Nested Loop (cost=0.00..26.48 rows=1 width=4) (never executed)
                                         -> Index Scan using bugsubscription_bug_idx on bugsubscription (cost=0.00..7.26 rows=3 width=8) (never executed)
                                               Index Cond: (bug = $0)
                                         -> Index Scan using teamparticipation_team_key on teamparticipation (cost=0.00..6.39 rows=1 width=4) (never executed)
                                               Index Cond: ((public.teamparticipation.team = bugsubscription.person) AND (public.teamparticipation.person = 2))
                                   -> Nested Loop (cost=0.00..12.74 rows=1 width=4) (never executed)
                                         -> Index Scan using bugtask__bug__idx on bugtask (cost=0.00..6.34 rows=1 width=8) (never executed)
                                               Index Cond: (bug = $0)
                                         -> Index Scan using teamparticipation_team_key on teamparticipation (cost=0.00..6.39 rows=1 width=4) (never executed)
                                               Index Cond: ((public.teamparticipation.team = public.bugtask.assignee) AND (public.teamparticipation.person = 2))
         -> Index Scan using bugtask__bug__idx on bugtask (cost=0.00..0.54 rows=1 width=276) (actual time=0.016..0.020 rows=1 loops=67)
               Index Cond: (public.bugtask.bug = bug.id)
               Filter: ((public.bugtask.distribution = 1) AND ((public.bugtask.status = 10) OR (public.bugtask.status = 15) OR (public.bugtask.status = 20) OR (public.bugtask.status = 21) OR (public.bugtask.status = 22) OR (public.bugtask.status = 25)))
 Total runtime: 2.020 ms
(23 rows)

and for e.g. launchpad (this is...

Read more...

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
assignee: nobody → Robert Collins (lifeless)
milestone: none → 11.04
tags: added: qa-needstesting
Changed in launchpad:
status: Triaged → Fix Committed
Revision history for this message
Robert Collins (lifeless) wrote :

db patch takes 5 minutes to apply; we're good to go and can iterate in the appservers to improve after the deploy.

tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Robert Collins (lifeless) wrote :

The indices we added may be wrong; will need to run scenarios and get a good index + query.

for instance,
....
  AND Bug.private = FALSE and sourcepackagename is null
ORDER BY distribution desc, sourcepackagename desc, Bugtask.heat DESC LIMIT 40

is serviced by the query we added. (And whee - thats nice and fast - 2ms).

similarly, ordering by product desc with a constraint on product works.

Revision history for this message
Robert Collins (lifeless) wrote :

candidate missing indices (first one will replace bugtask__product__heat__idx)

CREATE INDEX bugtask__product__heat__id__idx ON bugtask USING btree (product, heat desc, id) WHERE product IS NOT NULL;
CREATE INDEX bugtask__distribution__head__id__idx ON bugtask USING btree(distribution, heat desc, id) WHERE bugtask IS NOT NULL;

Revision history for this message
Robert Collins (lifeless) wrote :

with the test indices added:

  AND Bug.private = FALSE ORDER BY product , Bugtask.heat desc, bugtask.id LIMIT 40
OFFSET 0
;
                                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..500.33 rows=40 width=1024) (actual time=0.042..0.784 rows=40 loops=1)
   -> Nested Loop (cost=0.00..101328.76 rows=8101 width=1024) (actual time=0.040..0.771 rows=40 loops=1)
         -> Index Scan using bugtask__product__heat__id__idx on bugtask (cost=0.00..62426.31 rows=8101 width=281) (actual time=0.025..0.272 rows=62 loops=1)
               Index Cond: (product = 10294)
               Filter: ((status = 10) OR (status = 15) OR (status = 20) OR (status = 21) OR (status = 22) OR (status = 25))
         -> Index Scan using bug_pkey on bug (cost=0.00..4.79 rows=1 width=743) (actual time=0.007..0.007 rows=1 loops=62)
               Index Cond: (bug.id = bugtask.bug)
               Filter: ((bug.duplicateof IS NULL) AND (NOT bug.private))
 Total runtime: 0.898 ms
(9 rows)

CREATE INDEX bugtask__distribution__heat__id__idx ON bugtask USING btree(distribution, heat desc, id) WHERE distribution IS NOT NULL

  AND Bug.private = FALSE ORDER BY distribution , Bugtask.heat DESC, bugtask.id LIMIT 40
OFFSET 0
;
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..124.33 rows=40 width=1024) (actual time=0.154..1.391 rows=40 loops=1)
   -> Nested Loop (cost=0.00..532274.29 rows=171251 width=1024) (actual time=0.153..1.372 rows=40 loops=1)
         -> Index Scan using bugtask__distribution__heat__id__idx on bugtask (cost=0.00..142123.90 rows=171251 width=281) (actual time=0.106..0.398 rows=42 loops=1)
               Index Cond: (distribution = 1)
               Filter: ((status = 10) OR (status = 15) OR (status = 20) OR (status = 21) OR (status = 22) OR (status = 25))
         -> Index Scan using bug_pkey on bug (cost=0.00..2.27 rows=1 width=743) (actual time=0.018..0.019 rows=1 loops=42)
               Index Cond: (bug.id = bugtask.bug)
               Filter: ((bug.duplicateof IS NULL) AND (NOT bug.private))
 Total runtime: 1.591 ms
(9 rows)

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
removed: qa-ok
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-ok
removed: qa-needstesting
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.