Upcoming work pages appear to time out a lot

Bug #1692120 reported by Mathieu Trudel-Lapierre on 2017-05-19
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Critical
William Grant

Bug Description

My personal upcomingwork page seems to time out a whole lot (but I also have lots of stuff on it).

https://launchpad.net/~cyphermox/+upcomingwork

For instance: (Error ID: OOPS-3462b9b41acbf8d5afb00ba372201498)

Related branches

Colin Watson (cjwatson) wrote :

The offending query is:

SELECT DISTINCT SpecificationWorkItem.assignee, SpecificationWorkItem.date_created, SpecificationWorkItem.deleted, SpecificationWorkItem.id, SpecificationWorkItem.milestone, SpecificationWorkItem.sequence, SpecificationWorkItem.specification, SpecificationWorkItem.status, SpecificationWorkItem.title
FROM Specification
LEFT JOIN Product ON Specification.product = Product.id
JOIN SpecificationWorkItem ON SpecificationWorkItem.specification = Specification.id
JOIN Milestone ON COALESCE(SpecificationWorkItem.milestone, Specification.milestone) = Milestone.id
WHERE (Specification.product IS NULL
       OR Product.active = TRUE)
  AND (Specification.information_type IN (1,
                                          2)
       OR COALESCE((Specification.access_grants)&&
                     (SELECT ARRAY_AGG(TeamParticipation.team)
                      FROM TeamParticipation
                      WHERE TeamParticipation.person = 1557381), FALSE)
       OR COALESCE(ARRAY[(Specification.access_policy)]&&
                     (SELECT ARRAY_AGG(AccessPolicyGrant.policy)
                      FROM AccessPolicyGrant
                      JOIN TeamParticipation ON TeamParticipation.team = AccessPolicyGrant.grantee
                      WHERE TeamParticipation.person = 1557381), FALSE))
  AND Milestone.dateexpected <= E'2017-11-15'
  AND Milestone.dateexpected >= E'2017-05-19'
  AND SpecificationWorkItem.deleted = FALSE
  AND (SpecificationWorkItem.assignee IN (1557381)
       OR Specification.assignee IN (1557381))
ORDER BY SpecificationWorkItem.id;

I can't get good EXPLAIN ANALYZE output from dogfood on this since the database dump is a bit too old (and when I tweak the dates dogfood is an order of magnitude faster, so I don't want to put much faith in the results), so I'll have to wait until a weekday when we can get a sysadmin to try it on production.

Brian Murray (brian-murray) wrote :

This also happens to me.

Colin Watson (cjwatson) on 2017-06-01
Changed in launchpad:
status: New → Triaged
importance: Undecided → Critical
tags: added: lp-blueprints oops timeout
William Grant (wgrant) wrote :

The main problem is that the planner can't accurately estimate the number of workitems due to one of the assignee filters being on the wrong side of a join, causing it to end up retrieving all workitems and all specifications and filtering those down. This runs into a growing performance issue with ACL checks for Canonical employees that are in lots of teams.

https://pastebin.canonical.com/189802/ has three variants of the query which attempt to mitigate the issues. I suspect the last one will be <50ms.

tags: removed: oops
William Grant (wgrant) on 2017-06-06
Changed in launchpad:
assignee: nobody → William Grant (wgrant)
status: Triaged → In Progress
William Grant (wgrant) wrote :

EXPLAIN (ANALYZE, BUFFERS) results: https://pastebin.canonical.com/190056/

As anticipated, the planner works much better when the CTE optimisation barrier is added, forcing it to use indexes to calculate the set of workitems.

Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
William Grant (wgrant) on 2017-06-06
tags: added: qa-ok
removed: qa-needstesting
Colin Watson (cjwatson) on 2017-06-07
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers