Upcoming work pages appear to time out a lot

Bug #1692120 reported by Mathieu Trudel-Lapierre
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Launchpad itself
William Grant

Bug Description

My personal upcomingwork page seems to time out a whole lot (but I also have lots of stuff on it).


For instance: (Error ID: OOPS-3462b9b41acbf8d5afb00ba372201498)

Related branches

Revision history for this message
Colin Watson (cjwatson) wrote :

The offending query is:

SELECT DISTINCT SpecificationWorkItem.assignee, SpecificationWorkItem.date_created, SpecificationWorkItem.deleted, SpecificationWorkItem.id, SpecificationWorkItem.milestone, SpecificationWorkItem.sequence, SpecificationWorkItem.specification, SpecificationWorkItem.status, SpecificationWorkItem.title
FROM Specification
LEFT JOIN Product ON Specification.product = Product.id
JOIN SpecificationWorkItem ON SpecificationWorkItem.specification = Specification.id
JOIN Milestone ON COALESCE(SpecificationWorkItem.milestone, Specification.milestone) = Milestone.id
WHERE (Specification.product IS NULL
       OR Product.active = TRUE)
  AND (Specification.information_type IN (1,
       OR COALESCE((Specification.access_grants)&&
                     (SELECT ARRAY_AGG(TeamParticipation.team)
                      FROM TeamParticipation
                      WHERE TeamParticipation.person = 1557381), FALSE)
       OR COALESCE(ARRAY[(Specification.access_policy)]&&
                     (SELECT ARRAY_AGG(AccessPolicyGrant.policy)
                      FROM AccessPolicyGrant
                      JOIN TeamParticipation ON TeamParticipation.team = AccessPolicyGrant.grantee
                      WHERE TeamParticipation.person = 1557381), FALSE))
  AND Milestone.dateexpected <= E'2017-11-15'
  AND Milestone.dateexpected >= E'2017-05-19'
  AND SpecificationWorkItem.deleted = FALSE
  AND (SpecificationWorkItem.assignee IN (1557381)
       OR Specification.assignee IN (1557381))
ORDER BY SpecificationWorkItem.id;

I can't get good EXPLAIN ANALYZE output from dogfood on this since the database dump is a bit too old (and when I tweak the dates dogfood is an order of magnitude faster, so I don't want to put much faith in the results), so I'll have to wait until a weekday when we can get a sysadmin to try it on production.

Revision history for this message
Brian Murray (brian-murray) wrote :

This also happens to me.

Colin Watson (cjwatson)
Changed in launchpad:
status: New → Triaged
importance: Undecided → Critical
tags: added: lp-blueprints oops timeout
Revision history for this message
William Grant (wgrant) wrote :

The main problem is that the planner can't accurately estimate the number of workitems due to one of the assignee filters being on the wrong side of a join, causing it to end up retrieving all workitems and all specifications and filtering those down. This runs into a growing performance issue with ACL checks for Canonical employees that are in lots of teams.

https://pastebin.canonical.com/189802/ has three variants of the query which attempt to mitigate the issues. I suspect the last one will be <50ms.

tags: removed: oops
William Grant (wgrant)
Changed in launchpad:
assignee: nobody → William Grant (wgrant)
status: Triaged → In Progress
Revision history for this message
William Grant (wgrant) wrote :

EXPLAIN (ANALYZE, BUFFERS) results: https://pastebin.canonical.com/190056/

As anticipated, the planner works much better when the CTE optimisation barrier is added, forcing it to use indexes to calculate the set of workitems.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
William Grant (wgrant)
tags: added: qa-ok
removed: qa-needstesting
Colin Watson (cjwatson)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers