BugTask:+editstatus-page timing out in ~4% of requests

Bug #618403 reported by Robert Collins
This bug affects 1 person
Affects           Status        Importance  Assigned to   Milestone
Launchpad itself  Fix Released  High        Deryck Hodge

Bug Description

BugTask:+editstatus-page is timing out routinely according to https://devpad.canonical.com/~stub/ppr/lpnet/latest-daily-timeout-candidates.html.

http://pad.lv/OOPS-1704ED939
https://bugs.edge.launchpad.net/hundredpapercuts/+bug/11334/+editstatus

hundredpapercuts.assignee.option: hundredpapercuts.assignee.assign_to_nobody
hundredpapercuts.comment_on_change: Unsubscribing the Papercutters team.

SQL time: 12570 ms
Non-sql time: 4316 ms
Total time: 16886 ms
Statement Count: 551

Big queries:
1 8487.0 1 launchpad-main-master SELECT MAX(Bug.heat) FROM Bug, Bugtask WHERE Bugtask.bug = Bug.id AND Bugtask.distribution = 1

1 310 1686 5 1681 launchpad-main-master SELECT EmailAddress.account, EmailAddress.email, EmailAddress.id, EmailAddress.person, EmailAddress.status FROM EmailAddress WHERE EmailAddress.person = %s AND EmailAddress.status = %s ORDER BY EmailAddress.email

This is the inverse of https://bugs.edge.launchpad.net/malone/+bug/615644; we have two mutually exclusive cases where the same query runs fast or slow depending on whether distribution is 1 or 3.


Revision history for this message
Robert Collins (lifeless) wrote :

I think this query will be better for the heat:
select bug.heat from bug, bugtask where bugtask.bug = bug.id and bugtask.distribution = 1 order by bug.heat desc limit 1;
 heat
 -------
 11062
 (1 row)
 Time: 58.710 ms
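
A minimal sketch of the two query forms side by side, to check that they return the same value. It uses an in-memory SQLite database rather than Launchpad's PostgreSQL, and the cut-down Bug/BugTask schema and sample rows are assumptions made for illustration only; it says nothing about the timings, which depend on the PostgreSQL planner.

```python
import sqlite3

# Toy schema: only the columns the two queries touch (an assumption,
# not the real Launchpad tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bug (id INTEGER PRIMARY KEY, heat INTEGER);
    CREATE TABLE BugTask (id INTEGER PRIMARY KEY, bug INTEGER,
                          distribution INTEGER);
    INSERT INTO Bug VALUES (1, 6), (2, 11062), (3, 40);
    INSERT INTO BugTask VALUES (1, 1, 1), (2, 2, 1), (3, 3, 3);
""")

# The aggregate form from the OOPS report.
MAX_QUERY = """
    SELECT MAX(Bug.heat) FROM Bug, BugTask
    WHERE BugTask.bug = Bug.id AND BugTask.distribution = ?
"""

# The sorted form proposed in the comment above.
LIMIT_QUERY = """
    SELECT Bug.heat FROM Bug, BugTask
    WHERE BugTask.bug = Bug.id AND BugTask.distribution = ?
    ORDER BY Bug.heat DESC LIMIT 1
"""


def max_heat_via_max(distribution):
    # MAX() always yields exactly one row; the value is NULL for an empty set.
    return conn.execute(MAX_QUERY, (distribution,)).fetchone()[0]


def max_heat_via_limit(distribution):
    # LIMIT 1 yields no row at all for an empty set, so guard for that.
    row = conn.execute(LIMIT_QUERY, (distribution,)).fetchone()
    return row[0] if row is not None else None
```

One caveat before swapping the forms: MAX() ignores NULLs and always returns one row, whereas ORDER BY ... DESC LIMIT 1 returns no row for an empty set, and PostgreSQL sorts NULLs first under DESC unless NULLS LAST is given.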

description: updated
tags: added: dba
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

Here's a set of timings for this, by permutation:

MAX/LIMIT  1/3  D/DS/DSNN  TIME
MAX        1    D          2000ms
MAX        3    D          150ms
MAX        1    DS         196ms
MAX        3    DS         2ms
LIMIT      3    DSNN       5000ms
LIMIT      3    DSNN       2ms
LIMIT      1    DSNN       200ms
LIMIT      1    DS         2ms
LIMIT      1    D          2ms
LIMIT      3    D          2ms

We have a LIMIT-based query where both distro 1 and distro 3 perform well, for both the distro and distroseries cases.

LIMIT means ORDER BY bug.heat DESC LIMIT 1; MAX means MAX(bug.heat)
D means a query as per bugtarget.py on bugtask.distribution=X
DS means a query on bug, bugtask, distroseries where distroseries.distribution=X
DSNN means DS but with a strictly redundant 'AND bugtask.distroseries is not null'
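
The legend above can be sketched as a small query builder. This is a reconstruction from the legend and from the queries quoted elsewhere in this report, not the actual text in bugtarget.py; `build_query` and `run_all` are hypothetical helper names, and the join conditions for DS are assumed to match the query pasted from IRC later in the thread.

```python
import sqlite3

# FROM list and WHERE clause for each query shape in the legend.
FROM_WHERE = {
    "D": (
        "Bug, BugTask",
        "BugTask.bug = Bug.id AND BugTask.distribution = ?",
    ),
    "DS": (
        "Bug, BugTask, DistroSeries",
        "BugTask.bug = Bug.id "
        "AND BugTask.distroseries = DistroSeries.id "
        "AND DistroSeries.distribution = ?",
    ),
}
# DSNN is DS plus the strictly redundant NOT NULL condition.
FROM_WHERE["DSNN"] = (
    FROM_WHERE["DS"][0],
    FROM_WHERE["DS"][1] + " AND BugTask.distroseries IS NOT NULL",
)


def build_query(style, shape):
    """Return the SQL for one (MAX|LIMIT, D|DS|DSNN) permutation."""
    tables, where = FROM_WHERE[shape]
    if style == "MAX":
        return f"SELECT MAX(Bug.heat) FROM {tables} WHERE {where}"
    if style == "LIMIT":
        return (
            f"SELECT Bug.heat FROM {tables} WHERE {where} "
            "ORDER BY Bug.heat DESC LIMIT 1"
        )
    raise ValueError(f"unknown style: {style}")


def run_all(conn, distribution):
    """Run every permutation and return {(style, shape): heat or None}."""
    results = {}
    for style in ("MAX", "LIMIT"):
        for shape in ("D", "DS", "DSNN"):
            sql = build_query(style, shape)
            row = conn.execute(sql, (distribution,)).fetchone()
            results[(style, shape)] = row[0] if row is not None else None
    return results
```

Note that D and DS answer different questions (tasks on the distribution itself versus tasks on its series), so only MAX versus LIMIT, and DS versus DSNN, are expected to agree on the result.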

Since we know the number of distroseries is generally small, I'm going to try an IN clause next.

Revision history for this message
Robert Collins (lifeless) wrote :

That variation (the IN clause) performed equivalently to DS/DSNN and is harder to read, so it is not worth it. I'd say switching back to LIMIT and adding the distroseries IS NOT NULL clauses is probably best for now.

08:34 < deryck> SELECT Bug.heat FROM Bug, Bugtask, DistroSeries
08:34 < deryck> WHERE Bugtask.bug = Bug.id AND
08:34 < deryck> Bugtask.distroseries IS NOT NULL AND
08:34 < deryck> Bugtask.distroseries = DistroSeries.id AND
08:34 < deryck> DistroSeries.distribution = 3 ORDER BY Bug.heat
08:34 < deryck> DESC LIMIT 1;

Deryck Hodge (deryck)
Changed in malone:
status: Triaged → In Progress
assignee: nobody → Deryck Hodge (deryck)
Revision history for this message
Deryck Hodge (deryck) wrote :

I had a poke at this a bit more today and I agree that adding the redundant not null clause is the best for now. It generally performs the best and most consistently across distro, distroseries, projects, project groups, etc. However, there are some combinations where I can't find a query to get below 300-400 ms. For now, that's okay, but the more quickly we drive down our timeout threshold, the more quickly this becomes a problem again.

We need to have a way to push this out of the web app. The jobs system broke down for us when used for heat before. Some of that could have been our use of it (firing off too many jobs, or a job for every bug touched). But whatever we do, we need a way to defer this calculation only momentarily, because of how much heat is used in the web UI now. That is, we need to push this out of in-app calculation, but it still needs to happen pretty quickly so web pages are up to date.
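
One way to avoid "a job for every bug touched" is to coalesce pending requests per bug target, so that many touches between two worker runs cost one recalculation. The sketch below is only an illustration of that idea; `PendingHeatUpdates` and its methods are hypothetical names and are not part of the Launchpad jobs system.

```python
class PendingHeatUpdates:
    """Coalesces max-heat recalculation requests, one per bug target."""

    def __init__(self):
        # A dict keeps insertion order, so targets drain first-in first-out.
        self._pending = {}

    def request(self, target):
        """Queue a recalculation; return False if one is already pending."""
        if target in self._pending:
            return False
        self._pending[target] = True
        return True

    def drain(self):
        """Return the queued targets in request order and clear the queue."""
        targets = list(self._pending)
        self._pending.clear()
        return targets
```

A worker that calls `drain()` every few seconds would keep the delay short while bounding the work to one query per target per cycle.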

I'm working on a branch now to make this minor query change, still leaving it in app, to buy us some time.

Cheers,
deryck

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 618403] Re: BugTask:+editstatus-page timing out in ~4% of requests

Alternatively we could denormalise this: put the max heat for a bug
target on the bug target, and update it when bugs change. We
would still need to query it when the hottest bug for a target is
retargeted, but that should be pretty rare.
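
A rough sketch of that denormalisation, using an in-memory dict in place of a column on the bug target. `TargetHeatCache` and the `recompute_max` callback (standing in for the full SQL query) are hypothetical names. The sketch also requeries when the hottest bug's heat drops, which is an assumption added here and not something stated above.

```python
class TargetHeatCache:
    """Tracks the max heat per bug target; requeries only when it must."""

    def __init__(self, recompute_max):
        # recompute_max(target) runs the expensive query and returns the
        # current max heat for the target, or None if it has no bugs.
        self._recompute_max = recompute_max
        self.max_heat = {}

    def bug_heat_changed(self, target, old_heat, new_heat):
        """Call when a bug on `target` is added or changes heat."""
        current = self.max_heat.get(target)
        if current is None or new_heat >= current:
            # Cheap path: the changed bug is now the hottest.
            self.max_heat[target] = new_heat
        elif old_heat == current:
            # The (possibly) hottest bug cooled down, so requery.
            self.max_heat[target] = self._recompute_max(target)

    def bug_removed_from_target(self, target, heat):
        """Call when a bug is retargeted away from `target`."""
        if self.max_heat.get(target) == heat:
            # The hottest bug may have left; this is the rare slow path.
            self.max_heat[target] = self._recompute_max(target)
```

Every other bug change is handled by a comparison against the stored value, so the expensive query runs only on the two slow paths.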

Revision history for this message
Launchpad QA Bot (lpqabot) wrote : Bug fixed by a commit
Changed in malone:
milestone: none → 10.09
tags: added: qa-needstesting
Changed in malone:
status: In Progress → Fix Committed
Revision history for this message
Deryck Hodge (deryck) wrote :

The initial cause of timeouts related to bug heat is fixed, according to my attempts to +editstatus on staging on Ubuntu bugtasks.

tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 618403] Re: BugTask:+editstatus-page timing out in ~4% of requests

\o/

Curtis Hovey (sinzui)
Changed in malone:
status: Fix Committed → Fix Released