Comment 3 for bug 588684

Revision history for this message
Michael Nelson (michael.nelson) wrote :

The cowboy has been applied to edge (and is currently being applied to production) so that the builders page is up again (with 'unknown' queue lengths/times).

stub managed to get a real explain analyse from production, which shows quite a different plan from that on dogfood. At this stage we're still no closer to finding out why the query suddenly blew up. The change was straight-forward: http://pastebin.ubuntu.com/443408/ . stub sees that without the archive.enabled/virtualized clauses, it works fine (but that hasn't changed).

Details:
{{{
16:28 * stub wonders if the database planner statistics had not been updated
16:29 < stub> hmm..
16:36 < stub> noodles: Got a real query from the postgresql logs
16:37 < noodles> stub: great... does it tell you something more useful?
16:37 < stub> noodles: https://pastebin.canonical.com/32859/
16:38 < stub> At one point, it is materializing 920064 rows
16:38 < noodles> a bad join? /me looks closer.
16:38 < stub> But it thought it was only going to materialize 5k rows
16:51 < noodles> stub: were you able to see if it was related to the planner stats?
17:00 < stub> noodles: I think it is because the bulk of our archives are enabled and don't require virtualization, but the planner isn't smart enough to realize that the data is skewed in that way, and it joins tables in a suboptimal order.
17:02 < noodles> stub: s/don't require/do require/? (most ppas require virtualization), but I'm glad you're seeing reasons... I keep going over the query itself but can't see anything strange.
17:02 < stub> The enabled flag makes the biggest difference.
17:03 < noodles> Aha. But why would that result in 900k rows instead of 5k during a join?
17:12 < stub> noodles: I'm not sure if the plan is wrong - the numbers are skewed, but it still might be the fastest plan.
17:13 < noodles> hrm.
17:14 * noodles gets a diff of what changed for that specific query.
17:20 < noodles> stub: the diff itself looks fine http://pastebin.ubuntu.com/443408/ and the indexes are there for the foreign keys etc.
17:21 < noodles> Chex: great, thankyou... I've just checked https://edge.launchpad.net/builders and it worked perfectly. Please update production as well.
17:21 < stub> If I remove the enabled and requires_virtual clauses, it works fine
17:21 < stub> Well... 0.5-1 second
17:22 < Chex> noodles: great, thanks, on it now
17:26 < noodles> stub: there are no indexes on archive.enabled or archive.requires_virtual (nor have there been in the past)... but I assume that would help?
17:27 < stub> no - I've tried that already
17:27 < noodles> :/
}}}