Bug.initial_message fetches all of the bug's messages

Bug #606914 reported by Jeroen T. Vermeulen on 2010-07-18
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
High
Graham Binns

Bug Description

Bug.initial_message sorts Bug.messages using Python's sorted() builtin and then extracts the first message.

That's a bad idea because it forces the ORM to fetch and deserialize all of a bug's messages just in order to get the first one. There could be hundreds of messages.

This problem could conceivably be a factor behind bug 497386, in which case the easy fix for both is to sort and slice Bug.messages directly without listifying it python-side first.

Related branches

Robert Collins (lifeless) wrote :

Hi, this is implicated in a very frequent oops we had over the weekend - one particular bug API call in particular caused 4000 timeouts :(.

4098 SELECT Message.datecreated, Message.id, Message.owner, Message.parent, Message.raw, Message.rfc82 ... = %s AND Message.id = BugMessage.message AND ($INT=$INT) ORDER BY Message.datecreated, Message.id:
   GET: 4098 Robots: 0 Local: 0
   4098 https://api.launchpad.net/1.0/bugs/88746/attachments (ScopedCollection:CollectionResource)
       OOPS-1659B1002, OOPS-1659B1007, OOPS-1659B1008, OOPS-1659B1009, OOPS-1659B1010

If this issue is the root cause (it may not be) then this bug will block further improvements in the timeout window for production until fixed;

Martin Pool (mbp) wrote :

see also bug 424671 (arguably a dupe) and bug 606911 (the thing that should fix this but seems to be dead code) and bug 606952 (would have helped solve it) and bug 606959 (would make it more obvious where the problem is) </mpt>

Graham Binns (gmb) on 2010-07-19
Changed in malone:
status: New → Triaged
importance: Undecided → High
tags: added: timeout
Graham Binns (gmb) on 2010-07-19
Changed in malone:
status: Triaged → In Progress
assignee: nobody → Graham Binns (gmb)
Jeroen T. Vermeulen (jtv) wrote :

On further inspection I'm pretty certain this is not the cause of the timeouts, but it's definitely a bad thing. All the more so if we're often fetching it unnecessarily. I'm working up a fix—the main question is why initial_message sorts by id instead of by creation date and id. May be a performance issue.

Graham Binns (gmb) on 2010-07-19
Changed in malone:
status: In Progress → Triaged
assignee: Graham Binns (gmb) → nobody

On 19 July 2010 14:54, Jeroen T. Vermeulen <email address hidden> wrote:
> On further inspection I'm pretty certain this is not the cause of the
> timeouts, but it's definitely a bad thing.  All the more so if we're
> often fetching it unnecessarily.  I'm working up a fix—the main question
> is why initial_message sorts by id instead of by creation date and id.
> May be a performance issue.

I've already got a fix ready to land (see linked, then unlinked, then
re-linked branch). Is there a particular reason to sort by creation
date and id?

Changed in malone:
assignee: nobody → Graham Binns (gmb)
milestone: none → 10.08
tags: added: qa-needstesting
Changed in malone:
status: Triaged → Fix Committed

If this fixes the issue on staging, lets CP it to prod - we're really
suffering at the moment (and it was terrible already - it must have
been just under the hard timeout).

Stuart Bishop (stub) wrote :

On Mon, Jul 19, 2010 at 9:42 PM, Graham Binns <email address hidden> wrote:
> On 19 July 2010 14:54, Jeroen T. Vermeulen <email address hidden> wrote:
>> On further inspection I'm pretty certain this is not the cause of the
>> timeouts, but it's definitely a bad thing.  All the more so if we're
>> often fetching it unnecessarily.  I'm working up a fix—the main question
>> is why initial_message sorts by id instead of by creation date and id.
>> May be a performance issue.
>
> I've already got a fix ready to land (see linked, then unlinked, then
> re-linked branch). Is there a particular reason to sort by creation
> date and id?

sorting by id is equivalent to sorting by datecreated, as creation
date should never change.

sorting by datecreated annoys tests, as all messages created in the
same transaction have the same datecreated and have an indeterminate
order. We also don't have an index on Message.datecreated, so this
would currently slow.

For the latest message, sorting by (datecreated DESC, id DESC) would
arguably be the most correct, but just sorting by id is faster,
requires no extra indexes to maintain, and meets our needs.

So please just sort by id unless you want to fix this bug twice :)

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Robert Collins (lifeless) wrote :

https://bugs.edge.launchpad.net/malone/+bug/121363 is related - same basic query, same sort tweak FWICT.

Graham Binns (gmb) on 2010-08-09
tags: added: qa-ok
removed: qa-needstesting
Graham Binns (gmb) on 2010-08-12
Changed in malone:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers