Node event log queries are slow (over 1 second)

Bug #1402237 reported by Jason Hobbs on 2014-12-13
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MAAS
High
Graham Binns
1.7
High
Graham Binns

Bug Description

I turned on slow query logging in postgresql and noticed that node event log queries are taking over 1 second to complete:

2014-12-13 17:10:30 UTC LOG: duration: 1386.208 ms statement: SELECT "maasserver_event"."id", "maasserver_event"."created", "maasserver_event"."updated", "maasserver_event"."type_id", "maasserver_event"."node_id", "maasserver_event"."description" FROM "maasserver_event" INNER JOIN "maasserver_eventtype" ON ( "maasserver_event"."type_id" = "maasserver_eventtype"."id" ) WHERE ("maasserver_event"."node_id" = 896 AND NOT ("maasserver_eventtype"."level" = 10 )) ORDER BY "maasserver_event"."id" DESC LIMIT 5
2014-12-13 17:10:34 UTC LOG: duration: 1171.199 ms statement: SELECT "maasserver_event"."id", "maasserver_event"."created", "maasserver_event"."updated", "maasserver_event"."type_id", "maasserver_event"."node_id", "maasserver_event"."description" FROM "maasserver_event" INNER JOIN "maasserver_eventtype" ON ( "maasserver_event"."type_id" = "maasserver_eventtype"."id" ) WHERE ("maasserver_event"."node_id" = 896 AND NOT ("maasserver_eventtype"."level" = 10 )) ORDER BY "maasserver_event"."id" DESC LIMIT 5

There's no way to clear these log entries, and I have 100+ nodes, each with over 10,000 event log entries.

Related branches

tags: added: oil
Blake Rouse (blake-rouse) wrote :

I am very surprised this is so slow, since the query is limited to only 5 results. Must just be to the overall size of the table.

A cleaning system is need to make sure that the event log doesn't get to large, maybe so many entries per node or only so many days old. The number of days would probably be the best allowing it to be tune-able.

Changed in maas:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.7.2
milestone: 1.7.2 → next
Jason Hobbs (jason-hobbs) wrote :

It definitely was a side effect of the number of entries. I manually deleted all of the entries over 7 days old and the query got much faster.

Raphaël Badin (rvb) wrote :

10,000 entries shouldn't really be a problem. My guess is that either the join (event + eventtype) or the sorting is causing the slowness. We need to check indexes, etc.

Jason Hobbs (jason-hobbs) wrote :

To be clear it's not 10,000 entries - it's 10,000 * 150.

Raphaël Badin (rvb) wrote :

> To be clear it's not 10,000 entries - it's 10,000 * 150.

Arg, then we might have a problem due to the volume alone. But it's still worth make sure all the required indexes are in place.

tags: added: perf

On 15 December 2014 at 15:04, Jason Hobbs <email address hidden> wrote:
> To be clear it's not 10,000 entries - it's 10,000 * 150.

Right, but for one node there's only ~10,000 entries. I'll run these
through a query analyser and see what comes up as a chokepoint.

For reference, Launchpad Bugs has 1.4 million entries, and in order to
cope with that we had to tune the queries, not start deleting data.
Though I think having some way of clearing event history is doubtless
a Good Thing, it's not the prime fix here. We shouldn't say "Oh, page
X is slow? Well, you need to delete things." That's the Windows way,
and it's ropey™.

Graham Binns (gmb) wrote :

On 15 December 2014 at 15:14, Raphaël Badin <email address hidden> wrote:
> Arg, then we might have a problem due to the volume alone.

True, but *only* if the query is not written well enough. I maintain
(with misplaced certitude, perhaps) that this is tuneable, but I may
be being James May to Jason's Jeremy Clarkson here.

Graham Binns (gmb) wrote :

I've filed bug 1403080 about not being able to clear the event log for a Node. Raphaël and I think we've come up with an elegant-ish solution for the problem here, however.

Changed in maas:
status: Triaged → In Progress
assignee: nobody → Graham Binns (gmb)
summary: - Node event log queries are slow (over 1 second), but no way to clear
- node event log
+ Node event log queries are slow (over 1 second)
Christian Reis (kiko) on 2015-01-28
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Andres Rodriguez (andreserl) wrote :

This bug has been reported and fixed on upstream MAAS. However, provided that the bug was listed on the debian changelog, this appears as needing verification for pending SRU [1]. This bug did not affect current MAAS in Ubuntu, hence setting this to verification-done to unblock pending SRU.

[1]:http://people.canonical.com/~ubuntu-archive/pending-sru.html

tags: added: verification-done
Changed in maas:
milestone: next → none
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers