checkwatches doing an unreasonable amount of db work

Bug #435952 reported by Stuart Bishop
This bug affects 1 person
Affects            Status    Importance   Assigned to   Milestone
Launchpad itself   Triaged   High         Unassigned

Bug Description

checkwatches is noticeably one of the higher consumers of database resources, often chewing up 30-70% of a core all on its own. This seems unreasonable given its job, and may indicate a silly bug or a badly thought out piece of code.

I suspect something like:
 - missing join condition
 - unnecessary repetition of queries
 - suboptimal queries
 - retrieving lots of rows one row at a time
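
To illustrate the last suspect, here is a minimal sketch of row-at-a-time retrieval next to a batched query. It uses an in-memory SQLite database purely for demonstration; the `bug_watch` table, its columns, and the function names are hypothetical and are not taken from the checkwatches code or the Launchpad schema.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bug_watch (id INTEGER PRIMARY KEY, remote_status TEXT)")
    conn.executemany(
        "INSERT INTO bug_watch VALUES (?, ?)",
        [(i, "open") for i in range(1, 6)])
    return conn


def fetch_one_at_a_time(conn, ids):
    """Issue one query per id: len(ids) round trips to the database."""
    rows = []
    for watch_id in ids:
        cur = conn.execute(
            "SELECT id, remote_status FROM bug_watch WHERE id = ?",
            (watch_id,))
        rows.extend(cur.fetchall())
    return rows


def fetch_batched(conn, ids):
    """Issue a single query for the whole batch: one round trip."""
    ids = list(ids)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    cur = conn.execute(
        "SELECT id, remote_status FROM bug_watch "
        "WHERE id IN (%s) ORDER BY id" % placeholders,
        ids)
    return cur.fetchall()
```

Both functions return the same rows for a sorted list of ids, but the first sends one statement per row. Against a remote database, that per-statement overhead adds up quickly.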

We should instrument it and work out where it is spending its time. LP_DEBUG_SQL or LP_DEBUG_SQL_EXTRA might provide enough information if we summarize it.
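
One way to summarize such a log is to collapse literals so that repeated queries group together, then count them. The following is only a sketch: it assumes the statements have already been extracted from the debug output as one string each, which may not match the real LP_DEBUG_SQL format, and `normalize` and `summarize` are hypothetical helper names.

```python
import re
from collections import Counter

# Patterns for collapsing literals so that structurally identical
# queries compare equal.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_WHITESPACE = re.compile(r"\s+")


def normalize(statement):
    """Replace string and numeric literals with '?' and squeeze whitespace."""
    statement = _STRING.sub("?", statement)
    statement = _NUMBER.sub("?", statement)
    return _WHITESPACE.sub(" ", statement).strip()


def summarize(statements, top=10):
    """Return the `top` most frequent normalized statements with counts."""
    counts = Counter(normalize(s) for s in statements if s.strip())
    return counts.most_common(top)
```

A query shape that appears thousands of times with only the id changing points at unnecessary repetition or row-at-a-time retrieval. A shape that appears rarely but dominates wall-clock time would need timing data, which this sketch does not collect.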

If it does need to do this much work, we should probably try to offload some or most of it to a slave database.

Stuart Bishop (stub) wrote :

Assigning to gmb because he said he would take a look.

This isn't a major production concern... yet ;)

Changed in malone:
assignee: nobody → Graham Binns (gmb)
importance: Undecided → Medium
status: New → Triaged
Graham Binns (gmb)
Changed in malone:
assignee: Graham Binns (gmb) → nobody
Stuart Bishop (stub) wrote :

This is now bringing the database to its knees, as checkwatches now has 6 connections open each hammering the database.

Increasing this to critical until we can reduce the number of database connections or fix the underlying performance problem.

Changed in malone:
importance: Medium → Critical
Tom Haddon (mthaddon) wrote :

Per a suggestion from gmb, we've applied http://pastebin.ubuntu.com/365099/ to the crontab that runs checkwatches until this is fixed.

Stuart Bishop (stub)
Changed in malone:
importance: Critical → Medium
Gavin Panella (allenap)
tags: added: story-reliable-bug-syncing
Changed in malone:
importance: Medium → High
Tom Haddon (mthaddon)
tags: added: canonical-losa-lp
Gavin Panella (allenap) wrote :

The fix for bug 572211 should reduce db load a lot, I hope.

Robert Collins (lifeless) wrote :

Tom, could you please undo that workaround, so we can see how much allenap's change helped?

summary: - checkwatches doing an unreasonable amout of db work
+ checkwatches doing an unreasonable amouht of db work
Liam Young (gnuoy) wrote :

Reverted checkwatches to running ten jobs at a time (--jobs=10).

David Ames (thedac) wrote :

Reverted the revert. Now running with --jobs=1.

William Grant (wgrant) wrote :

(It was getting pgkillactive'd again.)

Robert Collins (lifeless) wrote : Re: [Bug 435952] Re: checkwatches doing an unreasonable amouht of db work

Did something go wrong?
