Launchpad itself

checkwatches doing an unreasonable amount of db work

Bug #435952 reported by Stuart Bishop on 2009-09-24

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Launchpad itself	Triaged	High	Unassigned

Bug Description

checkwatches is noticeably one of the higher consumers of database resources, often chewing up 30-70% of a core all on its own. This seems unreasonable given its job, and may indicate a silly bug or a badly thought out piece of code.

I suspect something like:
- missing join condition
- unnecessary repetition of queries
- suboptimal queries
- retrieving lots of rows one row at a time
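
To make the last suspicion concrete, here is a minimal sketch of the row-at-a-time pattern next to a batched query. It uses an in-memory SQLite database and a made-up `watch` table purely for illustration; nothing here is taken from the checkwatches code or the Launchpad schema.

```python
import sqlite3

# Hypothetical table standing in for whatever checkwatches reads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watch (id INTEGER PRIMARY KEY, remote_status TEXT)")
conn.executemany(
    "INSERT INTO watch (id, remote_status) VALUES (?, ?)",
    [(i, "open") for i in range(1, 101)],
)

# Record every statement SQLite runs from here on.
queries = []
conn.set_trace_callback(queries.append)

wanted = list(range(1, 51))

# Row at a time: one round trip per id.
slow = [
    conn.execute(
        "SELECT id, remote_status FROM watch WHERE id = ?", (i,)
    ).fetchone()
    for i in wanted
]
one_by_one = len(queries)

# Batched: a single query fetching the same rows.
queries.clear()
placeholders = ", ".join("?" * len(wanted))
fast = conn.execute(
    f"SELECT id, remote_status FROM watch WHERE id IN ({placeholders}) ORDER BY id",
    wanted,
).fetchall()
batched = len(queries)

print(one_by_one, "queries one at a time vs", batched, "batched")
```

Both approaches return the same rows; only the number of statements sent to the database differs.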

We should instrument it and work out where it is spending its time. LP_DEBUG_SQL or LP_DEBUG_SQL_EXTRA might provide enough information if we summarize it.
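
A rough sketch of what "summarize it" could look like: collapse literal values so that statements differing only in their parameters group together, then count the groups. It assumes the SQL statements have already been pulled out of the debug output as plain strings; the actual LP_DEBUG_SQL output format is not covered here, and `normalize`, `summarize` and the example statements are hypothetical.

```python
import re
from collections import Counter

_STRING = re.compile(r"'(?:[^']|'')*'")      # quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone numeric literals
_IN_LIST = re.compile(r"\(\s*\?(?:\s*,\s*\?)+\s*\)")  # (?, ?, ?) lists
_SPACE = re.compile(r"\s+")


def normalize(statement):
    """Replace literals with '?' so equivalent queries compare equal."""
    s = _STRING.sub("?", statement)
    s = _NUMBER.sub("?", s)
    s = _IN_LIST.sub("(?, ...)", s)
    return _SPACE.sub(" ", s).strip()


def summarize(statements, top=10):
    """Return the most frequently repeated query shapes with their counts."""
    counts = Counter(normalize(s) for s in statements)
    return counts.most_common(top)


if __name__ == "__main__":
    # Illustrative statements only, not real checkwatches queries.
    sample = [
        "SELECT * FROM BugWatch WHERE id = 1",
        "SELECT * FROM BugWatch WHERE id = 2",
        "SELECT * FROM  BugWatch WHERE id = 30",
        "SELECT name FROM BugTracker WHERE name = 'debbugs'",
    ]
    for shape, count in summarize(sample):
        print(count, shape)
```

A query shape that shows up thousands of times per run would point at repeated or row-at-a-time queries; a shape that appears rarely but dominates the runtime would point at a suboptimal query instead, which this count alone will not reveal.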

If it does need to do this much work, we should probably try to offload some or most of it to a slave database.

Stuart Bishop (stub) wrote on 2009-09-24:

Assigning to gmb because he said he would take a look.

This isn't a major production concern... yet ;)

Changed in malone:
assignee:	nobody → Graham Binns (gmb)
importance:	Undecided → Medium
status:	New → Triaged

Graham Binns (gmb) on 2009-12-11

Changed in malone:
assignee:	Graham Binns (gmb) → nobody

Stuart Bishop (stub) wrote on 2010-01-29:

This is now bringing the database to its knees, as checkwatches now has 6 connections open each hammering the database.

Increasing this to critical until we can reduce the number of database connections or fix the underlying performance problem.

Changed in malone:
importance:	Medium → Critical

Tom Haddon (mthaddon) wrote on 2010-01-29:

Per a suggestion from gmb, we've applied http://pastebin.ubuntu.com/365099/ to the crontab that runs checkwatches until this is fixed.

Stuart Bishop (stub) on 2010-02-02

Changed in malone:
importance:	Critical → Medium

Gavin Panella (allenap) on 2010-04-20

tags:	added: story-reliable-bug-syncing
Changed in malone:
importance:	Medium → High

Tom Haddon (mthaddon) on 2010-05-28

tags:	added: canonical-losa-lp

Gavin Panella (allenap) wrote on 2010-05-28:

The fix for bug 572211 should reduce db load a lot, I hope.

Robert Collins (lifeless) wrote on 2012-01-03:

Tom, could you please undo that workaround, so we can see how much allenap's change helped?

summary:

- checkwatches doing an unreasonable amout of db work
+ checkwatches doing an unreasonable amouht of db work

Liam Young (gnuoy) wrote on 2012-01-03:

Reverted checkwatches to doing ten at a time (--jobs=10)

David Ames (thedac) wrote on 2012-01-03:

Reverted the revert. Now --jobs=1

William Grant (wgrant) wrote on 2012-01-03:

(It was getting pgkillactive'd again.)

Robert Collins (lifeless) wrote on 2012-01-03: Re: [Bug 435952] Re: checkwatches doing an unreasonable amouht of db work

Did something go wrong?
