should track and then disable failing jobs

Bug #693241 reported by Martin Pool
This bug affects 1 person
Affects: Launchpad itself
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

In the context of <https://code.launchpad.net/~mbp/launchpad/690021-rlimit/+merge/43733>:

Launchpad should notice when a job repeatedly fails, and eventually disable it, and log what's happening.

There may already be code for this in soyuz or branch imports? It should be put into a standard place.

Detecting failed jobs should include detecting when the process running them simply disappeared, e.g. because it hit a resource limit or crashed without raising a Python exception.
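As a rough illustration of that last point: if each job runs in a child process, the parent can inspect the exit status to tell a clean failure from a process that was killed. This is only a sketch under assumptions. `classify_exit` and `run_job` are hypothetical names, not existing Launchpad code, and the negative return code convention is what Python's `subprocess` module reports on POSIX systems.

```python
import signal
import subprocess
import sys


def classify_exit(returncode):
    """Map a child's return code to a coarse outcome label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as -signum,
        # which covers crashes and kills that raise no Python exception.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_job(argv):
    """Run one job in a child process and report how it ended (hypothetical helper)."""
    completed = subprocess.run(argv)
    return classify_exit(completed.returncode)
```

A job runner could log the returned label and feed anything other than "ok" into whatever failure tracking is used.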

Tags: jobs
Julian Edwards (julian-edwards) wrote : Re: [Bug 693241] [NEW] should track and then disable failing jobs

On Wednesday 22 December 2010 01:43:48 you wrote:
> There may already be code for this in soyuz or branch imports? It
> should be put into a standard place.

The buildd-manager has rudimentary detection by counting failures and trying
to work out whether the job or the builder (job runner in effect) is failing
more.

We could generalise this into a mixin implementing IHasFailures or similar.
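For illustration only, the counting idea could look something like the sketch below. It is not the buildd-manager code. `FailureTally`, its method names, and the threshold are all assumptions made up for this example.

```python
from collections import Counter


class FailureTally:
    """Count failures per job and per runner to guess which one is at fault."""

    def __init__(self, threshold=3):
        # Hypothetical threshold: do not blame anything before this many failures.
        self.threshold = threshold
        self.job_failures = Counter()
        self.runner_failures = Counter()

    def record_failure(self, job_id, runner_id):
        self.job_failures[job_id] += 1
        self.runner_failures[runner_id] += 1

    def record_success(self, job_id, runner_id):
        # A success clears the counts for both parties.
        self.job_failures.pop(job_id, None)
        self.runner_failures.pop(runner_id, None)

    def suspect(self, job_id, runner_id):
        """Return "job", "runner", or None when there is no clear culprit."""
        jobs = self.job_failures[job_id]
        runners = self.runner_failures[runner_id]
        if max(jobs, runners) < self.threshold or jobs == runners:
            return None
        return "job" if jobs > runners else "runner"
```

A job that fails on several different runners accumulates a higher count than any single runner, so it becomes the suspect. A runner that fails several different jobs is flagged the other way round.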

Robert Collins (lifeless) wrote :

+1 on generalisation. I wouldn't use a mixin though. The general
pattern is called 'Circuit Breaker'; see 'Release It!' by Michael Nygard.
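A minimal sketch of the Circuit Breaker idea, for readers unfamiliar with it: after a run of consecutive failures the breaker "opens" and refuses further calls, then lets a single trial call through once a cool-down has passed. The class name, parameters, and defaults below are assumptions chosen for this example, not an existing Launchpad API.

```python
import logging
import time

logger = logging.getLogger(__name__)


class CircuitOpen(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=60.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpen("job disabled after repeated failures")
            # Cool-down elapsed: half-open, so let this one call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                # Open (or re-open after a failed trial) and log what happened.
                self.opened_at = self.clock()
                logger.warning("breaker open after %d failures", self.failures)
            raise
        # Success closes the breaker and resets the count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each job run in `breaker.call(run_the_job)` would give the "disable it and log what's happening" behaviour asked for in the description, with one breaker per job type or per job.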

Tim Penhey (thumper)
tags: added: jobs
Changed in launchpad:
status: New → Triaged
importance: Undecided → Low