Lost builders don't have their jobs unassigned on rescue

Bug #1221002 reported by William Grant
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: William Grant

Bug Description

If a launchpad-buildd slave's idea of its current job doesn't match the master's, it is declared to be lost and BuilderInteractor.rescueIfLost attempts to abort the build and clean the slave up. But SlaveScanner.scan doesn't notice that a rescue is in progress, and continues to call updateBuild on the build that it knows to no longer be present on the builder. It should probably immediately reset the job so it can be dispatched to another builder while the lost slave is cleaned up.
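The proposed behaviour can be illustrated with a minimal, self-contained sketch. This is not Launchpad's actual code: `Job`, `Slave`, `is_lost`, `scan`, and the status strings are hypothetical stand-ins, and the sketch only assumes that the master and slave can be compared by some job identifier (called `cookie` here). It shows the scan resetting the job as soon as the slave is found to be lost, rather than going on to update a build that is no longer on the builder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    """Hypothetical stand-in for the job the master believes is assigned."""
    cookie: str
    builder: Optional[str] = None
    status: str = "BUILDING"

    def reset(self) -> None:
        # Put the job back in the queue so another builder can take it.
        self.builder = None
        self.status = "NEEDSBUILD"


@dataclass
class Slave:
    """Hypothetical stand-in for the state the slave reports about itself."""
    status: str = "IDLE"  # assumed values: IDLE, BUILDING, WAITING
    cookie: Optional[str] = None
    aborted: bool = False

    def abort(self) -> None:
        # Stands in for asking the slave to abort and clean up.
        self.aborted = True


def is_lost(slave: Slave, job: Optional[Job]) -> bool:
    """A busy slave is lost if its job does not match the master's."""
    if slave.status == "IDLE":
        return False
    return job is None or slave.cookie != job.cookie


def scan(slave: Slave, job: Optional[Job], log: List[str]) -> Optional[Job]:
    """One scan cycle; returns the job still assigned to the builder, if any."""
    if is_lost(slave, job):
        slave.abort()
        if job is not None:
            # Reset immediately instead of waiting for the slave cleanup.
            job.reset()
        log.append("rescue")
        return None
    if job is not None:
        # Only update a build that is really present on the slave.
        log.append("updateBuild")
    return job


if __name__ == "__main__":
    log: List[str] = []
    job = Job(cookie="expected", builder="builder-1")
    slave = Slave(status="BUILDING", cookie="something-else")
    scan(slave, job, log)
    print(job.status, job.builder, log)
```

In this sketch the early return is the point: once a rescue has started, the scan does not pass any later slave status down to the old job.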

This bug currently results in lost builders occasionally being marked as failed when they eventually transition to WAITING/ABORTED, because that status is inappropriately passed down to a job that is in the wrong state. Affected builders can be recovered by simply re-enabling them, as the cleanup has already taken place.

Launchpad QA Bot (lpqabot) wrote:
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed

William Grant (wgrant) wrote:
tags: added: qa-ok
removed: qa-needstesting

William Grant (wgrant) wrote:
Changed in launchpad:
status: Fix Committed → Fix Released