buildd-manager disables builders with empty failnotes on some CancelledErrors
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Launchpad itself | Triaged | High | Unassigned | |
Bug Description
The effect here is simple. On a semi-regular basis, Launchpad will mark builders "Not OK" but fail to put an error message of any sort in the builder's record. This has been particularly common lately with the guests on furud.ppa (akhlut, alphard, dryad, elnath, gumiho, hamsa, marid, mekbuda, menkalinan, menkib, and peryton), but I'm led to understand that it happens to other builders too.
The webops team have an alert that warns them of builders in this state, and their solution is often just to put "somebody must be testing something" messages in. It would be helpful for us to get some kind of OOPS data someplace when a builder is de-activated, but that may be more in scope with LP#874072.
Basically I'd just like to know what kind of "doesn't work" these builders are doing, exactly!
This is a long-standing issue. It's not quite that it fails to put an error message in failnotes; rather, the error message it records is the empty string (''). In most cases, the buildd-manager logs show that we get back a Twisted CancelledError with no content, which normally indicates a timeout.
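To illustrate why failnotes ends up empty, here is a minimal sketch in Python. It is not buildd-manager's actual code. The `CancelledError` class is a local stand-in for `twisted.internet.defer.CancelledError`, assumed to be raised with no arguments, and `describe_failure` is a hypothetical helper showing one way a note could fall back to the exception's type name when the message is empty.

```python
class CancelledError(Exception):
    """Local stand-in for twisted.internet.defer.CancelledError.

    Assumption: the real exception reaches buildd-manager with no
    arguments, which is what "no content" in the logs suggests.
    """


def describe_failure(exc):
    """Hypothetical helper: build a non-empty note from an exception.

    Falls back to the exception's type name when str(exc) is empty.
    """
    name = type(exc).__name__
    message = str(exc)
    if message:
        return "%s: %s" % (name, message)
    return "%s (no message; possibly a timeout)" % name


# An exception constructed without arguments stringifies to ''.
empty_note = str(CancelledError())

# Including the type name keeps the note informative.
useful_note = describe_failure(CancelledError())
```

With this approach, a builder disabled after a bare `CancelledError` would at least carry the exception's name in its record instead of an empty string.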