buildd-slave-scanner.py regularly aborts loudly on transient errors
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Won't Fix | High | Celso Providelo |
Bug Description
The LP error reports list gets regular failures from buildd-
Apparently these come from transient connection failures.
These appear to be a case of 'the boy who cried wolf': they generate a lot of email, (potentially) masking more serious problems. For example, I've received over 70 in the past 24 hours. Properly speaking, every email should be examined in detail to see whether the failure is in fact serious. This obviously takes time. :-)
eg: https:/
It would be greatly appreciated :-) if these could either be quietened, or if a more appropriate action could be recommended.
As the pastebin shows several "WARNING builder is in manual state. Ignored." messages, can these major aborts be dealt with in a similar fashion?
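A minimal sketch of what "dealt with in a similar fashion" could look like, assuming the scanner contacts each builder in a loop. The names `scan_builders`, `builders` and `probe` are hypothetical stand-ins for illustration and are not taken from buildd-slave-scanner.py. Only connection-level failures are downgraded to a warning; any other exception still propagates and aborts loudly.

```python
import logging
import socket

logger = logging.getLogger("slave-scanner-sketch")


def scan_builders(builders, probe):
    """Probe each builder; warn on transient connection failures and carry on.

    `builders` is an iterable of builder names and `probe` is a callable
    that contacts one builder. Both are hypothetical, not Launchpad APIs.
    """
    failed = []
    for name in builders:
        try:
            probe(name)
        except (socket.timeout, ConnectionError) as exc:
            # Assumed to be transient: log a warning instead of aborting the run.
            logger.warning("builder %s unreachable (%s). Ignored.", name, exc)
            failed.append(name)
    # The caller can inspect this list and decide whether to escalate.
    return failed
```

Returning the names of the builders that failed, rather than silently dropping them, leaves room for the caller to raise a real error if the same builder keeps failing, so a persistent problem is not hidden.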
Changed in soyuz:
assignee: nobody → al-maisan
importance: Undecided → High
milestone: none → 2.1.12
status: New → Triaged
Changed in soyuz:
milestone: 2.1.12 → pending
Changed in soyuz:
milestone: pending → 2.2.1
Changed in soyuz:
milestone: 2.2.1 → 2.2.2
Changed in soyuz:
assignee: al-maisan → cprov
milestone: 2.2.2 → 2.2.3
Steve, do you know why there are transient socket errors in the first place? How normal is it?
I am concerned that it's a genuine problem that should be fixed in the infrastructure and any script changes could mask a bigger issue.