buildds don't auto-restart on reboot

Bug #31546 reported by James Troup
Affects            Status    Importance   Assigned to       Milestone
Launchpad itself   Triaged   High         Celso Providelo

Bug Description

Machines in the data centre are vulnerable to random and arbitrary reboots (e.g. for kernel security fixes or physical relocation), and our services need to cope with this. wanna-build's buildd handled this with a cronned buildd-watcher process which ran every 5 minutes and (among other things) checked for a running buildd process. If it couldn't find one and there wasn't a ~buildd/NO-DAEMON-PLEASE file, it started one up.

I'd recommend LP do something similar. OTOH, if you want something simpler, I'd not recommend @reboot in cron, as that's buggy and e.g. will kick in on cron daemon restart (which is ok I guess as long as your buildd start process is clever enough to not start dueling buildds).
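The watcher behaviour described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual buildd-watcher: the pid-file convention, the function names and the start command are all assumptions, and only the NO-DAEMON-PLEASE opt-out file comes from the description.

```python
import os
import subprocess
from pathlib import Path


def pid_is_running(pid_file: Path) -> bool:
    """Return True if pid_file names a process that is still alive."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no pid file, or unreadable contents
    if pid <= 0:
        return False  # never signal a process group by accident
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def should_start(pid_file: Path, stop_file: Path) -> bool:
    """Start only if no daemon is running and the opt-out file is absent."""
    return not stop_file.exists() and not pid_is_running(pid_file)


def watch_once(pid_file: Path, stop_file: Path, start_cmd: list) -> bool:
    """One watcher pass; returns True if a daemon was started."""
    if should_start(pid_file, stop_file):
        subprocess.Popen(start_cmd)  # start_cmd is a hypothetical init command
        return True
    return False
```

A cron entry running every 5 minutes would call `watch_once` with the real paths. Checking the pid file before starting is what keeps a repeated run (or an @reboot entry firing on a cron daemon restart) from launching a second, dueling buildd, although two overlapping watcher runs could still race without an additional lock.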

James Troup (elmo)
Changed in launchpad-buildd:
assignee: nobody → cprov
status: Unconfirmed → Confirmed
Daniel Silverstone (dsilvers) wrote :

The buildds themselves have an init script to start them up. I'm guessing it's the build master which needs to detect certain failure modes and periodically re-test, so that it can re-enable the buildds in the db.

Celso Providelo (cprov) wrote :

If a builder has been marked as NOT OK by the builddmaster due to a lack of communication during the reboot (I'm almost sure it was), you need to re-enable it via the +admin page or by using the buildd-monitor script.

A special mode would require two manual interventions (one to put the builder in 'special' mode, another to bring it back to normal mode), so I still prefer the current solution: re-enable the failed builders after the reboot.

We could re-check even NOT OK builders once in a while with some kind of watchdog script, or even embed the check in buildd-sequencer. What do you think?
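As a rough sketch of that watchdog idea: periodically probe each builder currently marked NOT OK and flip it back once it answers. Everything here is hypothetical (the `Builder` record, the `ok` flag, and the plain TCP connect used as a probe); the real builddmaster keeps this state in the database and talks to builders through its own protocol.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Builder:
    """Hypothetical stand-in for a builder row in the database."""
    name: str
    host: str
    port: int
    ok: bool


def tcp_probe(builder: Builder, timeout: float = 5.0) -> bool:
    """Assumed health check: can we open a TCP connection to the builder?"""
    try:
        with socket.create_connection((builder.host, builder.port), timeout=timeout):
            return True
    except OSError:
        return False


def recheck_failed(
    builders: Iterable[Builder],
    probe: Callable[[Builder], bool] = tcp_probe,
) -> List[str]:
    """Re-enable every NOT OK builder that responds; return their names."""
    revived = []
    for builder in builders:
        if not builder.ok and probe(builder):
            builder.ok = True
            revived.append(builder.name)
    return revived
```

Run from cron or from within buildd-sequencer, a pass like this would remove the manual re-enable step after a reboot, while builders that are genuinely broken stay NOT OK because the probe keeps failing.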

Curtis Hovey (sinzui)
tags: added: tech-debt