soyuz is not fast-downtime friendly

Bug #845407 reported by Tom Haddon
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Launchpad itself | Triaged | High | Unassigned |

Bug Description

Soyuz cannot currently cope with fastdowntime, so every fastdowntime rollout needs a number of manual workarounds:

- Disable notifications of soyuz services in nagios
- Stop individual cron jobs to prevent them being long running
- Stop soyuz services before rollout
- Stop all soyuz crontabs before rollout
- Bring soyuz services back up after rollout
- Re-enable crontabs after rollout
- Re-enable nagios alerts after rollout

The idea of fastdowntime is that we shouldn't have to do any of these, which would make the process quicker and less error-prone.
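To illustrate why the manual sequence above is easy to get wrong, here is a minimal sketch of it in Python. The step names paraphrase the list in this report; `rollout`, `run_step` and `apply_update` are hypothetical names for illustration, not part of any Launchpad tooling.

```python
# Steps paraphrased from the bug description; purely illustrative.
BEFORE_ROLLOUT = [
    "disable nagios notifications",
    "stop long-running cron jobs",
    "stop soyuz services",
    "stop soyuz crontabs",
]
AFTER_ROLLOUT = [
    "start soyuz services",
    "re-enable soyuz crontabs",
    "re-enable nagios notifications",
]


def rollout(run_step, apply_update):
    """Run the workaround steps around a database update.

    run_step is called once per step name; apply_update performs the
    update itself. Both are supplied by the caller (hypothetical hooks).
    """
    for step in BEFORE_ROLLOUT:
        run_step(step)
    try:
        apply_update()
    finally:
        # Restore services even if the update fails, otherwise soyuz
        # stays down and alerts stay silenced.
        for step in AFTER_ROLLOUT:
            run_step(step)


log = []
rollout(log.append, lambda: log.append("apply update"))
```

Every entry in the two lists is a manual action that can be skipped or done out of order, which is the cost this bug asks to remove.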

Tom Haddon (mthaddon)
tags: added: canonical-losa-lp fastdowntime
Changed in launchpad:
importance: Undecided → High
Stuart Bishop (stub) wrote :

Last step of this is to remove archive-publisher from the fragile users list in database/schema/preflight.py
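For readers unfamiliar with the idea, a fragile-users check could look roughly like the sketch below. It is not the code in database/schema/preflight.py: `FRAGILE_USERS`, `blocking_connections` and `preflight_ok` are names assumed here for illustration, and the input is assumed to be (usename, pid) pairs such as those selectable from PostgreSQL's pg_stat_activity view.

```python
# Assumed list for illustration; the comment above only names
# archive-publisher as a member.
FRAGILE_USERS = {"archive-publisher"}


def blocking_connections(activity, fragile_users=FRAGILE_USERS):
    """Return the (usename, pid) pairs that belong to fragile users.

    activity is an iterable of (usename, pid) pairs describing the
    currently open database connections.
    """
    return [(user, pid) for user, pid in activity if user in fragile_users]


def preflight_ok(activity, fragile_users=FRAGILE_USERS):
    """The update may proceed only if no fragile user is connected."""
    return not blocking_connections(activity, fragile_users)
```

Under this sketch, removing a user from the list means its open connections no longer block the rollout, so that should only happen once the service copes with being disconnected.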

Changed in launchpad:
status: New → Triaged
Tom Haddon (mthaddon) wrote :

process-death-row had to be killed manually during today's rollout.

Tom Haddon (mthaddon) wrote :

And buildd-retry-depwait on cesium.

Stuart Bishop (stub) wrote :

Bug #845397 is about making buildd-manager cope gracefully.

Stuart Bishop (stub) wrote :

Bug #815753 is about making scripts like process-death-row and buildd-retry-depwait connect as distinct database users, so they can be differentiated from other connections.
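A small sketch of why distinct users help, assuming connections are available as (usename, pid) pairs. The user names `soyuz`, `process_death_row` and `buildd_retry_depwait` are hypothetical, chosen only to mirror the script names mentioned here.

```python
from collections import defaultdict


def pids_by_user(activity):
    """Group connection pids by the database user that opened them."""
    grouped = defaultdict(list)
    for user, pid in activity:
        grouped[user].append(pid)
    return dict(grouped)


# With one shared user, the two scripts cannot be told apart.
shared = [("soyuz", 101), ("soyuz", 102)]

# With a user per script, each connection is attributable.
distinct = [("process_death_row", 101), ("buildd_retry_depwait", 102)]

shared_view = pids_by_user(shared)
distinct_view = pids_by_user(distinct)
```

In the shared case both pids sit under one key, so a rollout script cannot decide which one is safe to interrupt; in the distinct case each script's connection can be found and handled on its own.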
