Tell automated clients when Launchpad is down for maintenance (as opposed to just down)

Bug #590956 reported by Leonard Richardson
This bug affects 2 people
Affects: Launchpad itself
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

As bug 380504 and similar bugs demonstrate, automated clients frequently encounter transient errors when accessing Launchpad. Right now an automated client has no way to tell whether something is wrong with Launchpad or whether it's just down for maintenance. A reliable way of telling the difference would simplify clients considerably. (The check could be built into launchpadlib, simplifying clients even more.)

Here are two solutions. The first is to serve a resource at a well-known URL like /down-for-maintenance. If Launchpad is functioning normally, a GET to that URL will return 404. If Launchpad is malfunctioning, a GET to that URL might return 404, or it might return a 5xx error code. Only if Launchpad is down for maintenance will a GET to that URL return 200.
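A client-side check for this first approach might look like the following sketch. The host name, the exact path, and the function names are assumptions made for illustration; the proposal only suggests a URL "like /down-for-maintenance", and no such resource is known to exist.

```python
import urllib.error
import urllib.request

# Hypothetical probe URL: both the host and the path are assumptions.
PROBE_URL = "https://api.launchpad.net/down-for-maintenance"


def classify_probe_status(status):
    """Map the probe's HTTP status code to a state, as the proposal describes."""
    if status == 200:
        # Only a maintenance period answers 200.
        return "maintenance"
    if status == 404:
        # 404 cannot distinguish "working normally" from "malfunctioning".
        return "up-or-malfunctioning"
    if 500 <= status <= 599:
        return "malfunctioning"
    return "unknown"


def probe(url=PROBE_URL, timeout=10):
    """GET the probe URL and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx; the status code is still available.
        status = error.code
    except OSError:
        # Connection failures and timeouts (URLError is an OSError subclass).
        return "unreachable"
    return classify_probe_status(status)
```

A cron job could call `probe()` before touching launchpadlib and exit quietly when it returns `"maintenance"`.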

A more sophisticated solution is to send the Retry-After header in the response to every HTTP request made while Launchpad is down for maintenance. The Retry-After value could be the actual estimated time until the maintenance period ends, or an arbitrary amount of time like one hour. A client experiencing strange Launchpad behavior could simply check for the Retry-After header to discover whether Launchpad knows its behavior is strange (maintenance) or whether Launchpad is simply malfunctioning.
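On the client side, the Retry-After approach could be handled roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the headers are assumed to arrive as a mapping keyed by "Retry-After", and any response carrying the header is treated as a maintenance signal, which is the convention this bug proposes rather than something Launchpad is known to do. HTTP allows Retry-After to be either a number of seconds or an HTTP date, so both forms are parsed.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    if header_value is None:
        return None
    value = header_value.strip()
    # Form 1: delay in seconds, e.g. "3600".
    if value.isascii() and value.isdigit():
        return int(value)
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))


def classify_response(status, headers):
    """Classify a response as (state, delay) under the proposed convention."""
    delay = retry_after_seconds(headers.get("Retry-After"))
    if delay is not None:
        # Assumption: the header is only sent during maintenance.
        return ("maintenance", delay)
    return ("malfunctioning" if status >= 500 else "ok", None)
```

A long-running client could sleep for the returned delay, while a cron job could simply exit and let the next cycle retry.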

affects: launchpad → launchpad-foundations
Bryce Harrington (bryce) wrote :

Yes, this would be quite handy. My various cron jobs all go nuts during the monthly rollouts and send me spurious error emails.

I think either approach would enable me to differentiate between legitimate failures and just downtime. The first approach is nice in that I can check it early on before trying to invoke any launchpadlib commands. The second is nice in that it gives you an estimated time it'll be back (although for cron jobs I'd probably just have it terminate and try again the next cycle rather than wait until it's back).

I'm not sure if this bug would cover it, but the other situation it would be nice to detect is when Launchpad is down due to an unexpected malfunction (as opposed to a scripting bug, a Launchpad bug, or a network outage).

Gary Poster (gary) wrote :

Triaging as low but it sounds easy: maybe we can schedule it soon.

Changed in launchpad-foundations:
status: New → Triaged
importance: Undecided → Low