Launchpad says it's "offline for maintenance" when it isn't

Bug #121828 reported by Matthew Paul Thomas
Affects: Launchpad itself
Status: Fix Released
Importance: Medium
Assigned to: Matthew Paul Thomas

Bug Description

* mpt wonders why he's getting "Launchpad is offline for maintenance" messages when it isn't
...
<brylie> I got one of those also mpt
<superm1> i'm getting them all over code.launchpad.net myself too

If Launchpad has a problem that is not caused by scheduled maintenance, we should be shown a different error message. Showing the "offline for maintenance" message sporadically, over several hours, is confusing.

Sarah Kowalik (hobbsee-deactivatedaccount) wrote :

I see them too, yet when you hit refresh, it goes away.

Go figure...

Steve Kowalik (stevenk) wrote :

I've seen them sporadically over the last few days. My guess is that one of the servers is returning them, but when you hit Refresh, another server handles the request and deals with it.

Tom Haddon (mthaddon)
Changed in launchpad:
assignee: nobody → mthaddon
importance: Undecided → Medium
status: New → Confirmed
Tom Haddon (mthaddon) wrote :

There is now an updated "offline" page which has more detail about planned outages and what to do if you are seeing the offline page outside of maintenance hours.

Changed in launchpad:
status: Confirmed → Fix Released
Matthew Paul Thomas (mpt) wrote :

That doesn't fix the problem described in this bug report.

Changed in launchpad:
status: Fix Released → Confirmed
Tom Haddon (mthaddon) wrote :

If you're seeing this page outside of maintenance hours, it means that there's some kind of infrastructure problem, such as the web server failing to contact the load balancer, or the load balancer failing to contact the application servers. Preventing this from ever happening again would be pretty much impossible - my comment is an attempt to set up a clearer process for reporting issues so we can resolve them (if possible) on a case by case basis, and to identify better when they're happening so that we can work towards reducing/eliminating them.

Matthew Paul Thomas (mpt) wrote :

Tom, the problem here is not that there are ever problems with the servers contacting each other. The problem is that Launchpad claims it is offline for maintenance when it is not.

There are people who know when Launchpad is genuinely offline for maintenance, namely the people who are doing that maintenance. Therefore those people should be able to activate a page that says something like "Sorry, Launchpad is offline for maintenance, try again in 30 minutes". Meanwhile, the normal error page can say "Sorry, there was a temporary problem, try reloading".

Tom Haddon (mthaddon) wrote :

Okay, the text was changed from what it was before (a generic "Launchpad is offline") to something providing more detail - particularly the link to the maintenance page on the wiki, which has details of scheduled outages and what to do if the "offline" page is encountered outside of scheduled maintenance hours. If this isn't currently clear enough, then we need to change the text of the page.

We want to avoid having to change the offline page itself, since it is checked into revision control and that makes it more troublesome to ensure everything is changed at the right time. Hence the idea of linking to a page with more detail. I'm happy to change the text to whatever can be agreed upon as a way of covering both eventualities (offline for maintenance, and offline because of unknown issues, and what to do in that case).

Matthew Paul Thomas (mpt) wrote :

I don't think the page can be made clear enough as long as it is trying to fulfil two very different purposes. It's like if we had a single page that covered both maintenance and 404 errors: no amount of rewriting could fix it.

I'm not suggesting that the offline page be changed in version control during each rollout. I'm suggesting that we have two pages, both permanently in version control: one displayed during maintenance, and the other set up to display automatically during server glitches. Please tell me this is possible. :-) Then I'll make the separate glitch page, and you can work your wizardry to set it up.

Tom Haddon (mthaddon) wrote : Re: [Bug 121828] Re: Launchpad says it's "offline for maintenance" when it isn't

On Sat, 2007-09-01 at 09:32 +0000, Matthew Paul Thomas wrote:
> I don't think the page can be made clear enough as long as it is trying
> to fulfil two very different purposes. It's like if we had a single page
> that covered both maintenance and 404 errors: no amount of rewriting
> could fix it.
>
> I'm not suggesting that the offline page be changed in version control
> during each rollout. I'm suggesting that we have two pages, both
> permanently in version control: one displayed during maintenance, and
> the other set up to display automatically during server glitches. Please
> tell me this is possible. :-) Then I'll make the separate glitch page,
> and you can work your wizardry to set it up.
>

We can do this, yes. Currently there's a page called "offline.html" in
lib/canonical/launchpad. If we can create two pages here (offline.html
and maintenance.html, for example), I can simply change the link to this
page ahead of any maintenance work, and then change it back at the end
of maintenance work.

I think we'll still need to link to the news.launchpad.net/maintenance
page for the details of the maintenance work to avoid having to change
it in bazaar, but I think in concept that should work as a way of
providing two separate pages.

I'm not sure who would be the best person to create/wordsmith those
pages - maybe yourself and Matthew Revell?

Thanks, Tom
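
The link swap Tom describes could be sketched roughly as follows. This is only an illustration under stated assumptions, not how Launchpad's servers were actually configured: it assumes a POSIX filesystem, that the web server serves its error page through a symlink (here given the hypothetical name "error-page.html"), and that offline.html and maintenance.html sit in the same directory. The helper name point_link_at and the demo function are likewise made up for this example, and the page texts are borrowed from the wording suggested earlier in the thread.

```python
import os
import tempfile


def point_link_at(link_path, target):
    """Repoint the symlink at link_path to target in one rename step."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    tmp_path = os.path.join(link_dir, "." + os.path.basename(link_path) + ".tmp")
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a leftover from an interrupted run
    # Create the new link under a temporary name, then rename it over the
    # old one so the error page is never missing in between.
    os.symlink(target, tmp_path)
    os.replace(tmp_path, link_path)


def demo():
    """Switch to the maintenance page and back, recording what is served."""
    pages = {
        "offline.html": "Sorry, there was a temporary problem, try reloading",
        "maintenance.html": "Sorry, Launchpad is offline for maintenance",
    }
    served = []
    with tempfile.TemporaryDirectory() as root:
        for name, text in pages.items():
            with open(os.path.join(root, name), "w") as f:
                f.write(text)
        # "error-page.html" is a hypothetical name for the link the server reads.
        link = os.path.join(root, "error-page.html")
        # Normal operation, then maintenance, then back to normal.
        for target in ("offline.html", "maintenance.html", "offline.html"):
            point_link_at(link, target)
            with open(link) as f:
                served.append(f.read())
    return served


if __name__ == "__main__":
    for text in demo():
        print(text)
```

Both pages stay permanently in version control; only the link changes at the start and end of a maintenance window, so the glitch page is what appears at all other times.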

Matthew Paul Thomas (mpt) wrote :

Yep. I'll take this as part of my 1.0-503-page work, and pass back to you when done (unless you let me know that's unnecessary). Thanks!

Changed in launchpad:
assignee: mthaddon → mpt
Changed in launchpad:
status: Confirmed → In Progress
Matthew Paul Thomas (mpt) wrote :

Fixed in mainline r4862.

Changed in launchpad:
status: In Progress → Fix Released