launchpad process not exiting after make stop on staging

Bug #307447 reported by Herb McNew
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Stuart Bishop

Bug Description

For the last few days the staging restore has failed to restart the staging app server. It appears that the launchpad process isn't exiting after a 'make stop'.

Diogo Matsubara (matsubara) wrote :

Are there any relevant logs that would help debug this?

Changed in launchpad:
status: New → Incomplete
Steve McInerney (spm) wrote :

We see this across a range of services, e.g. edge*, lpnet*, xmlrpc.

The most recent case was edge4.
AFAICT, there are no relevant logs.

Symptoms observed:
* The service in question is no longer responsive
* simple HTTP ping (alive/dead) style checks show the service as being alive
* more detailed checks (looking for XYZ string back) show a fail
* the service will be working fine until the 'make stop' which leaves it in this never-never land
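
The difference between the two kinds of check in the symptoms above can be sketched as follows. This is only an illustration, not the monitoring actually in use: the function names, the "degraded" label, and the idea that a hung service still answers with HTTP 200 are all assumptions.

```python
import urllib.request


def classify(status: int, body: str, expected: str) -> str:
    """Classify a response (hypothetical labels, for illustration only).

    "dead"     - no usable HTTP answer (what a simple ping check detects)
    "degraded" - answers HTTP 200 but the expected string is missing
    "ok"       - answers HTTP 200 and contains the expected string
    """
    if status != 200:
        return "dead"
    if expected not in body:
        return "degraded"
    return "ok"


def probe(url: str, expected: str, timeout: float = 10.0) -> str:
    """Fetch url and classify it; any connection error counts as dead."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return classify(resp.status, body, expected)
    except OSError:  # URLError, HTTPError and socket timeouts are OSErrors
        return "dead"
```

A plain alive/dead check only separates "dead" from everything else, which is why it would keep reporting a half-stopped service as alive.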

We have an RT against the LOSAs to add a 'die die die' loop to the init.d script; but that's a fairly harsh workaround.

Changed in launchpad-foundations:
status: Incomplete → New
Stuart Bishop (stub) wrote :

Rather than put the loop in the init.d script, why not just put it in the Makefile so everything benefits?

I doubt we can track down the cause, and even if we did, we would have no way of knowing that we had tracked down all of the causes, or that new issues in Launchpad, Ubuntu, our hundred-odd immediate dependencies or the dependencies of the dependencies will not appear. We can't assume any of it will remain bug free and hang proof in the future.

Tom Haddon (mthaddon) wrote :

I think that's a fair point, Stuart. I think it best for the foundations team to determine the best way to handle this if it's going to be done within the Makefile.

Adjusting status and priority as it impacts edge rollouts (among other things).

Changed in launchpad-foundations:
status: New → Confirmed
importance: Undecided → High
Gary Poster (gary) wrote :

Stuart, if I'm right that putting this in the Makefile will be short work (less than a couple of hours) for you, could you tackle this for 3.0? If I'm wrong, please feel free to change the milestone to 3.1, or to bring it up with me.

Thanks

Changed in launchpad-foundations:
assignee: nobody → Stuart Bishop (stub)
milestone: none → 3.0
Tom Haddon (mthaddon) wrote :

This causes intermittent failures in the automatic edge update process if one of the app servers refuses to die. Latest one was Sept 9th.

Changed in launchpad-foundations:
status: Confirmed → Triaged
Stuart Bishop (stub)
Changed in launchpad-foundations:
status: Triaged → In Progress
Stuart Bishop (stub) wrote :

killservice (and thus make stop) now kills services using a SIGKILL if the SIGTERM did not cause a shutdown after 20 seconds. This timeout can be overridden on the command line.
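
The SIGTERM-then-SIGKILL escalation described above can be sketched like this. It is a minimal illustration and not the actual killservice code: the function name stop_gracefully, its return labels, and the use of a subprocess.Popen handle (rather than whatever killservice uses to find the process) are assumptions. Only the 20 second default comes from the comment above.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout: float = 20.0) -> str:
    """Ask proc to exit with SIGTERM; escalate to SIGKILL after `timeout` seconds.

    Returns "already exited", "terminated" or "killed" (hypothetical labels).
    """
    if proc.poll() is not None:
        return "already exited"

    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The process ignored or got stuck handling SIGTERM.
        proc.kill()  # sends SIGKILL on POSIX, which cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

A caller that wants a different grace period passes it explicitly, e.g. stop_gracefully(proc, timeout=5), which mirrors the idea of overriding the timeout on the command line.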

Changed in launchpad-foundations:
status: In Progress → Fix Committed
Stuart Bishop (stub)
Changed in launchpad-foundations:
status: Fix Committed → Fix Released