Codebrowse can lose its pidfile if restarted too quickly
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Launchpad itself | Triaged | Low | Unassigned | |
Bug Description
Tom Haddon wrote:
> On Mon, 2008-04-28 at 17:29 +1200, Michael Hudson wrote:
>> I don't know if you've noticed this too, but sometimes loggerhead
>> manages to lose its loggerhead.pid file. I think what happens is that
>> when we run "/etc/init.
>>
>> restart)
>> $0 stop
>> sleep 1
>> $0 start
>> ;;
>>
>> it takes much longer than 1 second for loggerhead to exit, so what
>> happens is this:
>>
>> 1) 'stop' SIGINTs the old process
>> 2) the new process starts, overwriting the loggerhead.pid file
>> 3) the old process exits and removes the old loggerhead.pid file.
>>
>> After this has happened, using the init.d script again doesn't achieve
>> anything as the 'stop' can't kill the process (no pid file) and the
>> 'start' doesn't work either (it can't bind the port).
>>
>> For a fix, I can think of two things: 1) don't run 'start' until the pid
>> file is gone or 2) try to remove the pid file earlier in the shut-down
>> process. I guess I favor 1).
>>
>
> I prefer 1) too. How would we implement that, though? And what happens
> if the pidfile doesn't get removed? I would assume we'd sleep for some
> period of time, and then if the pidfile isn't removed, check if the
> process it's talking about is still running, kill it, remove the pidfile
> and carry on?
Yes, I guess that makes sense. Perhaps it makes sense for the
stop-loggerhead.py script to not exit until it thinks the old process is
dead? Currently it just sends it a signal and exits, I think.
Cheers,
mwh
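
A minimal sketch of the "don't exit until the old process is dead" idea discussed above, in Python. This is not the actual stop-loggerhead.py: the function names (read_pid, is_running, wait_for_exit, stop_and_wait), the timeouts, and the SIGKILL fallback are assumptions made for illustration, following the sleep / check / kill / remove-pidfile sequence Tom describes. One caveat: probing with signal 0 reports an unreaped zombie as still running and cannot detect PID reuse.

```python
import errno
import os
import signal
import time


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def remove_quietly(path):
    """Remove path, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def wait_for_exit(pid, timeout, poll=0.1):
    """Poll until pid is gone; return False if it is still there at the deadline."""
    deadline = time.monotonic() + timeout
    while is_running(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


def stop_and_wait(pidfile, timeout=30.0, kill_timeout=5.0):
    """Signal the process named in pidfile and block until it has exited.

    Returns True if a running process was stopped, False if there was
    nothing to stop (missing, malformed, or stale pid file).
    """
    pid = read_pid(pidfile)
    if pid is None or not is_running(pid):
        remove_quietly(pidfile)  # clear a stale pid file, if any
        return False
    try:
        os.kill(pid, signal.SIGINT)  # same signal the bug report mentions
    except ProcessLookupError:
        pass  # it exited between the check and the signal
    if not wait_for_exit(pid, timeout):
        # Assumed fallback: escalate if the process ignores SIGINT.
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        wait_for_exit(pid, kill_timeout)
    # The old process normally removes its own pid file on exit;
    # this only cleans up if it did not get the chance.
    remove_quietly(pidfile)
    return True
```

With something like this, 'stop' only returns once the old process is gone, so the 'sleep 1' in the restart branch no longer decides whether 'start' races with the old process's cleanup.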
affects: launchpad-foundations → launchpad-code
tags: added: codebrowse
Changed in launchpad-code:
importance: Undecided → Medium
status: New → Triaged
visibility: private → public
Changed in launchpad:
importance: Medium → Low