init: stopping and stopped should indicate if job is to be respawned

Reported by Matt Cowell on 2011-02-11
Affects: upstart
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Say I have two jobs: foo and bar. bar is marked "respawn" and provides some service used by 'foo'. foo is marked "stop on stopping bar" so that it will stop when bar is no longer available. bar does not maintain any state, so that if it crashes, it can safely be respawned without affecting the functionality of foo.

The problem is that I need a way to determine whether 'bar' crashed and was respawned, or was actually stopped using the "stop" command. A job which will be respawned is not marked as failed unless it is respawning too fast, and the "stopping" event is still sent with "RESULT=ok", just as it would be if it were stopped using "stop" or "stop on", never to come back.

It would be beneficial to provide either another value for "RESULT" or another environment variable which indicates whether the stopping event is due to a respawn. It could then be used in the following manner (using my foo/bar example above -- showing foo):

stop on stopping bar RESULT!=respawn

-or-

stop on stopping bar RESPAWN!=yes
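
To make the intent of the first form concrete, here is a minimal Python sketch of how a single KEY=value or KEY!=value condition could be evaluated against a stopping event's environment. It is only an illustration, not Upstart's implementation: the function name condition_matches is hypothetical, the "respawn" value of RESULT is the one proposed in this report and does not exist today, and the handling of an unset variable is an assumption of the sketch.

```python
from fnmatch import fnmatchcase


def condition_matches(condition, event_env):
    """Evaluate one KEY=value or KEY!=value condition against an event env.

    Hypothetical helper for illustration only; it does not reproduce
    Upstart's own matching code.
    """
    if "!=" in condition:
        key, pattern = condition.split("!=", 1)
        negate = True
    else:
        key, pattern = condition.split("=", 1)
        negate = False

    if key not in event_env:
        # Assumption of this sketch: a condition on an unset variable
        # never matches, whether or not it is negated.
        return False

    matched = fnmatchcase(event_env[key], pattern)
    # With "!=", the condition holds only when the value does NOT match.
    return matched != negate


# foo: "stop on stopping bar RESULT!=respawn"
print(condition_matches("RESULT!=respawn", {"JOB": "bar", "RESULT": "ok"}))
print(condition_matches("RESULT!=respawn", {"JOB": "bar", "RESULT": "respawn"}))
```

With the proposed value in place, foo would stop on an ordinary stop of bar but would keep running while bar is merely being respawned.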


Respawn jobs have:

  stop on stopping bar PROCESS=respawn

did this not work for you?

Scott


Matt Cowell (matthew-cowell) wrote :

That doesn't work for doing what I am looking for. PROCESS is not set, as
respawn jobs are not marked as failed until the respawn limit is hit. I
always get "RESULT=ok" for "stopping" on a crash of a process which is
respawned. As far as I can tell, this code doesn't set it to failed unless
the respawn limit is hit:
http://bazaar.launchpad.net/~canonical-scott/upstart/trunk/view/head:/init/job_process.c#L1071

Also, from the documentation, PROCESS=respawn occurs "to indicate that the
job is stopping because it hit the respawn limit." I'm not looking to
detect the respawn limit here, just a single respawn.
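
To spell out why this is a problem, here is a small Python model of the behaviour as I understand it from the description above. It is a sketch under that reading, not a transcription of job_process.c; the function name and its two boolean parameters are made up for illustration.

```python
def stopping_env_current(crashed, respawn_limit_hit):
    """Model of the environment of bar's "stopping" event today.

    Illustrative only: names and parameters are hypothetical, and the
    logic follows the behaviour described in this report.
    """
    if crashed and respawn_limit_hit:
        # Only hitting the respawn limit marks the job as failed.
        return {"JOB": "bar", "RESULT": "failed", "PROCESS": "respawn"}
    # A single crash that will be respawned looks like a clean stop.
    return {"JOB": "bar", "RESULT": "ok"}


crash_then_respawn = stopping_env_current(crashed=True, respawn_limit_hit=False)
manual_stop = stopping_env_current(crashed=False, respawn_limit_hit=False)

# Both events carry the same environment, so foo cannot tell them apart.
print(crash_then_respawn == manual_stop)
```

Since the two environments are identical, no "stop on stopping bar ..." condition in foo can separate a single respawn from a real stop.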

Thanks,
-Matt


Scott James Remnant (scott) wrote :

Ah, I see. Sorry, I misinterpreted you.

Scott


So by making respawned jobs fail by removing the "failed = FALSE" line here (http://bazaar.launchpad.net/~canonical-scott/upstart/trunk/view/head:/init/job_process.c#L1086), I was able to use the following line to get the desired effect:

stop on stopping bar RESULT=ok or stopping bar PROCESS=respawn

This will stop foo when bar is stopped normally (stop, stop on, or normal exit) and it will also stop foo when bar hits the respawn limit.

Going one step further, I added a check here (http://bazaar.launchpad.net/~canonical-scott/upstart/trunk/view/head:/init/job.c#L853) for respawn class jobs which had not hit the respawn limit, and set RESULT to "respawn" instead. This allows me to write:

stop on stopping bar RESULT!=respawn

This is a little cleaner and makes more sense to me, since the job neither technically failed nor exited successfully, so RESULT should be neither 'ok' nor 'failed' in this case. This also provides EXIT_SIGNAL or EXIT_STATUS for respawned jobs, which is another nice-to-have. Only when the respawn limit is hit will RESULT be 'failed', only when bar is stopped normally (stop, stop on, or normal exit) will RESULT be 'ok', and only when bar is respawning due to an abnormal exit will RESULT be 'respawn'.
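
The three-way split described above can be summarised in a short Python sketch. This models the patched behaviour only as described in this comment; it is not the actual change to job.c, and the function names and parameters are hypothetical.

```python
def result_for_stopping(abnormal_exit, respawn_limit_hit):
    """RESULT of bar's "stopping" event under the patch described above.

    Hypothetical model for illustration, not the real patch.
    """
    if not abnormal_exit:
        # stop, stop on, or normal exit
        return "ok"
    if respawn_limit_hit:
        # respawning too fast: the job really has failed
        return "failed"
    # abnormal exit, but the job is about to be respawned
    return "respawn"


def foo_should_stop(result):
    """Mirror of foo's condition: stop on stopping bar RESULT!=respawn."""
    return result != "respawn"


for abnormal_exit, limit_hit in [(False, False), (True, False), (True, True)]:
    result = result_for_stopping(abnormal_exit, limit_hit)
    print(result, foo_should_stop(result))
```

In this model foo keeps running only in the one case where bar is coming straight back.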

summary: - respawned jobs go to "stopping" with "RESULT=ok" and no way to detect
- respawn
+ init: stopping and stopped should indicate if job is to be respawned
Changed in upstart:
status: New → Triaged
importance: Undecided → Wishlist