Comment 14 for bug 66002

Scott James Remnant (Canonical) (canonical-scott) wrote :

Ok, so here's my current working theory on this bug.

When you are in single-user mode, the rcS-sulogin job is running: this is what runs "sulogin". Once "sulogin" finishes, the job then runs "telinit" to switch to the default runlevel.

If you type "reboot" inside the "sulogin" shell, that will run "shutdown", which will send a runlevel event to change the runlevel (to 6).

Now, what's supposed to happen is this:

 * the rcS-sulogin job is stopped (stop on runlevel)
 * the rcS-sulogin script, sulogin, etc. are sent the TERM signal and die
 * the rc6 job is started (start on runlevel 6)
 * shutdown begins

When I try it, that's what happens.

Now, what I think *MIGHT* happen for some people is:

 * the rcS-sulogin job is stopped (stop on runlevel)
 * sulogin gets sent the TERM signal
 * the rcS-sulogin script continues, and gets to run "telinit 2"
 * the rcS-sulogin script gets sent the TERM signal
 * _but_ "telinit 2" sends the "runlevel 2" event
 * the rc6 job is stopped (stop on runlevel [!6])
 * the rc2 job is started (start on runlevel 2)
 * normal boot begins

So this would be a race condition.

0.3 gives us an easy way to fix this: the cause of the job stop is in $UPSTART_EVENT for the post-stop script, so all we need to do is move the telinit into post-stop and only run it if $UPSTART_EVENT != runlevel.
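
As a rough sketch of that check (not the actual job file), the post-stop logic could look something like the following. It assumes $UPSTART_EVENT holds the name of the event that stopped the job, as described above. The function name post_stop_action is made up for illustration, and it only prints what it would do instead of really calling telinit, so it is safe to try in an ordinary shell.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: decide what the post-stop
# step of the rcS-sulogin job should do.
# Assumption: UPSTART_EVENT holds the name of the event that caused the
# job to stop ("runlevel" when shutdown or telinit changed the runlevel).
post_stop_action() {
    if [ "${UPSTART_EVENT:-}" != "runlevel" ]; then
        # sulogin ended by itself, so carry on to the default runlevel.
        # Dry run: print the command rather than executing it.
        echo "telinit 2"
    else
        # A runlevel change is already in progress (e.g. "reboot" was
        # typed in the sulogin shell), so do not send a second event.
        echo "skip"
    fi
}

post_stop_action
```

With UPSTART_EVENT=runlevel this prints "skip", so the "runlevel 2" event is never sent and rc6 is left alone. With any other value, or with the variable unset, it prints "telinit 2".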

Interestingly, this is impossible with 0.5 because it doesn't reveal why a process was stopped, and this makes me realise that we do need that functionality.