init: job.c: 283: Assertion failed in job_changed_state: job->blocker == NULL

Bug #406408 reported by Emil Renner Berthing
This bug affects 1 person
Affects: upstart
Status: Fix Released
Importance: Critical
Assigned to: Scott James Remnant (Canonical)

Bug Description

Hi

This can happen when 'expect daemon' is used where 'expect fork' should have been used.
As a real-world example, take the following two .conf files:

dbus-system.conf:

start on starting hald
stop on starting shutdown
oom -15
respawn
expect fork
exec dbus-daemon --system --fork
post-stop script
  rm -f /var/run/dbus.pid || true
end script

hald.conf (expect daemon should be expect fork here):

start on starting slim
stop on stopping dbus-system
respawn
expect daemon
exec hald --daemon=yes

Typing 'start hald' starts the dbus-system job and then the hald job, but upstart gets the pid of hald wrong.
Now 'stop dbus-system' will hang, and after interrupting it, initctl status says
  dbus-system stop/stopping, process xxx
  hald stop/killed, process yyy
and both jobs are stuck.
Finally, 'kill xxx' will kill the dbus-daemon and trigger the assert:

init: dbus-system main process ended, respawning
init: job.c: 283: Assertion failed in job_changed_state: job->blocker == NULL
init: Caught abort, core dumped
Kernel panic - not syncing..

/Emil Renner Berthing

Scott James Remnant (Canonical) (canonical-scott) wrote :

This sounds similar to bug #387216, just following an entirely different code path. There should be no way to "escape" such that blocker is NULL, but it's clearly happening.

Changed in upstart:
importance: Undecided → Critical
status: New → Confirmed
Scott James Remnant (Canonical) (canonical-scott) wrote :

Actually the bug is far simpler than use of "expect".

If the main process dies while the job is in the "stopping" state (stopping event pending), the state is changed in the terminated handler - and thus you hit that assertion.

In the stopping state, process exits should be ignored because we're stopping anyway (just like we ignore failures in the killed state).
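
To illustrate the rule described above, here is a minimal sketch in C. It is not the actual Upstart patch: the `Job` struct, the two-value state enum, the `blocked` flag (standing in for `job->blocker != NULL`) and both `*_sketch` functions are hypothetical simplifications made up for this example. It only shows the idea that an exit of the main process while the job is already stopping is recorded but does not trigger a second state change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; these are not Upstart's real types. */
typedef enum { JOB_RUNNING, JOB_STOPPING } JobState;

typedef struct {
    JobState state;
    bool     blocked;  /* stands in for job->blocker != NULL */
    int      pid;      /* 0 once the main process is gone */
} Job;

/* Enter a new state. Like the assertion in the report, this requires
 * that the job is not waiting on an event when the change happens. */
static void job_change_state_sketch(Job *job, JobState next)
{
    assert(!job->blocked);
    job->state = next;
    if (next == JOB_STOPPING)
        job->blocked = true;  /* now waiting for the "stopping" event */
}

/* Called when the main process exits.
 * Returns true if the job state was changed, false if the exit was
 * only recorded. */
static bool main_process_terminated_sketch(Job *job)
{
    job->pid = 0;  /* the process is gone in either case */

    if (job->state == JOB_STOPPING)
        return false;  /* already stopping: do not change state again */

    job_change_state_sketch(job, JOB_STOPPING);
    return true;
}
```

In this model, calling `main_process_terminated_sketch()` on a job that is already in `JOB_STOPPING` with `blocked` set leaves the state alone. Without the early return, the same call would reach the `assert` in `job_change_state_sketch()`, which is the analogue of the failure reported here.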

Changed in upstart:
milestone: none → 0.6.3
status: Confirmed → Triaged
Scott James Remnant (Canonical) (canonical-scott) wrote :
Changed in upstart:
status: Triaged → Fix Committed
Changed in upstart:
assignee: nobody → Scott James Remnant (scott)
Scott James Remnant (Canonical) (canonical-scott) wrote :

The fix for this bug was released in Upstart 0.6.3

0.6.3 2009-08-02 "Our last, best hope for peace"

 * Fixed an assertion when a job's main process is terminated
   while in the stopping state. (Bug: #406408)

 * Fixed compilation on ia64.

 * nih-dbus-tool(1) manpage will not be installed, since the binary
   is not. (Bug: #403103)

Changed in upstart:
status: Fix Committed → Fix Released