Assertion causing kernel panic with respawn stanza and post-start/pre-stop

Reported by Sandeep on 2009-05-27
This bug affects 1 person
Affects            Importance  Assigned to
upstart            High        Scott James Remnant (Canonical)
0.3                High        Scott James Remnant (Canonical)
0.5                High        Scott James Remnant (Canonical)
Trunk              High        Scott James Remnant (Canonical)
upstart (Ubuntu)   High        Scott James Remnant (Canonical)

Bug Description

Hi,
We are using Upstart version 0.5.0.
If a job has a respawn stanza and the main process terminates before
the post-start script ends, an assertion fails, which causes a kernel
panic.
The assertion does not fail if the respawn stanza is not present.
Sample job file:
[/nobackup/spuddupa/ppcroot/nova/etc/init/jobs.d]$ cat job1
console output
respawn
respawn limit 3 20
normal exit 0

pre-start script
  echo "Printing from pre-start script"
end script
script
  echo "Printing from main script"
  # die.sh will sleep for 2 seconds and exit 100
  # The idea is that main process dies before post-start completes.
  exec /etc/init/scripts/die.sh 100 2
end script
post-start script
  echo "Printing from post-start script"
  sleep 5
end script
pre-stop script
  echo "Printing from pre-stop script"
end script
post-stop script
  echo "Printing from post-stop script"
end script

[/nobackup/spuddupa/ppcroot/nova/etc/init/scripts]$ cat die.sh
#!/bin/bash
# Sleep for SLEEPTIME seconds, then exit with status RETCODE.
if [ $# -lt 2 ]; then
    echo "Correct syntax is $0 RETCODE SLEEPTIME"
    exit 1
fi
retcode=$1
sleeptime=$2
sleep "$sleeptime"
exit "$retcode"

Output from /var/log/messages and the console is attached.
I am working on a fix for this problem and will send out the patch for review.

Sandeep (spuddupa) wrote :

This is happening because the respawn handling code comes before the checks for a running post-start script in job_process_terminated().

Moving the checks for JOB_POST_START and JOB_PRE_STOP above the respawn handling would be sufficient to avoid the assert().
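
To illustrate the ordering issue, here is a toy model in plain shell. It is not Upstart source code: the function names (respawn_handling, respawn_first, aux_checks_first) and the state strings are made up for this sketch, and the failed check inside respawn_handling only stands in for the assert(), since the report does not say exactly where the real assertion sits.

```shell
#!/bin/sh
# Toy model only; all names below are hypothetical, not Upstart's.

# Respawn handling assumes the job is in the "running" state.
# A violation of that assumption stands in for the assert().
respawn_handling() {
    if [ "$1" != "running" ]; then
        echo "assertion-failed"
        return 1
    fi
    echo "respawn"
}

# Ordering described in the report: respawn handling is reached first,
# even while a post-start or pre-stop script is still running.
respawn_first() {
    respawn_handling "$1"
}

# Proposed ordering: check for a running post-start/pre-stop script
# before any respawn handling.
aux_checks_first() {
    case "$1" in
        post-start|pre-stop)
            echo "wait-for-script"
            return 0
            ;;
    esac
    respawn_handling "$1"
}

# The main process dies while the job is still in post-start.
buggy=$(respawn_first post-start) || true
fixed=$(aux_checks_first post-start)
echo "respawn handling first: $buggy"
echo "script checks first:    $fixed"
```

With the checks first, the handler simply returns while the script is still running, so the respawn path never sees a state it does not expect.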

Changed in upstart:
importance: Undecided → High
status: New → Triaged
milestone: none → 0.5.2

It's not quite that simple, sadly.

If we just move the code up, then it avoids the assert. However, the job will now fail to respawn, because when the post-start process exits there's nothing to say the job needs to be restarted.
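
One way to picture the missing piece is a flag that remembers the main process died while a script was still running. The sketch below is a toy model in plain shell under that assumption; it is not taken from the actual fix, and every name in it (respawn_pending, main_alive, handle_main_exit, handle_post_start_exit, the state strings) is hypothetical.

```shell
#!/bin/sh
# Toy model only; not Upstart source. All names are hypothetical.

state="post-start"     # the post-start script is still running
respawn="yes"          # the job file contains the respawn stanza
main_alive="yes"
respawn_pending="no"   # assumed flag: a respawn is owed once the script ends

# Called when the main process terminates.
handle_main_exit() {
    main_alive="no"
    case "$state" in
        post-start|pre-stop)
            # A script is still running: leave the state alone and
            # only record that the job must be restarted afterwards.
            if [ "$respawn" = "yes" ]; then
                respawn_pending="yes"
            fi
            ;;
        running)
            if [ "$respawn" = "yes" ]; then
                state="respawning"
            else
                state="stopping"
            fi
            ;;
    esac
}

# Called when the post-start script terminates.
handle_post_start_exit() {
    if [ "$respawn_pending" = "yes" ]; then
        respawn_pending="no"
        state="respawning"
    elif [ "$main_alive" = "no" ]; then
        state="stopping"
    else
        state="running"
    fi
}

# Same sequence as the sample job: main exits first, post-start later.
handle_main_exit
handle_post_start_exit
echo "final state: $state"
```

Without the flag, handle_post_start_exit would have no record that the main process is gone and that a restart was wanted, which is the gap described above.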

summary: - Upstart 0.5.0. Assertion causing kernel panic with respawn stanza
+ Assertion causing kernel panic with respawn stanza and post-start/pre-stop

The fix for 0.3 has been released in 0.3.10.

Changed in upstart (Ubuntu):
assignee: nobody → Scott James Remnant (scott)
importance: Undecided → High
status: New → Triaged

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 0.3.10-1

---------------
upstart (0.3.10-1) karmic; urgency=low

  * Compilation fixes.
  * Fixed assertion caused by the post-start or pre-stop scripts
    exiting after the main process of a respawning job had exited.
    LP: #381048.

 -- Scott James Remnant <email address hidden> Wed, 17 Jun 2009 13:33:40 +0100

Changed in upstart (Ubuntu):
status: Triaged → Fix Released