Upstart publication scripts no longer run

Bug #503850 reported by Thierry Carrez on 2010-01-06
Affects              Status  Importance  Assigned to     Milestone
eucalyptus (Ubuntu)          High        Thierry Carrez
Lucid                        High        Thierry Carrez

Bug Description

20100106.2 / eucalyptus 1.6.2~bzr1120-0ubuntu3
The publication upstart jobs were recently switched to:

start on started ssh and started avahi-daemon
stop on stopping ssh or stopping avahi-daemon

However, what happens is:
lo is up
ssh and avahi-daemon start on lo
publication job starts
eth0 comes up
ssh stops
publication job stops (by design)
ssh respawns

But the publication job doesn't start again when ssh is restarted.
You end up with a system where ssh is started and avahi-daemon is started, but the publication job isn't.

Starting on eth0 up would solve the boot issue, but we would still have issues when ssh is stopped and restarted.

Thierry Carrez (ttx) on 2010-01-06
Changed in eucalyptus (Ubuntu):
assignee: nobody → Thierry Carrez (ttx)
importance: Undecided → High
status: New → Triaged
Changed in eucalyptus (Ubuntu Lucid):
milestone: none → lucid-alpha-2
Thierry Carrez (ttx) wrote :

<slangasek> ttx: because of The Upstart Bug
<ttx> slangasek: the TUB ?
<slangasek> ttx: you would have to also restart avahi-daemon in order for upstart to see again that the second half of the condition is satisfied

Let's work around it by not stopping, and respawning the publication job.
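
The behaviour slangasek describes can be replayed with a toy model. This is only a sketch, not Upstart's implementation: it assumes each operand of the "and" condition is latched when its event is seen, and that all latches are cleared once the condition fires. The names AndCondition and replay are hypothetical, introduced here for illustration.

```python
class AndCondition:
    """Toy model of an event-driven "A and B" start condition."""

    def __init__(self, *operands):
        self.seen = {name: False for name in operands}

    def emit(self, event):
        """Latch a matching event; return True once every operand is latched."""
        if event in self.seen:
            self.seen[event] = True
        if all(self.seen.values()):
            # Assumption: once the condition fires, every latch is cleared,
            # so each operand event must be seen again before the next start.
            for name in self.seen:
                self.seen[name] = False
            return True
        return False


def replay(events):
    """Return (event, job_running) pairs for a sequence of events."""
    start_on = AndCondition("started ssh", "started avahi-daemon")
    stop_on = {"stopping ssh", "stopping avahi-daemon"}
    running = False
    log = []
    for event in events:
        fired = start_on.emit(event)
        if event in stop_on:
            running = False
        elif fired:
            running = True
        log.append((event, running))
    return log


boot = ["started ssh", "started avahi-daemon",  # both come up on lo
        "stopping ssh", "started ssh"]           # ssh respawns after eth0 is up
for event, running in replay(boot):
    print(f"{event:24} publication running: {running}")
```

Under these assumptions the job ends up stopped after ssh respawns, because only one half of the condition has been seen again. Appending "stopping avahi-daemon" and "started avahi-daemon" to the sequence makes the model start the job once more, which matches the remark about having to restart avahi-daemon too.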

Changed in eucalyptus (Ubuntu Lucid):
milestone: lucid-alpha-2 → none
status: Triaged → In Progress
Thierry Carrez (ttx) wrote :

To work around The Upstart Bug, the publication upstart jobs have now been switched to:

start on started ssh and started avahi-daemon and net-device-up IFACE=eth0

That results in:
lo is up
ssh and avahi-daemon start on lo
eth0 comes up
ssh and avahi-daemon restart
publication job starts

It will still fail if avahi-daemon is stopped (the publication job will be stopped and not restarted), but that's already the case in 9.10.

Changed in eucalyptus (Ubuntu Lucid):
status: In Progress → Fix Committed

On Wed, Jan 06, 2010 at 04:33:33PM -0000, Thierry Carrez wrote:
> It will still fail if avahi-daemon is stopped (the publication job will
> be stopped and not restarted), but that's already the case in 9.10.
>

I think you need to have avahi-daemon running no matter what in order to have
avahi-publish working correctly. If avahi-daemon is stopped, the avahi-publish
job would be failing all the time.

--
Mathias Gug
Ubuntu Developer http://www.ubuntu.com

Mathias Gug (mathiaz) wrote :

In the same area, none of the -publication jobs are run after a package install. Only eucalyptus-cloud-publication works as expected: it starts on started eucalyptus-cloud and depends on neither ssh nor avahi-daemon.

So I'm not sure that "start on started ssh and started avahi-daemon and net-device-up IFACE=eth0" fixes that problem. Why not start the -publication jobs once their service counterparts are started (like in eucalyptus-cloud)?

Mathias Gug (mathiaz) wrote :

Another solution could be to drop the --noscripts option from dh_installinit in debian/rules so that relevant upstart jobs are actually started on package installation.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2~bzr1120-0ubuntu4

---------------
eucalyptus (1.6.2~bzr1120-0ubuntu4) lucid; urgency=low

  [ Thierry Carrez ]
  * debian/*publication.upstart: Start publication jobs when eth0 is up, and
    never stop them to work around The Upstart Bug (LP: #503850)

  [ Dustin Kirkland ]
  * debian/control, debian/eucalyptus-nc.upstart: (LP: #446036, #452572)
    - add a versioned depends for eucalyptus-nc on a new version
      of libvirt-bin that starts using upstart
    - start eucalyptus-nc on started libvirt-bin
 -- Dustin Kirkland <email address hidden> Wed, 06 Jan 2010 19:16:01 -0600

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Thierry Carrez (ttx) wrote :

@Mathias:
For SC/CC/Walrus/NC, the publication has to be done when "the package is installed and ssh is started" (see autoregistration spec). An SC, for example, will not fully start until credentials have been synced. On installs where SC is separated from CLC, this won't happen until registration. And registration won't happen until publication :)

Comment #5 is probably a good idea; could you open a new bug to track the "publication won't happen on package install" issue? I'm not 100% sure we /need/ to fix that, given that autoregistration will then fail (missing parent key in downstream authorized_keys). The autoregistration spec for alpha2 targets the UEC installer; after that we need to see what can be done to improve install-from-packages.
