Cloud installer / Cluster install hangs at reboot after install

Bug #430758 reported by Thierry Carrez on 2009-09-16
This bug affects 2 people
Affects              Importance  Assigned to
eucalyptus (Ubuntu)  High        Colin Watson
Karmic               High        Colin Watson

Bug Description

Using the Cloud Installer option on the server CD, cluster mode, after reboot the system hangs at:
Enabling IP forwarding * Restarting OpenBSD Secure Shell server sshd

Logging in via ssh, it appears /etc/init.d/eucalyptus-cc is stuck in a loop.
Which is not that surprising, considering this code snippet:

i=10
while ! netstat -ln -A inet,inet6 2>/dev/null | egrep -q "^tcp6?[[:space:]]+[[:digit:]]+[[:space:]]+[[:digit:]]+[[:space:]]+[^[:space:]]+:${CLOUD_PORT:-8773}[[:space:]]"; do
    i=$(($i - 1))
    sleep 1
done

CLOUD_PORT is apparently empty, so the check falls back to 8773, while the real port used is 8774. Since the counter i is decremented but never tested, this loops forever.
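
For illustration only, here is a minimal sketch of a bounded version of that wait loop. The function name wait_for_port and its arguments are hypothetical and not taken from the eucalyptus initscripts; the netstat pattern is the one from the snippet above, with grep -E in place of egrep. It gives up once the retry budget is exhausted instead of spinning forever.

```shell
#!/bin/sh
# Hypothetical helper (not from the eucalyptus packaging): poll until
# something listens on TCP port $1, checking at most $2 times (default 10),
# one second apart. Returns 0 if the port shows up, 1 otherwise.
wait_for_port() {
    port=$1
    tries=${2:-10}
    while ! netstat -ln -A inet,inet6 2>/dev/null | grep -Eq "^tcp6?[[:space:]]+[[:digit:]]+[[:space:]]+[[:digit:]]+[[:space:]]+[^[:space:]]+:${port}[[:space:]]"; do
        tries=$((tries - 1))
        # Unlike the original loop, actually test the counter.
        if [ "$tries" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}
```

An initscript could then call something like `wait_for_port "${CC_PORT:-8774}" 10 || echo "cloud port not up yet"` and carry on booting either way.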

I'm trying to understand the maze of the initscripts since simply fixing that one doesn't appear to be sufficient to ensure seamless boot messages.

Thierry Carrez (ttx) on 2009-09-16
Changed in eucalyptus (Ubuntu):
importance: Undecided → High
Dustin Kirkland  (kirkland) wrote :

Confirming, I'm seeing the same thing from a fresh install on real hardware.

:-Dustin

Changed in eucalyptus (Ubuntu):
status: New → Confirmed
Thierry Carrez (ttx) wrote :

Changing 8773 to 8774 in all initscripts fixes this.
This is only a first-boot issue, since CLOUD_PORT gets defined afterwards.

Adding VERBOSE=yes at the top of the initscripts restores the missing start/stop messages at boot.

Changed in eucalyptus (Ubuntu):
status: Confirmed → Triaged
Thierry Carrez (ttx) wrote :

Proposed patch

Thierry Carrez (ttx) wrote :

Maybe doing ${CC_PORT:-8773} instead of ${CLOUD_PORT:-8774} is a better fix. I have a hard time figuring out what CLOUD_PORT does here.

Thierry Carrez (ttx) wrote :

Fixed by Colin's recent changes to initscripts.

Changed in eucalyptus (Ubuntu):
status: Triaged → Fix Released
Colin Watson (cjwatson) wrote :

It should match euca_conf. For cluster registration that's CC_PORT; for walrus/sc registration it's apparently hardcoded to 8773.
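
As a rough sketch of that rule, and not the actual change that landed, a helper could pick the port to poll per service. The function name startup_port and the service labels cc, walrus and sc are hypothetical; the 8774 fallback for CC_PORT is an assumption based on the port observed earlier in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the port an initscript should poll for a
# given service, following the rule described in the comment above.
startup_port() {
    case "$1" in
        # Cluster registration uses CC_PORT; 8774 is an assumed default.
        # ${VAR:-default} also applies when VAR is set but empty.
        cc) echo "${CC_PORT:-8774}" ;;
        # Walrus and storage controller registration: fixed port 8773.
        walrus|sc) echo 8773 ;;
        *) return 1 ;;
    esac
}
```

The `:-` form matters here: an empty CC_PORT on first boot still yields the fallback, which is the situation that triggered this bug for CLOUD_PORT.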

Colin Watson (cjwatson) wrote :

Doesn't look fixed to me ...

Changed in eucalyptus (Ubuntu):
status: Fix Released → Triaged
status: Triaged → Fix Committed
assignee: nobody → Colin Watson (cjwatson)
Thierry Carrez (ttx) wrote :

> Doesn't look fixed to me ...
It no longer went into the infinite loop, thanks to your change at rev554... now it should be even faster.

Thierry Carrez (ttx) on 2009-09-18
Changed in eucalyptus (Ubuntu):
milestone: none → ubuntu-9.10-beta
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6~bzr808-0ubuntu1

---------------
eucalyptus (1.6~bzr808-0ubuntu1) karmic; urgency=low

  [ Dustin Kirkland ]
  * debian/eucalyptus-udeb.finish-install: eth0 should be set to
    'manual', when configured with br0 on dhcp, LP: #430820
  * tools/euca_conf.in: ensure that /var/run/eucalyptus and
    /var/run/eucalyptus/net are created at boot and have correct
    ownerships, LP: #431114, #365349

  [ Thierry Carrez ]
  * cluster/Makefile, node/Makefile: Do not patch generated stubs if you
    didn't regenerate them, to avoid spurious build interruptions.
  * tools/eucalyptus-*.in: Do not guard initscripts basic output
    messages with VERBOSE != no (LP: #431274)
  * debian/control: Have eucalyptus-cc suggest vtun for full multi-cluster
    networking capabilities (LP: #425928)

  [ Colin Watson ]
  * Align ports used for cloud startup detection in init scripts with the
    corresponding code in euca_conf (LP: #430758).

  [ Soren Hansen ]
  * New snapshot.
  * Add a build-dependency on libc3p0-java.

 -- Soren Hansen <email address hidden> Mon, 21 Sep 2009 12:14:12 +0200

Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → Fix Released
tags: added: iso-testing