Failure to autoregister components on first reboot after install

Bug #445294 reported by Thierry Carrez on 2009-10-07
Affects: eucalyptus (Ubuntu)
Importance: High
Assigned to: Thierry Carrez

Bug Description

CD install from 20091007 daily, eucalyptus-1.6~bzr912-0ubuntu2

The components failed to autoregister on first reboot after initial CD install.
The /var/log/eucalyptus/*registration.log files show only lines like:
ERROR: you need to be on the CLC host and the CLC needs to be running.

Rebooting a second time fixed that.

Looking at the /var/log/eucalyptus/cloud-*.log, it appears that it hangs trying to boot the admin interface, as the last message in those logs before reboot is:
INFO SystemBootstrapper | =====================================
                                         | Starting Final
                                         | =====================================
INFO SystemBootstrapper | -> start: class com.eucalyptus.bootstrap.HttpServerBootstrapper
INFO HttpServerBootstrapper | Starting admin interface.

and then only silence.
I don't think it's a regression in 1.6~bzr912-0ubuntu2, but rather some instability. Some installs would succeed while others would fail.

Thierry Carrez (ttx) wrote :

I'll try a new install to confirm.

Changed in eucalyptus (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Thierry Carrez (ttx) wrote :

It worked on my retry.

Timeline:
14:06:11 System startup
14:06:12 DHCPDISCOVER
14:06:15 DHCPACK
14:06:37 Port 9001 shows up on netstat (port 8774 is already up)
14:06:49 Port 8773 shows up on netstat
14:06:49 "starting admin interface"
14:07:01 Port 8443 shows up on netstat
14:07:12 Various eucalyptus-* post-start processes terminated with status 1 (bug 445361)
14:07:13 Registration requests are received (and succeed)

The post-start scripts fail at 14:07:12 after 60 tries with a sleep 1 between tries, which means those jobs were started around 14:06:12. If port 8443 still can't accept requests at 14:07:13, registration fails. I guess our timeout is a little too close to the limit, especially on first boot. Let's increase it.
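
To illustrate the retry pattern described above (a fixed number of tries with a one-second sleep between them), here is a minimal Python sketch. It is not the actual post-start script, which is not shown in this report; the function name wait_for_port and its parameters are hypothetical, and a successful TCP connect is used only as a rough stand-in for "the service can accept requests".

```python
import socket
import time


def wait_for_port(host, port, tries=60, delay=1.0, connect_timeout=1.0):
    """Return True once a TCP connect to host:port succeeds.

    Gives up and returns False after `tries` failed attempts,
    sleeping `delay` seconds between attempts.
    Hypothetical helper, for illustration only.
    """
    for attempt in range(tries):
        try:
            # A successful connect only proves the port is listening,
            # not that the service behind it is fully initialised.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            # Port not reachable yet: sleep unless this was the last try.
            if attempt < tries - 1:
                time.sleep(delay)
    return False
```

With tries=60 and delay=1.0 the wait budget is roughly one minute, which matches the 14:06:12 to 14:07:12 window in the timeline. Port 8443 only showed up at 14:07:01, so raising tries (for example to 120) would leave more margin on a slow first boot.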

Changed in eucalyptus (Ubuntu):
assignee: nobody → Thierry Carrez (ttx)
status: Incomplete → In Progress
Thierry Carrez (ttx) on 2009-10-07
Changed in eucalyptus (Ubuntu):
status: In Progress → Fix Committed
Changed in eucalyptus (Ubuntu):
status: Fix Committed → Fix Released