Failure to autoregister components on first reboot after install

Bug #445294 reported by Thierry Carrez
Affects: eucalyptus (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Thierry Carrez
Milestone: (none listed)

Bug Description

CD install from 20091007 daily, eucalyptus-1.6~bzr912-0ubuntu2

The components failed to autoregister on first reboot after initial CD install.
The /var/log/eucalyptus/*registration.log files only show lines like:
ERROR: you need to be on the CLC host and the CLC needs to be running.

Rebooting a second time fixed that.

Looking at the /var/log/eucalyptus/cloud-*.log, it appears that it hangs trying to boot the admin interface, as the last message in those logs before reboot is:
INFO SystemBootstrapper | =====================================
                                         | Starting Final
                                         | =====================================
INFO SystemBootstrapper | -> start: class com.eucalyptus.bootstrap.HttpServerBootstrapper
INFO HttpServerBootstrapper | Starting admin interface.

and then only silence.
I don't think it's a regression in 1.6~bzr912-0ubuntu2, but rather some instability. Some installs would succeed while others would fail.

Thierry Carrez (ttx) wrote :

I'll try a new install to confirm.

Changed in eucalyptus (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Thierry Carrez (ttx) wrote :

It worked on my retry.

Timeline:
14:06:11 System startup
14:06:12 DHCPDISCOVER
14:06:15 DHCPACK
14:06:37 Port 9001 shows up on netstat (port 8774 is already up)
14:06:49 Port 8773 shows up on netstat
14:06:49 "starting admin interface"
14:07:01 Port 8443 shows up on netstat
14:07:12 Various eucalyptus-* post-start processes terminated with status 1 (bug 445361)
14:07:13 Registration requests are received (and succeed)
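
To make the margins in this timeline explicit, here is a small Python sketch that computes the intervals between a few of the events above. The timestamps are copied from the timeline; the variable names are only illustrative.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps copied from the timeline above.
system_startup = datetime.strptime("14:06:11", FMT)
port_8443_up = datetime.strptime("14:07:01", FMT)
post_start_terminated = datetime.strptime("14:07:12", FMT)

# Seconds from system startup until port 8443 shows up on netstat.
startup_to_8443 = int((port_8443_up - system_startup).total_seconds())

# Seconds between port 8443 showing up and the post-start processes terminating.
margin = int((post_start_terminated - port_8443_up).total_seconds())

print(f"startup -> port 8443: {startup_to_8443}s")
print(f"port 8443 -> post-start termination: {margin}s")
```

On this (successful) boot, port 8443 came up only a few seconds before the post-start processes terminated.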

The post-start scripts fail at 14:07:12 after 60 tries with a sleep 1 between tries, which means those jobs were started around 14:06:12. If port 8443 still can't accept requests at 14:07:13, then registration would fail. I guess our timeout is a little too close to the limit, especially on first boot. Let's increase it.
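
For illustration only, the retry pattern described here (a fixed number of tries with a sleep in between) could be sketched in Python as below. This is not the actual eucalyptus post-start script; the function name wait_for_port and its parameters are hypothetical, and the defaults of 60 tries with a 1 second delay simply mirror the numbers mentioned in this comment.

```python
import socket
import time


def wait_for_port(host, port, tries=60, delay=1.0, connect_timeout=1.0):
    """Poll a TCP port until it accepts a connection.

    Returns True as soon as a connection succeeds, or False once
    `tries` attempts have all failed. Hypothetical helper, not the
    real post-start script.
    """
    for attempt in range(tries):
        try:
            # A successful connect means something is listening.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            # Refused or timed out: wait before the next try,
            # but do not sleep after the final attempt.
            if attempt < tries - 1:
                time.sleep(delay)
    return False


if __name__ == "__main__":
    # Raising `tries` widens the window, which is the kind of
    # increase suggested above; 120 is an arbitrary example value.
    if wait_for_port("127.0.0.1", 8443, tries=120):
        print("port 8443 is accepting connections")
    else:
        print("gave up waiting for port 8443")
```

With a loop like this, the total wait is bounded by roughly tries × delay (plus any connect timeouts), so a service that needs a little longer than that on first boot is missed entirely.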

Changed in eucalyptus (Ubuntu):
assignee: nobody → Thierry Carrez (ttx)
status: Incomplete → In Progress
Thierry Carrez (ttx)
Changed in eucalyptus (Ubuntu):
status: In Progress → Fix Committed
Changed in eucalyptus (Ubuntu):
status: Fix Committed → Fix Released