Comment 7 for bug 745930

Scott Moser (smoser) wrote :

Well, if you look at the attached console logs, you can see that cloud-init is retrying. In lucid, we retried 100 times with an increasing backoff, resulting in 17 minutes or so of trying. The build scripts gave up after 5 minutes.
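For readers unfamiliar with the pattern, here is a minimal Python sketch of a retry loop with an increasing backoff. It is not cloud-init's actual code: the function name `wait_for_url`, its parameters, the 2-second request timeout, and the backoff schedule (one extra second of sleep every five attempts) are all assumptions made for illustration.

```python
import time
import urllib.request


def wait_for_url(url, max_tries, fetch=None, sleep=time.sleep, timeout=2):
    """Return the response body once url answers, or None after max_tries failures.

    Hypothetical helper: fetch and sleep are injectable so the loop can be
    exercised without a network or real delays.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=timeout) as resp:
                return resp.read()

    for attempt in range(max_tries):
        try:
            return fetch(url)
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses.
            print("[%2d/%d]: url error [%s]" % (attempt + 1, max_tries, exc))
        # Assumed schedule: sleep 1s for attempts 1-5, 2s for 6-10, and so on.
        sleep(attempt // 5 + 1)
    return None
```

Calling `wait_for_url("http://169.254.169.254/2009-04-04/meta-data/instance-id", 30)` would poll the URL from the logs at most 30 times before giving up. The cap on attempts is what bounds the total wait.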

In natty (both to reduce the pain if cloud-init was installed outside of EC2, and because I had extreme amounts of data, from 2000+ instances, showing that waiting did not help), I reduced that timeout significantly, but made it configurable in /etc/cloud/cloud.cfg.

From the console logs attached:
-- lucid --
  cloud-init running: Sun, 27 Mar 2011 10:14:09 +0000. up 10.57 seconds
  waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
  10:14:11 [ 1/100]: url error [timed out]
  ...
  10:16:56 [31/100]: url error [timed out]

-- natty --
  cloud-init start running: Wed, 30 Mar 2011 17:01:16 +0000. up 10.08 seconds
  2011-03-30 17:01:18,555 - DataSourceEc2.py[WARNING]: waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
  2011-03-30 17:01:18,555 - DataSourceEc2.py[WARNING]: 17:01:18 [ 1/30]: url error [timed out]
  ...
  2011-03-30 17:03:55,735 - DataSourceEc2.py[WARNING]: 17:03:55 [30/30]: url error [timed out]
  2011-03-30 17:04:01,742 - DataSourceEc2.py[CRITICAL]: giving up on md after 165 seconds

So, at the point at which the build publishing scripts gave up, cloud-init had been retrying (after the network had already come up) for 2 minutes and 35 seconds. In the natty test, it gave up after 2 minutes 45 seconds of trying.
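As a rough sanity check on those timings, here is a small Python sketch that adds up the worst-case wait when every attempt times out. The schedule is an assumption, not something taken from the cloud-init source: each attempt is taken to cost a 2-second request timeout plus a sleep that grows by one second every five attempts. The name `total_wait_seconds` is hypothetical.

```python
def total_wait_seconds(attempts, request_timeout=2):
    """Worst-case seconds spent if every one of `attempts` tries times out.

    Assumed schedule (not confirmed by the source): each try costs
    request_timeout seconds, followed by a sleep of (i // 5 + 1) seconds.
    """
    total = 0
    for i in range(attempts):
        total += request_timeout  # the request itself timing out
        total += i // 5 + 1       # assumed backoff sleep after the attempt
    return total


print(total_wait_seconds(30))                      # 165
print(total_wait_seconds(100))                     # 1250
print(total_wait_seconds(100, request_timeout=0))  # 1050 (sleeps only)
```

Under these assumptions, 30 attempts come to 165 seconds, which agrees with the "giving up on md after 165 seconds" line in the natty log. For 100 attempts the sleeps alone add up to 1050 seconds, about 17.5 minutes.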

If the metadata service isn't up 3 minutes after the instance is booted, then there is a bug in the platform. Expecting an instance to wait 3 minutes is not acceptable.