Well, if you look at the attached console logs, you can see that cloud-init is retrying. In lucid, we retried 100 times with an increasing back off, resulting in 17 minutes or so of trying. The build scripts gave up after 5 minutes.
In natty, I reduced that timeout significantly, but made it configurable in /etc/cloud/cloud.cfg. I did that both to reduce the pain if cloud-init was installed outside of EC2, and because I had extreme amounts of data (2000+ instances) showing that waiting did not help.
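For illustration, here is a minimal Python sketch of retrying with an increasing back off. It is not cloud-init's actual code: the function names, the 2 second per-request timeout, and the "sleep one second longer every 5 attempts" schedule are all assumptions, picked only because they roughly reproduce the timings in the logs.

```python
import time
import urllib.error
import urllib.request


def sleep_before_retry(attempt, base_sleep=1.0, step_every=5):
    """Sleep after a failed attempt (0-based); grows every `step_every` tries.

    Hypothetical schedule, chosen to roughly match the logged timings.
    """
    return base_sleep * (attempt // step_every + 1)


def worst_case_seconds(max_tries, timeout=2.0):
    """Upper bound on the total wait if every single attempt times out."""
    return sum(timeout + sleep_before_retry(i) for i in range(max_tries))


def http_fetch(url, timeout):
    """Return the response body as text; raises on error or timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode()


def wait_for_metadata(url, max_tries=30, timeout=2.0,
                      fetch=http_fetch, sleep=time.sleep):
    """Poll `url` until it answers; return the body, or None after max_tries."""
    for attempt in range(max_tries):
        try:
            return fetch(url, timeout)
        except (urllib.error.URLError, OSError) as exc:
            print(f"[{attempt + 1:2d}/{max_tries}]: url error [{exc}]")
            sleep(sleep_before_retry(attempt))
    return None
```

Under these assumptions, worst_case_seconds(30) comes to 165 seconds, in line with the "giving up on md after 165 seconds" message from natty. For 100 tries the sleeps alone add up to 1050 seconds (17.5 minutes), which is close to the "17 minutes or so" in lucid.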
From the console logs attached:
-- lucid --
cloud-init running: Sun, 27 Mar 2011 10:14:09 +0000. up 10.57 seconds
waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
10:14:11 [ 1/100]: url error [timed out]
...
10:16:56 [31/100]: url error [timed out]
-- natty --
cloud-init start running: Wed, 30 Mar 2011 17:01:16 +0000. up 10.08 seconds
2011-03-30 17:01:18,555 - DataSourceEc2.py[WARNING]: waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
2011-03-30 17:01:18,555 - DataSourceEc2.py[WARNING]: 17:01:18 [ 1/30]: url error [timed out]
...
2011-03-30 17:03:55,735 - DataSourceEc2.py[WARNING]: 17:03:55 [30/30]: url error [timed out]
2011-03-30 17:04:01,742 - DataSourceEc2.py[CRITICAL]: giving up on md after 165 seconds
So, at the point at which the build publishing scripts gave up, cloud-init had been re-trying (after network had already come up) for 2 minutes and 35 seconds. In the natty test, it gave up after 2 minutes 45 seconds of trying.
If the metadata service isn't up 3 minutes after the instance is booted, then there is a bug in the platform. Expecting an instance to wait 3 minutes is not acceptable.