no error message to console when cloud-config-url fails to load

Bug #1303934 reported by James Troup on 2014-04-07
This bug affects 6 people
Affects              Status  Importance  Assigned to  Milestone
cloud-init                   Medium      Unassigned
cloud-init (Ubuntu)          Medium      Unassigned

Bug Description

When booting an Ubuntu 12.04 ephemeral into MAAS commissioning on a
node which was unable to reach the region controller, I got the
following traceback:

| Can not apply stage final, no datasource found! Likely bad things to come!
| ------------------------------------------------------------
| Traceback (most recent call last):
| File "/usr/bin/cloud-init", line 315, in main_modules
| init.fetch()
| File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 302, in fetch
| return self._get_data_source()
| File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 234, in _get_data_source
| pkg_list)
| File "/usr/lib/python2.7/dist-packages/cloudinit/sources/__init__.py", line 212, in find_source
| raise DataSourceNotFoundException(msg)
| DataSourceNotFoundException: Did not find any data source, searched classes: ()
| ------------------------------------------------------------

Fuller log attached.

James Troup (elmo) wrote :
Scott Moser (smoser) wrote :

I have to mark this as fix-released. I'm not really sure how you could have seen the output you saw, and I just verified that cloud-init in trusty gives a saner message. Here's how:
 sudo dpkg-reconfigure cloud-init # select 'Azure' on an openstack instance, something that will not be found
 sudo cloud-init init --local
 sudo cloud-init init

This shows on stderr:
 2014-06-03 13:28:56,208 - util.py[WARNING]: No instance datasource found! Likely bad things to come!
and /var/log/cloud-init.log shows:

Jun 3 13:27:10 inst-trusty-20140602-191433 [CLOUDINIT] util.py[DEBUG]: No instance datasource found! Likely bad things to come!#012Traceback (most recent call last):#012 File "/usr/bin/cloud-init", line 242, in main_init#012 init.fetch()#012 File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 308, in fetch#012 return self._get_data_source()#012 File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 236, in _get_data_source#012 pkg_list)#012 File "/usr/lib/python2.7/dist-packages/cloudinit/sources/__init__.py", line 260, in find_source#012 raise DataSourceNotFoundException(msg)#012DataSourceNotFoundException: Did not find any data source, searched classes: (DataSourceAzureNet)

See here, the 'searched classes' is not empty.
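
For anyone comparing their own logs against the two tracebacks above, here is a minimal sketch for making the syslog line readable and pulling out the 'searched classes' list. It assumes the `#012` sequences are syslog's octal escape for a newline; the helper names `decode_syslog_escapes` and `searched_classes` are made up for illustration and are not part of cloud-init.

```python
import re


def decode_syslog_escapes(line):
    """Turn #NNN octal escapes (e.g. #012 for newline) back into characters.

    Assumption: every '#' followed by three octal digits is an escape.
    A literal '#012' in the original message would also be converted.
    """
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)


def searched_classes(message):
    """Return the class names listed after 'searched classes:', or None.

    An empty list means the exception reported '()' as in the original
    report; None means the message did not contain the phrase at all.
    """
    match = re.search(r"searched classes: \(([^)]*)\)", message)
    if match is None:
        return None
    return [name.strip() for name in match.group(1).split(",") if name.strip()]
```

Run against the exception text from the original report this gives an empty list, and against the trusty log line it gives a list containing `DataSourceAzureNet`.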

The other part of this bug (what I actually think happened) is that cloud-init did not sanely warn that it tried to read something from the cloud_config_url and was unable to get it.

I'm just going to re-purpose this bug to that.

Changed in cloud-init (Ubuntu):
importance: Undecided → Medium
status: New → Fix Released
status: Fix Released → Confirmed
Changed in cloud-init:
importance: Undecided → Medium
status: New → Confirmed
summary: - traceback when unable to reach metadata server
+ no error message when cloud_config_url fails to load
summary: - no error message when cloud_config_url fails to load
+ no error message to console when cloud-config-url fails to load
Scott Moser (smoser) wrote :

The fix as I see it is just that we need to scream fairly loudly in read_write_cmdline_url:
if there was a cloud-config-url on the kernel command line and we failed to get it, then this is bad news and we should do everything we can to alert the user.
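
Roughly, the behaviour could look like the sketch below. This is not the cloud-init code: `find_cloud_config_url`, `default_fetch` and `fetch_cmdline_config` are hypothetical names, the warning text is invented, and the fetcher is passed in as a parameter only so the failure path can be exercised without a network.

```python
import sys
import urllib.request

CMDLINE_KEY = "cloud-config-url"  # the key named in this bug's title


def find_cloud_config_url(cmdline):
    """Return the value of cloud-config-url=... from a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == CMDLINE_KEY:
            return value
    return None


def default_fetch(url, timeout=10):
    """Plain HTTP GET; raises on connection errors and HTTP error statuses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_cmdline_config(cmdline, fetch=default_fetch, console=None):
    """Fetch the URL given on the command line, warning loudly if that fails.

    Returns the fetched bytes, or None if no URL was given or the fetch failed.
    """
    if console is None:
        console = sys.stderr
    url = find_cloud_config_url(cmdline)
    if url is None:
        return None  # nothing was requested, so there is nothing to warn about
    try:
        return fetch(url)
    except Exception as exc:
        # The user explicitly asked for this URL, so a failure must be visible.
        console.write(
            "WARNING: %s=%s was given on the kernel command line "
            "but could not be fetched: %s\n" % (CMDLINE_KEY, url, exc)
        )
        console.flush()
        return None
```

Something like `fetch_cmdline_config(open("/proc/cmdline").read())` would be the call site. The point is only that the failure branch writes to something the user will actually see, rather than to a debug-level log.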

mahmoh (mahmoh) wrote :

I think I just hit this problem with MAAS 1.7.2 on a trusty virtual setup on my laptop. I'll hopefully keep this setup around for a few days to test a Chef bug, so let me know if you need any more information:

ubuntu@maas-node-1:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty
ubuntu@maas-node-1:~$ dpkg -l | grep cloud-init
ii cloud-init 0.7.5-0ubuntu1.5 all Init scripts for cloud instances
ii cloud-initramfs-dyn-netconf 0.25ubuntu1 all write a network interface file in /run for BOOTIF
ubuntu@maas-node-1:~$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.13.0-35-generic root=UUID=ebef096d-3483-4027-abf4-d0621f6f7144 ro
ubuntu@maas-node-1:~$ tail -20 /var/log/cloud-init.log
2015-05-31 17:19:17,681 - url_helper.py[DEBUG]: Please wait 5 seconds while we wait to try again
2015-05-31 17:19:22,685 - url_helper.py[DEBUG]: [0/1] open 'http://192.168.101.2/MAAS/metadata//2012-03-01/meta-data/instance-id' with {'url': 'http://192.168.101.2/MAAS/metadata//2012-03-01/meta-data/instance-id', 'headers': {'Authorization': 'OAuth realm="", oauth_nonce="26709034", oauth_timestamp="1433092762", oauth_consumer_key="uU9KHB7kPK24yZcFrx", oauth_signature_method="PLAINTEXT", oauth_version="1.0", oauth_token="gwF4Eh6beNEqXcg3f6", oauth_signature="%262bj652WEP7dp3NHKxsHZVRaFmhCeUv79"'}, 'allow_redirects': True, 'method': 'GET', 'timeout': 4.0} configuration
2015-05-31 17:19:25,685 - url_helper.py[DEBUG]: Calling 'http://192.168.101.2/MAAS/metadata//2012-03-01/meta-data/instance-id' failed [118/120s]: request error [HTTPConnectionPool(host='192.168.101.2', port=80): Max retries exceeded with url: /MAAS/metadata//2012-03-01/meta-data/instance-id (Caused by <class 'socket.error'>: [Errno 113] No route to host)]
2015-05-31 17:19:25,685 - url_helper.py[DEBUG]: Please wait 5 seconds while we wait to try again
2015-05-31 17:19:30,690 - DataSourceMAAS.py[CRITICAL]: Giving up on md from ['http://192.168.101.2/MAAS/metadata//2012-03-01/meta-data/instance-id'] after 123 seconds
2015-05-31 17:19:30,697 - util.py[WARNING]: No instance datasource found! Likely bad things to come!
2015-05-31 17:19:30,697 - util.py[DEBUG]: No instance datasource found! Likely bad things to come!
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 242, in main_init
    init.fetch()
  File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 308, in fetch
    return self._get_data_source()
  File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 236, in _get_data_source
    pkg_list)
  File "/usr/lib/python2.7/dist-packages/cloudinit/sources/__init__.py", line 260, in find_source
    raise DataSourceNotFoundException(msg)
DataSourceNotFoundException: Did not find any data source, searched classes: (DataSourceMAAS)
2015-05-31 17:19:30,702 - util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
2015-05-31 17:19:30,703 - util.py[DEBUG]: Read 14 bytes from /proc/uptime
2015-05-31 17:19:30,703 - util.py[DEBUG]: cloud-init mode 'init' took 124.041 seconds (124.04)
ubuntu@maas-node-1:/var/lib/cloud/data$ cat...


mahmoh (mahmoh) wrote :

Upon further investigation (this is going to seem odd): I had two disks per VM (for Autopilot) of differing types (for testing), and the Deployed root disk was set to sdb, whereas sda was used as the initial Deploying target (new/different bug?). Note: I have renamed the nodes since the output above to avoid confusing myself, but these are the same nodes.

Deploying:

overlayroot on / type overlayfs (rw,lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
/dev/sdc on /media/root-ro type ext4 (ro)
tmpfs-root on /media/root-rw type tmpfs (rw,relatime)
none on /sys/fs/pstore type pstore (rw)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
/dev/sda1 on /tmp/tmpHpNIFU/target type ext4 (rw)
/dev on /tmp/tmpHpNIFU/target/dev type none (rw,bind)
/proc on /tmp/tmpHpNIFU/target/proc type none (rw,bind)
/sys on /tmp/tmpHpNIFU/target/sys type none (rw,bind)

Deployed:

/dev/sdb1 on / type ext4 (rw)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
overflow on /tmp type tmpfs (rw,size=1048576,mode=1777)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)

So I'm unsure whether the renaming and/or the disk reconfiguration (two identical SATA emulations) fixed it, but now it works.
