cloud-init doesn't start when no data source is present

Bug #1892171 reported by Hari Sundararajan
This bug affects 2 people
Affects      Status    Importance  Assigned to  Milestone
cloud-init   Expired   Wishlist    Unassigned

Bug Description

I am using KVM (via libvirt)

1. I take bionic-server-cloudimg-amd64.img from cloud-images.ubuntu.com

2. I create a simple user-data file
----
#cloud-config

users:
        - default
        - name: ubuntu
          ssh-authorized-keys:
                  - ssh-rsa <contents of id_rsa.pub>
----

3. I create the seed ISO.
----
cloud-localds --hostname TEMPLATE --verbose seed.iso user-data
----

4. I boot the VM. Within the VM I run some provisioning ...
----
cloud-init status --wait
# Some provisioning here
apt-get install some devel libraries
truncate -s 0 /etc/machine-id
rm -rf /etc/netplan/*
rm -rf /etc/ssh/ssh_host_*
cloud-init clean --seed --logs
----

5. I shut down the VM. This, in theory, should become my template from which I should be able to clone new VMs, correct? However, when I clone and boot it, cloud-init does not run on the clones at all.

In /run/cloud-init/cloud.cfg I see
----
di_report:
  datasource_list: [ ]
  # reporting not found result. notfound=disabled
----

In /run/cloud-init/cloud-init-generator.log I see (among other things)
----
ds-identify rc=1
ds-identify_RET=notfound
cloud-init is enabled but no datasource found. disabling
----

Finally, in /run/cloud-init/ds-identify.log I see (among other things)
----
DSNAME=
DSLIST=NoCloud ConfigDrive OpenNebula DigitalOcean Azure AltCloud OVF MAAS GCE OpenStack CloudSigma SmartOS Bigstep Scaleway AliYun Ec2 CloudStack Hetzner IBMCloud Oracle Exoscale None
Mode=search
is_container=false
is_ds_enabled(IBMCloud) = true
ec2 platform is 'Unknown'
No ds found [mode=search,notfound=disabled]. Disabled cloud-init [1]
----

DSLIST in the above matches what is in /etc/cloud/cloud.cfg.d/90_dpkg.cfg but it is not recognized at all.
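
When comparing logs across clones, it can help to pull out just the lines that matter. The following is a minimal sketch, assuming the log has the same shape as the excerpt above; the function name `summarize_ds_identify` is made up for illustration and is not part of cloud-init.

```shell
# summarize_ds_identify [LOGFILE]
# Hypothetical helper: prints the datasource list that was considered and
# the mode/notfound values from the final "No ds found" line, if present.
summarize_ds_identify() {
    log="${1:-/run/cloud-init/ds-identify.log}"
    # The DSLIST= line shows which datasources were considered.
    sed -n 's/^DSLIST=\(.*\)$/considered: \1/p' "$log"
    # The "No ds found" line carries the search mode and the notfound policy.
    sed -n 's/.*No ds found \[mode=\([a-z]*\),notfound=\([a-z]*\)].*/mode=\1 notfound=\2/p' "$log"
}
```

Called without arguments inside the VM, it reads /run/cloud-init/ds-identify.log; a different path can be passed as the first argument.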

However, as mentioned in https://bugs.launchpad.net/cloud-init/+bug/1876375 , if I do this instead

----
echo 'datasource_list: [ NoCloud, None ]' > /etc/cloud/cloud.cfg.d/90_dpkg.cfg
----

Then everything works just as I want it. The clones boot up properly, there's no need to attach any seed ISO file, and we are able to log in.

Why is it that when 90_dpkg.cfg contains [ NoCloud, None]
the cloning process works and cloud-init starts, but when 90_dpkg.cfg has a long list of entries (all of which are irrelevant), it chooses to not even start?

(NOTE: This is on Ubuntu 18, /usr/bin/cloud-init 20.2-45-g5f7825e2-0ubuntu1~18.04.1 from some daily build)

Ryan Harper (raharper) wrote :

Hi Hari,

Thanks for filing a bug.

> 4. I boot the VM. Within the VM I run some provisioning ...

Can you provide more details on how you're booting the VM? I ask because cloud-init is designed to *not* run unless it detects a datasource. The NoCloud datasource is detected in a few ways:

1) filesystem label found with 'cidata' or 'CIDATA'
2) DMI Product Serial includes ds=nocloud*
3) /var/lib/cloud/seed/nocloud-* directory exists
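
The three checks above can be approximated with a small shell function. This is a rough sketch and not what ds-identify actually runs: the function name `check_nocloud_hints` and its optional ROOT prefix are made up for illustration, the sysfs path for the DMI serial and the `nocloud*` glob are assumptions based on the list above, and the label check is only attempted on the live system when `blkid` is available.

```shell
# check_nocloud_hints [ROOT]
# Hypothetical helper. ROOT is an optional path prefix (default: empty,
# meaning the live system) so the checks can be tried on a scratch tree.
# Prints one line per hint found; returns 0 if any hint was found, else 1.
check_nocloud_hints() {
    root="${1:-}"
    found=1

    # 1) A filesystem labelled cidata or CIDATA (live system only).
    if [ -z "$root" ] && command -v blkid >/dev/null 2>&1; then
        for label in cidata CIDATA; do
            if dev=$(blkid -L "$label" 2>/dev/null); then
                echo "label: $label on $dev"
                found=0
            fi
        done
    fi

    # 2) DMI product serial containing ds=nocloud (usually readable by root only).
    serial_file="$root/sys/class/dmi/id/product_serial"
    if [ -r "$serial_file" ]; then
        case "$(cat "$serial_file")" in
            *ds=nocloud*)
                echo "dmi: product serial contains ds=nocloud"
                found=0
                ;;
        esac
    fi

    # 3) A seed directory under /var/lib/cloud/seed.
    for d in "$root"/var/lib/cloud/seed/nocloud*; do
        if [ -d "$d" ]; then
            echo "seed: $d"
            found=0
        fi
    done

    return "$found"
}
```

Running `check_nocloud_hints` as root inside a clone that prints nothing would be consistent with the "no datasource found" result in the logs.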

> cloud-init clean --seed --logs

This operation will *remove* any seeds from /var/lib/cloud/seed/* in your image. If this was how you were telling cloud-init to run, you've now removed it, and future boots of this image will not run cloud-init because you've not provided a datasource that will activate it.

> Why is it that when 90_dpkg.cfg contains [ NoCloud, None]
> the cloning process works and cloud-init starts, but when 90_dpkg.cfg has a long list of entries (all of which are irrelevant), it chooses to not even start?

cloud-init reads /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/*.cfg. If something in those config files sets the datasource_list to a single datasource (like you did), then cloud-init assumes that someone has configured a specific datasource and will always activate.
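
The rule can be sketched as a tiny predicate. This is an illustration inferred from this thread, not the ds-identify source: it assumes that a list counts as "a single datasource" when it has exactly one entry, or one entry followed by None (which is what `[ NoCloud, None ]` is). The function name `is_single_datasource` is hypothetical.

```shell
# is_single_datasource ENTRY...
# Hypothetical helper: returns 0 when the list names exactly one datasource,
# optionally followed by None; returns 1 for anything longer.
is_single_datasource() {
    if [ "$#" -eq 1 ]; then
        return 0
    fi
    if [ "$#" -eq 2 ] && [ "$2" = "None" ]; then
        return 0
    fi
    return 1
}
```

Under this reading, `is_single_datasource NoCloud None` succeeds, while `is_single_datasource NoCloud OpenStack None` does not, so the longer list falls back to searching for a datasource.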

The goal for cloud-init is to allow an image to be re-used on any number of platforms. If you've customized an image using NoCloud and then boot it on Ec2 or Azure, it should work there (but will use the correct platform datasource rather than NoCloud).

I'm going to mark this bug invalid as it appears that cloud-init is working as designed, but please change it back to New if you believe that it is not.

Changed in cloud-init:
status: New → Incomplete
Hari Sundararajan (hsunda3) wrote :

> Can you provide more details on how you're booting the VM?
> I ask because cloud-init is designed to *not* run unless it detects a datasource

I create an image that is expected to boot in 3 locations.

Location A: Openstack
Location B: An environment with CIDATA (NoCloud)
Location C: An environment where there is no data source whatsoever. No kernel command line references to cloud-init, no CIDATA filesystem, no SMBIOS modification... nothing.

In all 3 cases, I want cloud-init to activate, because I want some cloud-init functionality (ssh key generation, DHCP request on first network interface, file system increase / growpart and so on).

Obviously, in the OpenStack environment (Location A) I want it to honor the OpenStack data source. In Location B, I want cloud-init to honor CIDATA. In Location C, I want it to get triggered and run its functionality, but without a data source.

Is there a way to achieve this? I thought baking /etc/cloud/cloud.cfg.d/90_dpkg.cfg with

   datasource_list: [ NoCloud, OpenStack, None ]

would do this. It doesn't. Is that expected behavior?

Ryan Harper (raharper) wrote :

> I create an image that is expected to boot in 3 locations.
...
> Is there a way to achieve this?

Yes. You may want to add ConfigDrive to your ds list (OpenStack may use ConfigDrive as well).

And if you always want cloud-init to run then you create:

/etc/cloud/ds-identify.cfg with content

policy: enabled

Which will enable cloud-init always. Your changes to datasource_list will tell cloud-init to only look for those specific datasources.
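
Put together, the two files could be baked into a template roughly as follows. This is a sketch under assumptions: the function name `write_always_on_config` and its optional ROOT prefix are made up for illustration, the datasource order is just one plausible choice for the three locations described earlier, and 90_dpkg.cfg is used only because it is the file already edited in this report. Note also that ds-identify.cfg is not a documented or stable interface.

```shell
# write_always_on_config [ROOT]
# Hypothetical helper. ROOT is an optional path prefix (default: empty,
# meaning the live system, which requires root privileges).
write_always_on_config() {
    root="${1:-}"
    mkdir -p "$root/etc/cloud/cloud.cfg.d"

    # Always enable cloud-init, even when no datasource is detected.
    printf '%s\n' 'policy: enabled' > "$root/etc/cloud/ds-identify.cfg"

    # Only look for the datasources this image is expected to meet.
    printf '%s\n' 'datasource_list: [ NoCloud, ConfigDrive, OpenStack, None ]' \
        > "$root/etc/cloud/cloud.cfg.d/90_dpkg.cfg"
}
```

Trying it against a scratch directory first (for example `write_always_on_config /tmp/imgroot`) shows exactly which files would be written before touching the template itself.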

Scott Moser (smoser) wrote :

ds-identify.cfg is undocumented by design.

I think that the None datasource was really a bad idea. Thanks to ds-identify, it doesn't come into play except in error cases now.

If we needed to support something like what Hari is after, I'd suggest a new datasource that didn't spew warnings about "something must have gone wrong". And I would not enable it by default in cloud-init. Then, if enabled explicitly, it could be last in the order.

Even then, though, it would be difficult, because in order to generate networking configuration ("dhcp on eth0") it would need to run at the local stage. I think that to accomplish it, we'd have to have all datasources moved to local.

Hari Sundararajan (hsunda3) wrote :

Perfect, thank you for the explanation. Apologies for the churn, my understanding of the documentation was wrong.

policy: enabled indeed lets cloud-init run everywhere. In my environment with no data source it spins for a while trying to access AWS endpoints, but it still runs, so that satisfies my requirements.

thank you!

Paride Legovini (paride) wrote :

Thanks Hari, Scott and Ryan. I'm happy that Hari has a working solution, but the discussion above has useful pointers on things we can improve (see Scott's last comment), and I think the documentation can be improved too. I'm marking this bug as Triaged with low importance, so hopefully we won't lose track of it.

Changed in cloud-init:
status: Incomplete → Triaged
importance: Undecided → Wishlist
Scott Moser (smoser) wrote : Re: [Bug 1892171] Re: cloud-init doesn't start when no data source is present

I'd just like to explicitly say that documenting ds-identify.cfg is
*not* desirable. ds-identify does not have stable and user-modifiable
configuration.


Hari Sundararajan (hsunda3) wrote :

> I'd suggest a new datasource that didn't spew warnings about "something must have gone wrong". And I would not enable it by default in cloud-init. Then, if enabled explicitly it could be last in the order.

To be honest, this was my original thought process. I thought that by including "None" in /etc/cloud/cloud.cfg.d/90_dpkg.cfg, the "None" data source gets "enabled" and runs upon detecting nothing else. Also, https://cloudinit.readthedocs.io/en/latest/topics/datasources/fallback.html says "This is the fallback datasource when no other datasource can be selected".

I am not familiar with the inner workings, so I cannot comment on the challenges involved with this (or address your comments like "having to have all datasources moved to local"), but given all the useful things cloud-init does, I would definitely appreciate a data source that behaves as though an empty string was given as user data in the absence of anything else.

James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired