On GCE, ds.metadata.instance_data cannot be used for templating

Bug #1834371 reported by shine
Affects: cloud-init
Status: Expired
Importance: Wishlist
Assigned to: shine
Milestone: (none listed)

Bug Description

When running on GCE, the templating engine has a ds.metadata.instance_data attribute available to it. However, that attribute is a string representation of a dictionary (of GCE-specific metadata), rather than a dictionary. This means that the contents of instance_data cannot readily be used in templating.

This is a feature request for instance_data to be presented as a dict rather than as an opaque string.

[Original Report]

I have a use-case where I need to query the metadata that was passed on creating the instance and use those values as variables in my cloud-config.

However, ds.meta_data.instance_data is a string, which means it cannot be queried from cloud-config. Simply removing the enclosing quotes, and the escaping around the quotes within the dictionary itself, would make the instance_data field easily query-able.

Please consider making the instance_data field query-able.

Environment Information :
Operating System : Ubuntu 18.04.2 LTS
Cloud Init : 19.1-1-gbaa47854-0ubuntu1~18.04.1

Scott Moser (smoser) wrote :

Can you please run 'cloud-init collect-logs' and attach the output?
That will give more info on your request.
Then set bug back to 'New'.

Thanks.

Changed in cloud-init:
status: New → Incomplete
Chad Smith (chad.smith) wrote :

The output of 'cloud-init query ds.meta_data' on that system would help here too. Please check that content for secrets before posting it.

shine (divine-shine) wrote :

Please find the cloud-init logs attached.

I believe cloud-init-logs-2019-07-12/run/cloud-init/instance-data.json has all the data that 'cloud-init query ds.meta_data' would give, and more, so I'm not pasting the data here again. I hope this is enough to look into the issue.

What I'm asking for is to make ds.meta_data.instance_data a map, rather than encoding it into a string, which makes it harder to query and use. I don't see why ds.meta_data.instance_data would ever need to be a string field. It should always be a map, since it holds all the metadata passed to the instance at instantiation.

Dan Watkins (oddbloke) wrote :

Hi shine,

Thanks for attaching the logs! I don't fully understand what it is you're asking for here. To help me understand, could you give a concrete example of something that you would like to be able to do that you can't today?

Thanks!

Dan

shine (divine-shine) wrote :

ds.meta_data is a dict of key-value pairs populated by the cloud provider. Within ds.meta_data there is also a key called instance_data, which holds the custom metadata that the user passes to the instance during initialization.

All the values within the ds.meta_data key are encoded as strings. However, the content of ds.meta_data.instance_data will always be a dict.

What I want is to query the metadata values that are passed to the instance during initialization, viz. ds.meta_data.instance_data.

From the instance_data in the attached logs, what I'd like to fetch is ds.meta_data.instance_data.env, ds.meta_data.instance_data.cookie, ds.meta_data.instance_data.rabbitmq_version, etc.

However, with the current version, the maximum depth I can reach is ds.meta_data.instance_data, whose value is a string, though it is really a dict encoded into a string.

What I'm asking for is to change the data type of the instance_data value to a dict so that it is query-able.
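
As an illustration of the limitation described above, here is a minimal Python sketch of a lookup that walks a dotted path and decodes a string-encoded dict when it meets one. It assumes that the instance_data string is valid JSON, which this report does not confirm. The `lookup` function and the sample values are hypothetical and are not part of cloud-init; only the key names come from this comment.

```python
import json


def lookup(data, dotted_path):
    """Walk a dotted path, JSON-decoding any string met before the path ends."""
    current = data
    for key in dotted_path.split("."):
        if isinstance(current, str):
            # A string in the middle of the path: assume it holds a JSON object.
            try:
                current = json.loads(current)
            except ValueError:
                raise KeyError(dotted_path)
        if not isinstance(current, dict) or key not in current:
            raise KeyError(dotted_path)
        current = current[key]
    return current


# Hypothetical sample shaped like the ds.meta_data section of instance-data.json.
# The values are invented for illustration.
sample = {
    "ds": {
        "meta_data": {
            "instance_data": json.dumps({"env": "staging", "cookie": "abc"}),
        }
    }
}

env = lookup(sample, "ds.meta_data.instance_data.env")
raw = lookup(sample, "ds.meta_data.instance_data")
```

Here `raw` is still the opaque string that templating sees today, while `env` is only reachable because the sketch decodes that string on the way down.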

Changed in cloud-init:
status: Incomplete → New
Dan Watkins (oddbloke) wrote :

Hi shine,

I've tried updating the description of the bug to match my understanding. Could you review it and let me know if you're happy with the changes I've made?

Thanks!

Dan

summary: - instance-data : string value makes querying difficult
+ On GCE, ds.metadata.instance_data cannot easily be used for templating
summary: - On GCE, ds.metadata.instance_data cannot easily be used for templating
+ On GCE, ds.metadata.instance_data cannot be used for templating
description: updated
Changed in cloud-init:
status: New → Triaged
importance: Undecided → Wishlist
shine (divine-shine) wrote :

The instance_data key does not contain GCE-specific metadata. It contains all the metadata that the user provides during provisioning. I also don't think that the key is GCE-specific, though I'm not 100% sure. Let me check with AWS and see how cloud-init represents user-provided metadata there.

shine (divine-shine) wrote :

Dan,

You were correct in updating the description of the report. I checked with AWS and found that there were no fields for custom user-provided metadata passed into the instance. The instance_data field is GCE-specific, contrary to my initial assumption that it was a common field for all cloud providers.

I did notice that multiple fields in the AWS ds.meta_data are represented as dicts, viz. block_device_mapping, network (nested dict), public_keys (nested dict) and services. So it is technically possible to make the instance_data field from GCE a dict (or a nested dict, for that matter).

I'm not aware of how the cloud-init code works. However, if this is not a major change (and if I can get some guidance), I can try to contribute the change as a pull request. I believe it should work with a method to parse the field in cloudinit/sources/DataSourceGCE.py.
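
A minimal sketch of the kind of parsing method suggested above, assuming the instance_data value is a JSON-encoded object. The function name `parse_instance_data` and the sample values are hypothetical; this is not cloud-init code, and a real change would have to fit the way DataSourceGCE.py builds its metadata.

```python
import json


def parse_instance_data(meta_data, key="instance_data"):
    """Return a copy of meta_data with meta_data[key] decoded to a dict.

    The value is left untouched when it is missing, not a string,
    not valid JSON, or valid JSON that is not an object.
    """
    result = dict(meta_data)
    raw = result.get(key)
    if not isinstance(raw, str):
        return result
    try:
        decoded = json.loads(raw)
    except ValueError:
        # Not JSON after all: keep the original string.
        return result
    if isinstance(decoded, dict):
        result[key] = decoded
    return result


# Invented sample values, for illustration only.
md = {"instance_data": '{"env": "staging"}', "other": "x"}
parsed = parse_instance_data(md)
```

Falling back to the original string when decoding fails keeps the sketch safe for values that are not JSON, at the cost of leaving those values opaque.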

Dan Watkins (oddbloke) wrote :

Hi shine,

Thanks for checking on that! We'd love for you to contribute a fix. https://cloudinit.readthedocs.io/en/latest/topics/hacking.html gives you a guide on how to get started with cloud-init development in general, so I would run through that first.

This isn't an area of cloud-init that I'm all that familiar with, but I would suggest that your intuition about it being in DataSourceGCE is correct, and these lines in read_md are probably what you need to be examining: https://github.com/cloud-init/cloud-init/blob/a02c0c9aa24a16f1983a81fe5dbfadac3d7e0ad3/cloudinit/sources/DataSourceGCE.py#L189-L229. It looks to me like `md` there has only string values. Don't take this as gospel though, I may be mistaken!

Thanks!

Dan

Changed in cloud-init:
assignee: nobody → shine (divine-shine)
James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired