OpenStack detection broken on VMware

Bug #1788487 reported by Mark
This bug affects 2 people
Affects              Status     Importance  Assigned to  Milestone
cloud-init           Expired    Medium      Unassigned
cloud-init (Ubuntu)  Confirmed  Medium      Unassigned

Bug Description

The OpenStack data source in cloud-init 18.3 was updated to detect OpenStack systems before probing the metadata endpoint. This broke for several cloud platforms (e.g. LP: #1784685) and is also broken for VMware.

As can be seen in LP: #1669875 there is no way to discern OpenStack on VMware from DMI data. A way around this could be to generally add 'VMware Virtual Platform' to the list of valid DMI platform names (`VALID_DMI_PRODUCT_NAMES`). This should still avoid the OpenStack metadata probe on most (if not all) other cloud platforms, as was the original goal in
https://git.launchpad.net/cloud-init/commit/?id=1efa8a0a030794cec68197100f31a856d0d264ab.

Mark (praseodym)
description: updated
Revision history for this message
Scott Moser (smoser) wrote :

You should be able to specifically set the datasource in /etc/cloud/cloud.cfg.d/*.cfg
with:
 datasource_list: ['OpenStack', 'None']

I know this is not ideal, but in lieu of some mechanism for detecting that this platform provides an openstack metadata service on http://169.254.169.254/openstack it is currently the best option.

You mention "several cloud platforms" but list only vmware.

Can you give more details on other platforms?

Please also attach the output of 'cloud-init collect-logs' from any platform where you see a failure.

Thanks.

Changed in cloud-init:
status: New → Incomplete
Revision history for this message
Mark (praseodym) wrote :

Let me try to clarify the original issue. Commit https://git.launchpad.net/cloud-init/commit/?id=1efa8a0a030794cec68197100f31a856d0d264ab introduced a mechanism where, for performance reasons, the OpenStack metadata service is never queried if the host platform does not _look like_ OpenStack from a DMI information check. This check happens even when `datasource_list` explicitly contains `OpenStack`.

OpenStack exists in many forms, both on-premises [1, 2] and at hosting providers [3, 4]. Every one of those platforms is slightly different, for example in choice of hypervisor. The cloud-init DMI information check only takes into account a few different DMI fields and values. This means that it is very likely to fail to identify one of those different platforms as OpenStack, even though most (if not all) expose the OpenStack metadata service. LP #1784685 (already fixed) provides one such example of a platform (Oracle Cloud) that was not correctly detected.

VMware Integrated OpenStack (VIO) [5] is another platform that is currently not identified as an OpenStack platform. Because the DMI check returns false, the metadata service is never queried and cloud-init fails to retrieve any data. Unfortunately it is not possible to discern a VIO VM from a regular VMware VM through DMI information, as can be seen from the paste in LP: #1669875. I therefore propose to extend the DMI check by including `VMware Virtual Platform` in the `VALID_DMI_PRODUCT_NAMES` variable.
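
A minimal sketch of what the proposed change amounts to, assuming the check is a simple membership test on the DMI product name. This is not cloud-init's actual code: the helper names, the sysfs path, and the list contents are illustrative assumptions, and only `VALID_DMI_PRODUCT_NAMES` and `VMware Virtual Platform` come from the discussion above.

```python
from pathlib import Path

# Illustrative values only; the authoritative list lives in
# cloudinit/sources/DataSourceOpenStack.py and may differ.
VALID_DMI_PRODUCT_NAMES = ["OpenStack Nova", "OpenStack Compute"]

# The proposal: also accept the generic VMware product name.
PROPOSED_DMI_PRODUCT_NAMES = VALID_DMI_PRODUCT_NAMES + ["VMware Virtual Platform"]


def read_dmi_product_name(path="/sys/class/dmi/id/product_name"):
    """Return the DMI product name from sysfs, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def looks_like_openstack(product_name, valid_names):
    """Hypothetical stand-in for the DMI check: a plain membership test."""
    return product_name in valid_names
```

With the current list a VIO guest reporting `VMware Virtual Platform` fails the check. With the extended list it passes, but so does every other VMware guest, which is the trade-off this proposal accepts.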

I do not currently have access to other OpenStack (cloud) platforms to verify whether they are being detected by cloud-init, so the scope of this bug is limited to OpenStack on VMware.

[1] https://www.openstack.org/marketplace/distros/
[2] https://www.openstack.org/marketplace/remotely-managed-private-clouds/
[3] https://www.openstack.org/marketplace/hosted-private-clouds/
[4] https://www.openstack.org/marketplace/public-clouds/
[5] https://www.vmware.com/products/openstack.html

Revision history for this message
Scott Moser (smoser) wrote :

@Mark,

We believe the decision to limit cloud-init's "poll"ing of network based
metadata services is the right decision. That means that cloud-init will
only reach out to a metadata service if:
 a.) it has positive identification that the metadata service will exist.
 b.) it has been specifically told (configured) to do so.

I gave an example of how to do 'b' in my comment above.

The suggestion of adding 'VMware Virtual Platform' to the list of
VALID_DMI_PRODUCT_NAMES is unfortunately not a complete solution.
Running on VMware does not indicate that an OpenStack metadata service
will be present. The system could be running in another cloud platform
(CloudStack or OpenNebula or even AWS at this point). At best that
provides a "maybe". In those other scenarios, attempting to reach
http://169.254.169.254/openstack may have negative side effects.

We need a way that we can positively identify that we are running in
OpenStack. That is why I opened the bug you referenced. This is really
just good policy. Platforms *should* identify themselves to software
that is running on them.

Thoughts?
Scott

Revision history for this message
Mark (praseodym) wrote :

From our experiments, doing 'b' is currently not possible. The first thing the OpenStack datasource does is check the DMI data [1] and if it does not match an expected platform, the metadata service is never probed. This leaves cloud-init without any valid data source.

Ironically, the EC2 datasource works in a non-strict mode by default, which means that it _will_ probe the metadata service. Because OpenStack provides an EC2-compatible endpoint, it will load data from the metadata service successfully. It does, however, print a warning about OpenStack (on VMware) being an unrecognised platform.

As for identifying OpenStack platforms: I think this is an impossible task with the large variety of distros out there, many of them built on older OpenStack releases. If platform identification were to be made a requirement in the next OpenStack release, it will take years before every distro and deployment out there is updated to conform to this (if ever). The next best thing I can come up with is adding a non-strict mode like in the EC2 datasource (so that metadata is always probed), or always probing the metadata service and removing OpenStack from the default list of datasources.

[1] https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceOpenStack.py?id=f03dfdebfe700c038a90452ecc23ed8840dea7d4#n124

Revision history for this message
Scott Moser (smoser) wrote :

@Mark,

We did not intend to break your use case on 16.04.
Upstream has required strict identification of the platform for some time.
(bug 1669875 was filed ~ 18 months ago).

We do intend to support "explicit" datasource configuration via
setting datasource_list to have 1 entry (or 2 entries if the second is
'None'). I'll look to fix that for OpenStack here. bug 1787459 currently
blocks doing that correctly, but we'll get that fixed as well.
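
The "explicit" rule described above can be sketched as follows. This is only an illustration of the intended policy, not the cloud-init implementation; both function names are hypothetical, and `platform_identified` stands in for whatever positive identification (DMI or otherwise) is available.

```python
def is_explicit_datasource_list(datasource_list):
    """True if the list has one entry, or two entries with 'None' second."""
    if len(datasource_list) == 1:
        return True
    return len(datasource_list) == 2 and datasource_list[1] == "None"


def should_probe_metadata(datasource_list, platform_identified):
    """Probe if the platform was positively identified (case a), or if
    OpenStack was explicitly configured by the user (case b)."""
    explicitly_configured = (
        "OpenStack" in datasource_list
        and is_explicit_datasource_list(datasource_list)
    )
    return platform_identified or explicitly_configured
```

Under this rule `['OpenStack', 'None']` would trigger a probe on an unidentified platform, while a longer list such as `['OpenStack', 'ConfigDrive', 'None']` would not.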

> As for identifying OpenStack platforms: I think this is an impossible
> task with the large variety of distros out there, many of them built on
> older OpenStack releases. If platform identification were to be made a
> requirement in the next OpenStack release, it will take years before
> every distro and deployment out there is updated to conform to this (if
> ever).

Older OpenStack platforms will simply need some image customization
to get Ubuntu releases until they are able to identify themselves. Let's
clarify that 'older' here means older than Grizzly, which went end of
life 2014-03-29 / over 4 years ago (at least for libvirt platforms).
I realize that support on other platforms differs, but the point is that
this is not by any means a new feature of OpenStack.

> The next best thing I can come up with is adding a non-strict
> mode like in the EC2 datasource (so that metadata is always probed), or
> always probing the metadata service and removing OpenStack from the
> default list of datasources.

Do you think it is sufficient to just allow explicit customization of
the datasource? I.e.:
 echo "datasource_list: [OpenStack, None]" > /etc/cloud/cloud.cfg.d/99.cfg

?

Changed in cloud-init:
status: Incomplete → Confirmed
importance: Undecided → Medium
Changed in cloud-init (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Ben Raymond (benray12) wrote :

I appear to be running into the same issue described by Mark. Platform is OpenStack using VMware (VC + ESXi). I first noticed this after upgrading a template from Ubuntu 16.04.2 to 16.04.5 which bumped cloud-init from 17.2-35 to 18.4-0. I got the EC2 warning as well :)

I'll be happy to check the explicit option if that currently works. I did try with only three (including ConfigDrive in case we wanted to mount an ISO on occasion) but that did not work.

If https://bugs.launchpad.net/cloud-init/+bug/1669875 is fixed, would cloud-init be able to identify a VMware instance and avoid the need for an explicit datasource?

thanks!
Ben

Revision history for this message
Ben Raymond (benray12) wrote :

@Scott,

DataSource OpenStack is still being skipped with cloud-init 18.4-0ubuntu1~16.04.2 even when explicitly setting:

datasource_list: [OpenStack, None]

Shouldn't it be probing for OpenStack metadata with this setting?

Thanks for any help. Generally I would prefer this to using an old cloud-init build.

Revision history for this message
Ben Raymond (benray12) wrote :

FYI still seeing this on 18.5-21-g8ee294d5-0ubuntu1~16.04.1. Does anyone have this working with just OpenStack and None datasources?

grep -i openstack /var/log/cloud-init*
/var/log/cloud-init.log:2019-03-12 16:48:22,086 - __init__.py[DEBUG]: Looking for data source in: ['OpenStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM']
/var/log/cloud-init.log:2019-03-12 16:48:22,093 - __init__.py[DEBUG]: Searching for local data source in: ['DataSourceOpenStackLocal']
/var/log/cloud-init.log:2019-03-12 16:48:22,094 - handlers.py[DEBUG]: start: init-local/search-OpenStackLocal: searching for local data from DataSourceOpenStackLocal
/var/log/cloud-init.log:2019-03-12 16:48:22,094 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceOpenStack.DataSourceOpenStackLocal'>
/var/log/cloud-init.log:2019-03-12 16:48:22,114 - __init__.py[DEBUG]: Datasource DataSourceOpenStackLocal [net,ver=None] not updated for events: New instance first boot
/var/log/cloud-init.log:2019-03-12 16:48:22,114 - handlers.py[DEBUG]: finish: init-local/search-OpenStackLocal: SUCCESS: no local data found from DataSourceOpenStackLocal
/var/log/cloud-init.log:2019-03-12 16:48:23,478 - __init__.py[DEBUG]: Looking for data source in: ['OpenStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM', 'NETWORK']
/var/log/cloud-init.log:2019-03-12 16:48:23,483 - __init__.py[DEBUG]: Searching for network data source in: ['DataSourceOpenStack', 'DataSourceNone']
/var/log/cloud-init.log:2019-03-12 16:48:23,483 - handlers.py[DEBUG]: start: init-network/search-OpenStack: searching for network data from DataSourceOpenStack
/var/log/cloud-init.log:2019-03-12 16:48:23,483 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceOpenStack.DataSourceOpenStack'>
/var/log/cloud-init.log:2019-03-12 16:48:23,505 - __init__.py[DEBUG]: Datasource DataSourceOpenStack [net,ver=None] not updated for events: New instance first boot
/var/log/cloud-init.log:2019-03-12 16:48:23,505 - handlers.py[DEBUG]: finish: init-network/search-OpenStack: SUCCESS: no network data found from DataSourceOpenStack
/var/log/cloud-init.log:2019-03-12 16:48:23,682 - main.py[DEBUG]: used datasource 'DataSourceNone' from 'None' was in di_report's list: ['OpenStack', 'None']

and ...

curl http://169.254.169.254/openstack/
2012-08-10
2013-04-04
2013-10-17

Revision history for this message
Mark T. Voelker (mvoelker) wrote :

Is this not the same issue worked around in the fix for https://bugs.launchpad.net/cloud-init/+bug/1669875 ?

Revision history for this message
James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Confirmed → Expired