Alert user of Ec2 Datasource on lookalike cloud

Bug #1660385 reported by Scott Moser on 2017-01-30
This bug affects 7 people
Affects              Status        Importance  Assigned to  Milestone
cloud-init           Fix Released  Medium      Scott Moser
cloud-init (Ubuntu)  Fix Released  Medium      Unassigned
  Xenial             Fix Released  Medium      Unassigned
  Yakkety            Fix Released  Medium      Unassigned

Bug Description

=== Begin SRU Template ===
[Impact]
Opportunistic polling of the Ec2 Metadata service, which lives at
169.254.169.254, can be problematic for several reasons, including timeouts.

In this first phase of SRU, the code that has been added will be set to
a warn-only mode.

In 16.04, if cloud-init finds it is using an EC2 Metadata Service but
not running on Amazon AWS, it will warn the user.

In 16.10, it will warn the user and sleep 10 seconds to increase the
likelihood of being noticed.
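
The warn-only phase described above can be summarised as a small decision
function. This is only an illustrative sketch, not cloud-init's code: the
function name, the "AWS" platform string, and the release argument are
assumptions made for the example. The 10 second sleep and the effect of
"strict_id: false" are taken from this report.

```python
def ec2_lookalike_policy(cloud_platform, strict_id=True, release="16.04"):
    """Return (warn, sleep_seconds) for the warn-only phase.

    cloud_platform: platform name as detected; "AWS" is an assumed label
                    for genuine EC2, anything else counts as a lookalike.
    strict_id:      False models 'strict_id: false' in the Ec2 config.
    release:        "16.04" warns only, "16.10" warns and sleeps.
    """
    # Genuine EC2, or a user who silenced the check: no warning at all.
    if cloud_platform == "AWS" or not strict_id:
        return (False, 0)
    # Lookalike cloud: warn, and on 16.10 also pause so it gets noticed.
    sleep_seconds = 10 if release == "16.10" else 0
    return (True, sleep_seconds)
```

With these assumptions, a 16.10 instance on an unidentified platform yields
a warning plus a 10 second pause, and setting strict_id to false removes both.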

[Test Case]

a.) check warnings are seen on openstack configured to use ec2
 - launch instance on openstack (it will use OpenStack MD)
 - enable proposed upgrade
 - rm -Rf /var/lib/cloud /var/log/cloud-init*
 - dpkg-reconfigure cloud-init
   # select 'Ec2' and 'None' only
 - sudo reboot
 - ssh in. you should see a warning.
   The warning instructs you to silence the warning by putting
   the following in /etc/cloud/cloud.cfg.d/99-ec2-datasource.cfg. Do that.
    | datasource:
    |  Ec2:
    |   strict_id: false
 - rm -Rf /var/lib/cloud/ /var/log/cloud*
 - reboot
 - ssh in. you should not see a warning.

[Regression Potential]
There is real regression potential here. That is why we have announced
this fairly widely and also are putting this into place with warnings
only first.

After some time has passed, further SRUs will put stricter behavior
in place.

[Other Info]
We've announced this fairly widely on mailing lists
 https://lists.ubuntu.com/archives/ubuntu-devel/2017-February/039697.html
=== End SRU Template ===

Many cloud providers mimic the EC2 Metadata service [1] in order to
provide a level of EC2 compatibility for images. This is quite useful and
allows image portability.

Because this is a network-based metadata service, cloud-init
opportunistically polls an IPv4 link-local address (http://169.254.169.254)
to determine if there is metadata available. That can have negative side
effects such as timeouts.

AWS has recently begun providing a way for instances to determine if they
are running on EC2 [2].

Cloud-init will change its behavior to attempt to find the EC2 metadata
service only if it has determined itself to be running on EC2 or another
known cloud provider which provides an EC2 metadata service.
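
As a rough illustration of the identification step, the AWS page in [2]
describes checking whether the hypervisor UUID or the DMI product UUID
begins with "ec2". The sketch below assumes that convention. The helper
names are hypothetical, and a prefix match can give false positives, so
treat it as a hint rather than a definitive test.

```python
def uuid_suggests_ec2(uuid_text):
    """True if a UUID string starts with 'ec2' (case-insensitive)."""
    return uuid_text.strip().lower().startswith("ec2")


def read_first_line(path):
    """Return the first line of path, or '' if it cannot be read."""
    try:
        with open(path) as fh:
            return fh.readline()
    except OSError:
        # Missing file, or no permission (product_uuid is often root-only).
        return ""


def looks_like_ec2(paths=("/sys/hypervisor/uuid",
                          "/sys/class/dmi/id/product_uuid")):
    """Check the usual sysfs locations for an EC2-looking UUID."""
    return any(uuid_suggests_ec2(read_first_line(p)) for p in paths)
```

Only when a check like this succeeds (or another known provider is
identified) would the metadata service at 169.254.169.254 be contacted.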

For more information, please see:
  https://lists.ubuntu.com/archives/ubuntu-devel/2017-February/039697.html

--
[1] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/identify_ec2_instances.html

Scott Moser (smoser) on 2017-01-30
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → Medium
status: Confirmed → In Progress
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser) on 2017-02-23
description: updated
Scott Moser (smoser) on 2017-03-03
tags: added: dsid
Changed in cloud-init:
status: In Progress → Fix Committed
Scott Moser (smoser) on 2017-03-03
Changed in cloud-init (Ubuntu):
status: New → Fix Released
importance: Undecided → Medium
Changed in cloud-init (Ubuntu Xenial):
status: New → Confirmed
Changed in cloud-init (Ubuntu Yakkety):
status: New → Confirmed
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → Medium
Changed in cloud-init (Ubuntu Yakkety):
importance: Undecided → Medium
Scott Moser (smoser) on 2017-03-03
description: updated
Scott Moser (smoser) on 2017-03-03
description: updated

Hello Scott, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.9-48-g1c795b9-0ubuntu1~16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed
Chris Halse Rogers (raof) wrote :

Hello Scott, or anyone else affected,

Accepted cloud-init into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.9-48-g1c795b9-0ubuntu1~16.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Yakkety):
status: Confirmed → Fix Committed
Scott Moser (smoser) wrote :

$ cat /etc/cloud/build.info
build_name: server
serial: 20170303.2

# system used OpenStack datasource
$ cat /var/lib/cloud/data/result.json
{
 "v1": {
  "datasource": "DataSourceOpenStack [net,ver=2]",
  "errors": []
 }
}

$ grep Looking /var/log/cloud-init.log
2017-03-08 22:32:07,094 - __init__.py[DEBUG]: Looking for for data source in: ['NoCloud', 'ConfigDrive', 'OpenNebula', 'DigitalOcean', 'Azure', 'AltCloud', 'OVF', 'MAAS', 'GCE', 'OpenStack', 'CloudSigma', 'SmartOS', 'Ec2', 'CloudStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM']
2017-03-08 22:32:09,183 - __init__.py[DEBUG]: Looking for for data source in: ['NoCloud', 'ConfigDrive', 'OpenNebula', 'DigitalOcean', 'Azure', 'AltCloud', 'OVF', 'MAAS', 'GCE', 'OpenStack', 'CloudSigma', 'SmartOS', 'Ec2', 'CloudStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM', 'NETWORK']

## upgrade
$ rel=$(lsb_release -sc)
$ line=$(awk '$1 == "deb" && $2 ~ /ubuntu.com/ { printf("%s %s %s-proposed main universe\n", $1, $2, rel); exit(0) }; ' "rel=$rel" /etc/apt/sources.list)
$ echo "$line" | sudo tee /etc/apt/sources.list.d/proposed.list
$ sudo apt-get update -q && sudo apt-get install -q cloud-init
$ dpkg-query --show cloud-init
cloud-init 0.7.9-48-g1c795b9-0ubuntu1~16.10.1

## interactive...
$ dpkg-reconfigure cloud-init

$ cat /etc/cloud/cloud.cfg.d/90_dpkg.cfg
# to update this file, run dpkg-reconfigure cloud-init
datasource_list: [ Ec2, None ]

$ sudo rm -Rf /var/lib/cloud /var/log/cloud-init* /run/cloud-init
$ sudo reboot

### then ssh back in, you see a warning
Looks like http://paste.ubuntu.com/24142108/

Write the file as instructed
$ cat /etc/cloud/cloud.cfg.d/99-ec2-datasource.cfg
#cloud-config
datasource:
 Ec2:
  strict_id: false

# clean up and reboot again
$ mkdir old2; sudo mv /run/cloud-init/ /var/log/cloud-init* /var/lib/cloud/ old2
$ sudo reboot

## Then go back in, no warning shown due to the config.
$ grep strict_ /var/log/cloud-init.log
2017-03-08 22:45:25,174 - DataSourceEc2.py[DEBUG]: strict_mode: false, cloud_platform=Unknown
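
The result.json check used in this verification could also be scripted. The
following is a minimal sketch that assumes the file always has the "v1"
layout shown in the transcript. The function name is hypothetical.

```python
import json


def datasource_name(result_text):
    """Extract the datasource class name from result.json content.

    Given the JSON text of /var/lib/cloud/data/result.json, return the
    first word of the "datasource" value, or None if no datasource is
    recorded.
    """
    result = json.loads(result_text)
    described = result["v1"].get("datasource") or ""
    # The value looks like "DataSourceOpenStack [net,ver=2]".
    return described.split()[0] if described else None
```

Feeding it the file content shown at the top of this comment gives
"DataSourceOpenStack".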

Scott Moser (smoser) wrote :

$ cat /var/lib/cloud/data/result.json
{
 "v1": {
  "datasource": "DataSourceOpenStack [net,ver=2]",
  "errors": []
 }
}
$ cat /etc/cloud/build.info
build_name: server
serial: 20170303.1
$ lsb_release -sc
xenial

$ mkdir old; sudo mv /run/cloud-init/ /var/log/cloud-init* /var/lib/cloud/ old

$ grep Looking /var/log/cloud-init.log
2017-03-08 22:32:23,269 - __init__.py[DEBUG]: Looking for for data source in: ['NoCloud', 'ConfigDrive', 'OpenNebula', 'DigitalOcean', 'Azure', 'AltCloud', 'OVF', 'MAAS', 'GCE', 'OpenStack', 'CloudSigma', 'SmartOS', 'Ec2', 'CloudStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM']
2017-03-08 22:32:25,334 - __init__.py[DEBUG]: Looking for for data source in: ['NoCloud', 'ConfigDrive', 'OpenNebula', 'DigitalOcean', 'Azure', 'AltCloud', 'OVF', 'MAAS', 'GCE', 'OpenStack', 'CloudSigma', 'SmartOS', 'Ec2', 'CloudStack', 'None'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM', 'NETWORK']

$ rel=$(lsb_release -sc)
$ line=$(awk '$1 == "deb" && $2 ~ /ubuntu.com/ { printf("%s %s %s-proposed main universe\n", $1, $2, rel); exit(0) }; ' "rel=$rel" /etc/apt/sources.list)
$ echo "$line" | sudo tee /etc/apt/sources.list.d/proposed.list
$ sudo apt-get update -q && sudo apt-get install -q cloud-init
$ dpkg-query --show cloud-init
cloud-init 0.7.9-48-g1c795b9-0ubuntu1~16.04.1

$ sudo dpkg-reconfigure cloud-init
Leaving 'diversion of /etc/init/ureadahead.conf to /etc/init/ureadahead.conf.disabled by cloud-init'
$ cat /etc/cloud/cloud.cfg.d/90_dpkg.cfg
# to update this file, run dpkg-reconfigure cloud-init
datasource_list: [ Ec2, None ]

$ sudo rm -Rf /var/lib/cloud /var/log/cloud-init* /run/cloud-init
$ sudo reboot

## See the warning on stderr, instructing how to silence
## set that file up.
$ cat /etc/cloud/cloud.cfg.d/99-ec2-datasource.cfg
#cloud-config
datasource:
 Ec2:
  strict_id: false
$ mkdir old2; sudo mv /run/cloud-init/ /var/log/cloud-init* /var/lib/cloud/ old2

## then ssh back in, no warning shown.
$ grep strict_ /var/log/cloud-init.log
2017-03-08 22:56:03,832 - DataSourceEc2.py[DEBUG]: strict_mode: false, cloud_platform=Unknown

tags: added: verification-done-xenial verification-done-yakkety
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-48-g1c795b9-0ubuntu1~16.04.1

---------------
cloud-init (0.7.9-48-g1c795b9-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * debian/rules: install Z99-cloudinit-warnings.sh to /etc/profile.d
  * debian/patches/ds-identify-behavior-xenial.patch: adjust default
    behavior of ds-identify for SRU (LP: #1669675, #1660385).
  * New upstream snapshot.
    - Support warning if the used datasource is not in ds-identify's list
      (LP: #1669675).
    - DatasourceEc2: add warning message when not on AWS. (LP: #1660385)
    - Z99-cloudinit-warnings: Add profile.d script for showing warnings on
    - Z99-cloud-locale-test.sh: convert tabs to spaces, remove unneccesary
      execute bit in permissions.
    - (RedHat) net: correct errors in cloudinit/net/sysconfig.py
      [Lars Kellogg-Stedman]
    - ec2_utils: fix MetadataLeafDecoder that returned bytes on empty
    - Fix eni rendering of multiple IPs per interface [Ryan Harper]
      (LP: #1657940)
    - Add 3 ecdsa-sha2-nistp* ssh key types now that they are standardized
      [Lars Kellogg-Stedman]
    - EC2: Do not cache security credentials on disk [Andrew Jorgensen]
      (LP: #1638312)
    - OpenStack: Use timeout and retries from config in get_data.
      [Lars Kellogg-Stedman] (LP: #1657130)
    - Fixed Misc issues related to VMware customization. [Sankar Tanguturi]
    - (RedHat) Use dnf instead of yum when available [Lars Kellogg-Stedman]
    - Get early logging logged, including failures of cmdline url.
    - test / doc / build environment changes
      - Remove style checking during build and add latest style checks to
        tox [Joshua Powers]
      - code-style: make master pass pycodestyle (2.3.1) cleanly, currently
        [Joshua Powers]
      - Fix small typo and change iso-filename for consistency
      - tools/mock-meta: support python2 or python3 and ipv6 in both.
      - tests: remove executable bit on test_net, so it runs, and fix it.
      - tests: No longer monkey patch httpretty for python 3.4.2
      - reset httppretty for each test [Lars Kellogg-Stedman]
      - build: fix running Make on a branch with tags other than master
      - doc: Fix typos and clarify some aspects of the part-handler
        [Erik M. Bray]
      - doc: add some documentation on OpenStack datasource.
      - Fix minor docs typo: perserve > preserve [Jeremy Bicha]
      - validate-yaml: use python rather than explicitly python3

 -- Scott Moser <email address hidden> Mon, 06 Mar 2017 16:34:10 -0500

Changed in cloud-init (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.9-48-g1c795b9-0ubuntu1~16.10.1

---------------
cloud-init (0.7.9-48-g1c795b9-0ubuntu1~16.10.1) yakkety; urgency=medium

  * debian/rules: install Z99-cloudinit-warnings.sh to /etc/profile.d
  * debian/patches/ds-identify-behavior-yakkety.patch: adjust default
    behavior of ds-identify for SRU (LP: #1669675, #1660385).
  * New upstream snapshot.
    - Support warning if the used datasource is not in ds-identify's list
      (LP: #1669675).
    - DatasourceEc2: add warning message when not on AWS. (LP: #1660385)
    - Z99-cloudinit-warnings: Add profile.d script for showing warnings on
    - Z99-cloud-locale-test.sh: convert tabs to spaces, remove unneccesary
      execute bit in permissions.
    - (RedHat) net: correct errors in cloudinit/net/sysconfig.py
      [Lars Kellogg-Stedman]
    - ec2_utils: fix MetadataLeafDecoder that returned bytes on empty
    - Fix eni rendering of multiple IPs per interface [Ryan Harper]
      (LP: #1657940)
    - Add 3 ecdsa-sha2-nistp* ssh key types now that they are standardized
      [Lars Kellogg-Stedman]
    - EC2: Do not cache security credentials on disk [Andrew Jorgensen]
      (LP: #1638312)
    - OpenStack: Use timeout and retries from config in get_data.
      [Lars Kellogg-Stedman] (LP: #1657130)
    - Fixed Misc issues related to VMware customization. [Sankar Tanguturi]
    - (RedHat) Use dnf instead of yum when available [Lars Kellogg-Stedman]
    - Get early logging logged, including failures of cmdline url.
    - test / doc / build environment changes
      - Remove style checking during build and add latest style checks to
        tox [Joshua Powers]
      - code-style: make master pass pycodestyle (2.3.1) cleanly, currently
        [Joshua Powers]
      - Fix small typo and change iso-filename for consistency
      - tools/mock-meta: support python2 or python3 and ipv6 in both.
      - tests: remove executable bit on test_net, so it runs, and fix it.
      - tests: No longer monkey patch httpretty for python 3.4.2
      - reset httppretty for each test [Lars Kellogg-Stedman]
      - build: fix running Make on a branch with tags other than master
      - doc: Fix typos and clarify some aspects of the part-handler
        [Erik M. Bray]
      - doc: add some documentation on OpenStack datasource.
      - Fix minor docs typo: perserve > preserve [Jeremy Bicha]
      - validate-yaml: use python rather than explicitly python3

 -- Scott Moser <email address hidden> Mon, 06 Mar 2017 16:37:28 -0500

Changed in cloud-init (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Simon Leinen (simon-leinen) wrote :

This "bug fix" is a huge regression for me.

We run an OpenStack-based cloud and provide (among others) Ubuntu-based images on it, which we regularly refresh. Since the latest refresh, our users get these warnings on every login. The warnings are completely unwarranted, and I don't understand the reasoning behind them:

Like most OpenStack clouds, we do provide EC2 metadata over HTTP on the well-known link-local address 169.254.169.254. We also provide metadata in OpenStack's own format, also via 169.254.169.254. Instances can detect that they are running on OpenStack based on DMI values.
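
A DMI-based check of this kind might look like the sketch below. The two
product names are the ones quoted in the maintainer's reply further down
this thread. The function names are hypothetical, and real detection may
consult other DMI fields as well.

```python
# Product names reported via DMI/SMBIOS, as quoted later in this thread.
OPENSTACK_PRODUCT_NAMES = {"OpenStack Nova", "OpenStack Compute"}


def is_openstack_product(product_name):
    """True if a DMI product_name value identifies an OpenStack guest."""
    return product_name.strip() in OPENSTACK_PRODUCT_NAMES


def running_on_openstack(path="/sys/class/dmi/id/product_name"):
    """Read the DMI product name from sysfs and test it."""
    try:
        with open(path) as fh:
            return is_openstack_product(fh.read())
    except OSError:
        # No DMI information available (for example, not a VM).
        return False
```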

Now cloud-init is trying to be clever about which kind of cloud it is running on, and use suitable metadata depending on that. So far, so good!

On our cloud it should be able to detect (via DMI) that it is running on OpenStack. The reasoning explained in the bug description suggests that cloud-init only wants to use EC2 metadata when running on EC2. (I think it could also use EC2 metadata under OpenStack, but be that as it may.)

But shouldn't cloud-init then just use the OpenStack metadata service (which we also provide)? Complaining that the EC2 metadata service exists as well seems pointless to me.

Can someone explain the logic of the current behavior under OpenStack to me? Because I don't understand it.

Simon Leinen (simon-leinen) wrote :

Following up on my last comment: I now understand why cloud-init on our Ubuntu OpenStack images tries to use EC2 metadata rather than OpenStack metadata; the diskimage-builder (DIB) tool puts the following in /etc/cloud/cloud.cfg.d/91-dib-cloud-init-datasource.cfg:

  datasource_list: [ Ec2, None ]

When I add "OpenStack, " in front of that list and reboot, then the warning message disappears (but the datasource change also causes the SSH host key to change, which is a bit annoying).

I'll raise this with the upstream diskimage-builder project. I assume that they prefer the EC2 datasource for a reason, so I'll suggest adding the configuration with the "strict_id: false" attribute to the generated images.

Simon Leinen (simon-leinen) wrote :

Bug created at the diskimage-builder project: https://bugs.launchpad.net/diskimage-builder/+bug/1683038

For the record, I still find these warnings kind of pointless. Can you consider disabling them for OpenStack clouds? The OpenStack metadata service supports the Ec2 conventions, so it seems safe to assume that Ec2 works under OpenStack.

Florian Haas (fghaas) wrote :

Related: bug 1686538.

Vladimir Pouzanov (farcaller) wrote :

This is a pretty big issue for those of us not running inside any cloud environment, but still relying on non-intrusive ways to provision VMs.

I know that NoCloud is the expected datasource for that, but Ec2 was always a hassle-free option. It is trivial to spin up an Ec2-compatible metadata server, and then the only thing you need is to boot a vanilla Ubuntu image with no strings attached; it will find the Ec2 metadata and just work.

Compare this to NoCloud:

 - pass a kernel command line option, which requires booting qemu with a custom kernel from outside the Ubuntu guest image and is troublesome

or

- add config files into the guest's /var/lib/cloud/seed/nocloud, which requires mounting the image on the host and tinkering with it (and as soon as it's mounted, there's little reason to use cloud-init at all, TBH)

or

- add a virtual cd image with cloud metadata, which requires provisioning those images and some (at times) non-trivial bookkeeping, e.g. when you migrate a vm to a different host (two machines are still a NoCloud).

If you are going this route, I would strongly suggest considering an option where NoCloud can be pointed at a network server via, e.g., DMI strings, which are trivially configurable from qemu / libvirt.

Nick (n6ck) wrote :

This is also a pretty big issue for us, as we are also running an EC2-compatible metadata server ourselves. Currently we can suppress the warnings, but it looks like in Ubuntu 16.10 there will even be a timeout included. Is there any option which we can set to stay with the current behaviour (no timeout, no warning, just use the EC2 metadata service)?

David Laube (dlaube-w) wrote :

We are in the same situation as quite a few others who have already commented on this bug report. To echo Nick (n6ck)'s comment above, we are also running an EC2-compatible metadata server. Although we fully expect to add our own cloud-init datasource provider, it would be nice to know whether the EC2 "strict_id: false" workaround will be here to stay for a while.

Scott Moser (smoser) wrote :

Wow, sorry for being delinquent in seeing these. Simon, Vladimir, Nick, and David, thank you for your input.

Cloud-init is making this change with the goal of improving both performance and user experience (for example, avoiding http traffic to a firewalled-off 169.254.169.254 that then times out).

I think your concerns mainly fall into a few different buckets.
a.) (Simon) configuration of: datasource_list: [ Ec2, None ]
This should be working correctly now. If cloud-init sees only one entry (or
one entry and 'None') then it will just select that datasource and go on. If
it is not doing this, please file a bug and we'll get that fixed. Your
'reboot' issue is a result of the Ec2 and OpenStack metadata services having a
different instance-id, so cloud-init sees this as a new instance.

b.) RDO provisioned OpenStack nova systems were not correctly identified.
This is fixed under bug 1675349. RDO provisioned nova systems changed
the product_name exposed through smbios to be 'OpenStack Compute' rather
than what upstream has ('OpenStack Nova').

c.) Non AWS desired use of EC2 metadata service.
This can be fixed in one of 3 ways
 1.) file a bug, tell me how to identify your platform, and we'll make
 cloud-init identify it as legitimate use of Ec2 metadata. I did this
 for brightbox at bug 1661693. Please collect some information on
 the system like an example qemu command line, and the output of
 sudo sh -c 'cd /sys/class/dmi/id && grep -r . *'

 Nick, David, this speaks directly to you. I'm perfectly willing to
 support your cloud, just tell me how to identify it.

 2.) change your launching to make the system look like it *is* in AWS.
 http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/identify_ec2_instances.html . My comment #2 in bug 1661693 shows how to do this in libvirt or qemu.

 3.) cloud-init could offer another supported way to re-use the EC2 metadata service.
 cloud-init will have to identify this *somehow* so you'll likely have
 to change your qemu/libvirt invocation, but

 Vladimir, your idea of NoCloud change here is a good one, and I'm open to
 that. I've opened bug 1691772 to address that.
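
The rule in point a.) above can be illustrated with a short sketch. It is
an assumption-laden illustration rather than cloud-init's actual code, and
the function name is hypothetical: when the configured list holds exactly
one datasource besides 'None', that datasource is selected without any
further platform checks.

```python
def forced_datasource(datasource_list):
    """Return the only real datasource in the list, or None.

    A list such as ['Ec2', 'None'] or ['Ec2'] names a single candidate,
    so it is returned directly. Longer lists return None, meaning that
    normal platform identification is still needed.
    """
    candidates = [name for name in datasource_list if name != "None"]
    if len(candidates) == 1:
        return candidates[0]
    return None
```

Under this rule, the diskimage-builder setting "datasource_list: [ Ec2, None ]"
resolves straight to Ec2, while a list starting with "OpenStack, Ec2" does not.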

Please feel free to reach out to me in Freenode IRC (smoser in #cloud-init).

Scott Moser (smoser) wrote :

This is believed to be fixed in 17.1.

Changed in cloud-init:
status: Fix Committed → Fix Released