Ubuntu cloud-init expects trailing dot on GCE metadata FQDN

Bug #1581200 reported by Max Illfelder on 2016-05-12
This bug affects 1 person
Affects              Importance  Assigned to
cloud-init (Ubuntu)  Medium      Philip Roche
Trusty               Medium      Philip Roche
Xenial               Medium      Scott Moser

Bug Description

[Impact]

 * If bind9 is installed and configured as a local DNS server on an Ubuntu instance on GCE, then on every reboot cloud-init fails to retrieve instance metadata from GCE because the lookup hostname does not resolve.

 * Backporting this fix is necessary because instances with bind9 installed can no longer take advantage of cloud-init.

 * The patch fixes this bug by updating the hostname used in the metadata lookup to one that is included in /etc/hosts. As such it will resolve, even if bind9 hasn't started yet.

[Test Case]

#launch an instance of ubuntu 14.04 on GCE
sudo apt-get update
sudo apt-get install -y bind9
#Add the Google DNS servers as global forwarders and configure bind9 for the GCE environment
cat << EOF | sudo tee /etc/bind/named.conf.options
options {
    directory "/var/cache/bind";

    forwarders {
        169.254.169.254;
    };

    recursion yes;
    dnssec-validation no;
    dnssec-enable no;
    auth-nxdomain no;
    listen-on { 127.0.0.1; };
};
EOF
sudo service bind9 restart
#setup your instance to use bind9 instead of the Google server
echo "supersede domain-name-servers 127.0.0.1;" | sudo tee -a /etc/dhcp/dhclient.conf
sudo dhclient -pf /run/dhclient.eth0.pid -x
sudo dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
if grep -q "nameserver 127.0.0.1" "/etc/resolv.conf"; then echo "resolv.conf has been updated"; fi
if host -t A metadata.google.internal | grep '169.254.169.254' > /dev/null; then echo "host lookup succeeded as expected"; fi
sudo service bind9 stop
if host -t A metadata.google.internal | grep 'connection timed out' > /dev/null; then echo "host lookup failed as expected"; fi

#Now reboot the instance
sudo reboot
#Once rebooted run the following
if grep -q "http://metadata.google.internal./computeMetadata/v1/ is not resolvable" "/var/log/cloud-init.log"; then echo "cloud-init failed to lookup metadata as expected"; else echo "cloud-init did _not_ fail to lookup metadata as expected"; fi

A patched Ubuntu 14.04 image has been built. To test the patch, launch an instance from that image, repeat the steps above, and run the check below after the reboot.
#launch a patched instance
gcloud compute instances create ubuntu1404-patched-cloudinit --image daily-ubuntu-proche-cloudinit-1404-trusty-v20160627 --image-project ubuntu-os-cloud-devel

#on a patched instance run the following after reboot
if grep -q "http://metadata.google.internal/computeMetadata/v1/ is not resolvable" "/var/log/cloud-init.log"; then echo "cloud-init failed to retrieve metadata"; else echo "cloud-init did successfully retrieve metadata as expected"; fi
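The two log checks above differ only in whether the URL carries the trailing dot. As a minimal sketch, the same check can be written once so that it matches either form; the function name `metadata_lookup_failed` is hypothetical, and the only log text assumed is the "is not resolvable" message quoted in the grep commands.

```python
import re

# Matches the failure message for the metadata URL with or without the
# trailing dot after "internal". The message text is taken from the grep
# patterns in the test case; nothing else about the log format is assumed.
NOT_RESOLVABLE = re.compile(
    r"http://metadata\.google\.internal\.?/computeMetadata/v1/ is not resolvable"
)


def metadata_lookup_failed(log_text):
    """Return True if the log text records a failed metadata lookup."""
    return NOT_RESOLVABLE.search(log_text) is not None


if __name__ == "__main__":
    # Illustrative strings only, not real log output.
    unpatched = "http://metadata.google.internal./computeMetadata/v1/ is not resolvable"
    print(metadata_lookup_failed(unpatched))
    print(metadata_lookup_failed("no failure message here"))
```

On an instance, the text passed in would be the contents of /var/log/cloud-init.log.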

[Regression Potential]

 * GCE are requesting this change.
 * The reported issue only affects GCE users and only a small set of those users will be using a local DNS server.
 * The change is a single-character change that has been tested, so it has limited regression potential.

[Original Bug Report]

cloud-init hostname breaks because /etc/hosts does not have the trailing dot on metadata FQDN.

Background:
On Ubuntu, cloud-init sets the hostname using our metadata service. To do this, it hits "metadata.google.internal." (note trailing dot) via HTTP.

We have entries in /etc/hosts for the metadata service to ensure that we can access it at boot time (if DNS is not yet up) as we have other init scripts which block bootup when metadata cannot be reached. However, these /etc/hosts entries only have "metadata.google.internal" (no trailing dot) entries.

When a customer runs their own bind9 daemon, it starts *after* cloud-init, meaning that cloud-init must use /etc/hosts to find the metadata service. When it cannot, it incorrectly sets the hostname to "$hostname.localdomain" instead of just $hostname.
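The mismatch described above can be illustrated with a small sketch. It assumes, as this report describes, that names in /etc/hosts are matched literally, so a name with a trailing dot does not match an entry without one. The helper names `parse_hosts` and `lookup` are hypothetical, and the sample entry is illustrative rather than copied from a GCE image.

```python
def parse_hosts(text):
    """Map each hostname and alias in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # First entry for a name wins; names compare case-insensitively.
            table.setdefault(name.lower(), address)
    return table


def lookup(table, name):
    """Literal lookup with no trailing-dot normalisation (an assumption)."""
    return table.get(name.lower())


# Illustrative entry; the real file on a GCE image may differ.
SAMPLE_HOSTS = "169.254.169.254 metadata.google.internal\n"

hosts = parse_hosts(SAMPLE_HOSTS)
print(lookup(hosts, "metadata.google.internal"))   # entry found
print(lookup(hosts, "metadata.google.internal."))  # no entry for dotted name
```

Under that assumption the dotted name can only resolve through DNS, which is not yet available when bind9 starts after cloud-init.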

Proposed fix:
Update:
http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/sources/DataSourceGCE.py

Line 28 should read:
'metadata_url': 'http://metadata.google.internal/computeMetadata/v1/'
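To make the one-character difference concrete, the following sketch extracts the hostname from the current and proposed URLs with the standard library. The constant names `OLD_URL` and `NEW_URL` and the helper `metadata_host` are hypothetical; only the two URL strings come from this report.

```python
from urllib.parse import urlsplit

# URL currently in DataSourceGCE.py (with trailing dot) and the proposed one.
OLD_URL = "http://metadata.google.internal./computeMetadata/v1/"
NEW_URL = "http://metadata.google.internal/computeMetadata/v1/"


def metadata_host(url):
    """Return the hostname component of a metadata URL."""
    return urlsplit(url).hostname


old_host = metadata_host(OLD_URL)
new_host = metadata_host(NEW_URL)

print(repr(old_host))
print(repr(new_host))
# The two hostnames differ only by the trailing dot.
print(old_host.rstrip(".") == new_host)
```

The hostname that a resolver is asked for is the one embedded in the URL, so removing the dot from `metadata_url` is what lets the lookup match the existing /etc/hosts entry.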

Dan Watkins (oddbloke) on 2016-05-16
summary: - Ubuntu cloud-init expects trailing dot on metadata FQDN
+ Ubuntu cloud-init expects trailing dot on GCE metadata FQDN
Changed in cloud-init (Ubuntu):
assignee: nobody → Dan Watkins (daniel-thewatkins)
Changed in cloud-init (Ubuntu):
assignee: Dan Watkins (daniel-thewatkins) → Philip Roche (philroche)
Scott Moser (smoser) on 2016-06-03
Changed in cloud-init (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Philip Roche (philroche) on 2016-06-13
Changed in cloud-init (Ubuntu):
status: Confirmed → Fix Committed
Scott Moser (smoser) wrote :

This commit was reverted in trunk as it broke tests.

Changed in cloud-init (Ubuntu):
status: Fix Committed → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.7~bzr1242-0ubuntu1

---------------
cloud-init (0.7.7~bzr1242-0ubuntu1) yakkety; urgency=medium

  * d/control: Build-Depends on python3-unittest2
  * New upstream snapshot.
    - DataSourceNoCloud: fix stack trace on reboot, default to dsmode=net
      (LP: #1592505)
    - support network rendering to sysconfig (for centos and RHEL)
    - fix errors reported by pylint
    - move 'main' into cloudinit.cmd for easier testing. use
      setuptools entry_points for creating executable.
    - Remove trailing dot from GCE metadata URL (LP: #1581200)
    - Change missing Cheetah log warning to debug [Andrew Jorgensen]
    - make networking config provided in system config override datasource.
      (LP: #1590104)

 -- Scott Moser <email address hidden> Thu, 16 Jun 2016 00:07:12 -0400

Changed in cloud-init (Ubuntu):
status: Confirmed → Fix Released
Philip Roche (philroche) on 2016-06-22
Changed in cloud-init (Ubuntu Trusty):
assignee: nobody → Philip Roche (philroche)
Philip Roche (philroche) wrote :

Attached patch has been tested and fixes the reported issue.

description: updated
Scott Moser (smoser) wrote :

Hello,
An SRU upload of cloud-init for 16.04 that contains a fix for this bug has been made under bug 1595302. Please track that bug if you are interested.

Philip Roche (philroche) on 2016-06-30
Changed in cloud-init (Ubuntu Trusty):
status: New → In Progress
Mathew Hodson (mhodson) on 2016-07-03
Changed in cloud-init (Ubuntu Trusty):
importance: Undecided → Medium
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → Medium
Scott Moser (smoser) wrote :

fix is now released to xenial under bug 1595302. daily cloud-images with this newer version of cloud-init should appear in the next few days. Any image with a serial number *newer* than 20160707 should have cloud-init at 0.7.7~bzr1246-0ubuntu1~16.04.1 .

Changed in cloud-init (Ubuntu Xenial):
assignee: nobody → Scott Moser (smoser)
status: New → Fix Released
Philip Roche (philroche) wrote :

smoser has included the fix for this bug in cloud-init for 16.04 but the attached lp-1581200-gce-metadatafqdn.debdiff patch is still awaiting sponsorship for inclusion in cloud-init for 14.04.

Nish Aravamudan (nacc) wrote :

Based upon c#4, I believe this is fix-released.

Philip Roche (philroche) wrote :

Hi Nish,

lp:1595302 targets Xenial and Yakkety only. The fix needs to be backported to Trusty too. Patch attached to #3 is for Trusty.

Hello Max, or anyone else affected,

Accepted cloud-init into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.5-0ubuntu1.20 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Max Illfelder (illfelder) wrote :

I looked over the code and DataSourceGCE.py still has the section:
BUILTIN_DS_CONFIG = {
    'metadata_url': 'http://metadata.google.internal./computeMetadata/v1/'
}

Where metadata.google.internal is suffixed with a trailing dot. I haven't run through the repro yet, but can you confirm that is intentional?

Philip Roche (philroche) wrote :

Hi Max, this is not intentional. I will investigate.

Philip Roche (philroche) wrote :

Hi Max,

When proposed pocket is enabled [1] then 0.7.5-0ubuntu1.20 [2] can be installed.

This does include the patch.

The cloud-init_0.7.5.orig.tar.gz source archive linked to on [3] is the source archive before patches are applied.

[1] https://wiki.ubuntu.com/Testing/EnableProposed
[2] http://archive.ubuntu.com/ubuntu/pool/main/c/cloud-init/cloud-init_0.7.5-0ubuntu1.20_all.deb
[3] https://launchpad.net/ubuntu/+source/cloud-init/0.7.5-0ubuntu1.20

Philip Roche (philroche) wrote :

I have tested the proposed package and it behaves as expected and fixes the reported issue.

tags: added: verification-done
removed: verification-needed
Philip Roche (philroche) wrote :

Apologies I forgot to include the version of cloud-init tested.

The version tested was 0.7.5-0ubuntu1.20

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.5-0ubuntu1.20

---------------
cloud-init (0.7.5-0ubuntu1.20) trusty; urgency=medium

  * GCE:
    - debian/patches/lp-1581200-gce-metadatafqdn.patch (LP: #1581200):
      Remove trailing dot in metadata.google.internal GCE metadata lookup.

 -- Phil Roche <email address hidden> Fri, 24 Jun 2016 11:43:42 +0100

Changed in cloud-init (Ubuntu Trusty):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
