host keys never restored following metadata api outage

Bug #1553815 reported by Edward Hope-Morley
This bug affects 1 person
Affects               Status         Importance  Assigned to  Milestone
cloud-init            Fix Released   Undecided   Unassigned
cloud-init (Ubuntu)   Fix Released   Undecided   Unassigned
  Trusty              Confirmed      Undecided   Unassigned
  Wily                Won't Fix      Undecided   Unassigned
  Xenial              Fix Released   Undecided   Unassigned

Bug Description

We are running an Openstack cloud and have noticed some unexpected behaviour in our Ubuntu Trusty cloud instances created by Nova.

We have observed that if a previously initialised instance (e.g. DataSourceOpenstack has already been run) is rebooted while the metadata api is not available (i.e. 169.254.169.254 is unreachable), cloud-init will retry a few times, then switch to DataSourceNone and regenerate host keys.

    # Boot instance under normal conditions
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95
    ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log
    Generating public/private rsa key pair.

    # Stop neutron metadata api service and reboot instance (observing that host keys were regenerated)
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/iid-datasource-none
    ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log
    Generating public/private rsa key pair.
    Generating public/private rsa key pair.
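
For anyone wanting to check a fleet of instances for this condition, the grep above can be approximated with a small script. This is only a sketch: the helper names are made up for illustration, and it assumes the log path and message text shown in the transcript above.

```python
from pathlib import Path

# Message that appears once for every RSA host key generation, as seen
# in the transcript above.
MARKER = "Generating public/private rsa key pair."


def count_key_generations(log_text: str, marker: str = MARKER) -> int:
    """Count the log lines that report an RSA key pair being generated."""
    return sum(1 for line in log_text.splitlines() if line.strip() == marker)


def host_keys_regenerated(log_path: str = "/var/log/cloud-init-output.log") -> bool:
    """Return True if the log records more than one RSA key generation."""
    text = Path(log_path).read_text(errors="replace")
    return count_key_generations(text) > 1
```

A count above one only suggests that host keys were regenerated at some point; it says nothing about which datasource triggered it.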

So far so good, since we expect this behaviour, but now we reboot this instance once the metadata api is reachable again. Cloud-init rightly selects the original DataSourceOpenstack instance but does nothing, since it already ran once (and it is set to only run once). The problem here is that the original host keys are never restored, so any client connecting to that instance will have no option but to accept the new host keys along with a MITM attack warning.

    ubuntu@vm1:~$ sudo reboot
    ...
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95

Surely we could find a way for cloud-init to know that if the current DataSourceOpenstack uuid matches its previously run uuid, then it can check that the host keys are consistent with the original run. @smoser suggested in a side discussion that dmidecode info could perhaps be used, since the Openstack instance uuid can be found there:

    ubuntu@vm1:~$ sudo dmidecode -t system
    # dmidecode 2.12
    SMBIOS 2.8 present.

    Handle 0x0100, DMI type 1, 27 bytes
    System Information
     Manufacturer: OpenStack Foundation
     Product Name: OpenStack Nova
     Version: 13.0.0
     Serial Number: ba5f7371-fd4c-a25e-132f-3dd1e5b92e93
     UUID: CD535BC4-9C2F-4D31-8903-0EDE59C7EF95
     Wake-up Type: Power Switch
     SKU Number: Not Specified
     Family: Virtual Machine

    Handle 0x2000, DMI type 32, 11 bytes
    System Boot Information
     Status: No errors detected
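
To illustrate the comparison being suggested, here is a rough sketch, not what cloud-init actually does. It assumes the DMI UUID can also be read from /sys/class/dmi/id/product_uuid (normally root-only) and that the cached instance id is the basename of the /var/lib/cloud/instance symlink target, as in the readlink output earlier. All function names are hypothetical.

```python
import os


def read_dmi_uuid(path: str = "/sys/class/dmi/id/product_uuid") -> str:
    """Read the system UUID from sysfs (assumed to match dmidecode's UUID)."""
    with open(path) as f:
        return f.read().strip()


def cached_instance_id(link: str = "/var/lib/cloud/instance") -> str:
    """Return the instance id that the cloud-init instance symlink points at."""
    return os.path.basename(os.path.realpath(link))


def same_instance(dmi_uuid: str, instance_id: str) -> bool:
    """Compare the two ids case-insensitively (DMI reports upper case)."""
    return dmi_uuid.strip().lower() == instance_id.strip().lower()
```

With the values from this report, `same_instance("CD535BC4-9C2F-4D31-8903-0EDE59C7EF95", "cd535bc4-9c2f-4d31-8903-0ede59c7ef95")` is true, while comparing against `iid-datasource-none` is false.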

If cloud-init kept a copy of previous host keys prior to regenerating them, it could presumably use this info to know when to safely restore the original host keys.
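
A minimal sketch of that idea follows, assuming the keys live in /etc/ssh as ssh_host_*key* files. The backup directory and both function names are hypothetical and exist only for illustration; nothing here is taken from the cloud-init source.

```python
import glob
import os
import shutil

# Hypothetical backup location, one subdirectory per instance id.
BACKUP_ROOT = "/var/lib/cloud/host-key-backups"


def backup_host_keys(instance_id, ssh_dir="/etc/ssh", backup_root=BACKUP_ROOT):
    """Copy the current host keys aside, filed under the given instance id."""
    dest = os.path.join(backup_root, instance_id)
    os.makedirs(dest, mode=0o700, exist_ok=True)
    saved = []
    for path in sorted(glob.glob(os.path.join(ssh_dir, "ssh_host_*key*"))):
        shutil.copy2(path, dest)  # copy2 also preserves the permission bits
        saved.append(os.path.basename(path))
    return saved


def restore_host_keys(instance_id, ssh_dir="/etc/ssh", backup_root=BACKUP_ROOT):
    """Put back the keys saved for this instance id, if a backup exists."""
    src = os.path.join(backup_root, instance_id)
    if not os.path.isdir(src):
        return []  # nothing was saved for this instance; leave keys alone
    restored = []
    for path in sorted(glob.glob(os.path.join(src, "ssh_host_*key*"))):
        shutil.copy2(path, ssh_dir)
        restored.append(os.path.basename(path))
    return restored
```

The intent would be to call `backup_host_keys` with the old instance id just before regenerating, and `restore_host_keys` once the datasource reports that same id again. sshd would still need a restart to pick up the restored keys.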

Since it is not inconceivable for the metadata api to become unreachable for a brief period (perhaps during an upgrade), I think we really need to make cloud-init more tolerant of this circumstance.

Tags: sts
tags: added: sts
Scott Moser (smoser) wrote :

fixed in trunk at revno 1188.

Changed in cloud-init:
status: New → Fix Committed
Scott Moser (smoser) wrote :

Just for reference, you can get the same behavior explicitly for any datasource by setting:
  manual_cache_clean: True

this does not have to be in the image; you just have to run something like this after the system has booted:

$ echo 'manual_cache_clean: True' | sudo tee -a /etc/cloud/cloud.cfg.d/99-manual-cache.cfg

Felipe Reyes (freyes) wrote :

For the record, this was fixed in revno 1188 (upstream)[0], available since cloud-init version 0.7.7~bzr1192-0ubuntu1[1]

[0] http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/revision/1188
[1] http://bazaar.launchpad.net/~smoser/ubuntu/xenial/cloud-init/pkg/revision/452

Felipe Reyes (freyes)
Changed in cloud-init (Ubuntu Xenial):
status: New → Fix Released
Scott Moser (smoser)
Changed in cloud-init (Ubuntu):
status: New → Fix Released
Changed in cloud-init (Ubuntu Trusty):
status: New → Confirmed
Changed in cloud-init (Ubuntu Wily):
status: New → Confirmed
status: Confirmed → Won't Fix
Scott Moser (smoser) wrote :

This is fixed in cloud-init 0.7.7.

Changed in cloud-init:
status: Fix Committed → Fix Released
James Falcon (falcojr) wrote :