OpenStack Compute (nova)

Instance system metadata is sometimes overwritten by image metadata

Bug #1460079 reported by Daniel Berrange on 2015-05-29

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack Compute (nova)	Fix Released	Medium	Daniel Berrange	OpenStack Compute (nova) 12.0.0 "liberty"

Bug Description

When an instance is first created, a copy of the metadata from the image/volume is saved into the instance system metadata. This provides an accurate point-in-time record of the metadata used to configure and operate the instance, even if the metadata on the image is later changed.

For a long while though, much of the code would not use this instance system metadata, instead just fetching metadata from the image each time. This had an obvious problem in that if the image was deleted, those operations would not be able to get image metadata, even though it was recorded for posterity in the system metadata.

So two commits were made to update Nova to fetch the system metadata:

commit 8e575be75c80ea71a6ad8fb73e6ace1ed708938f
Author: Xavier Queralt <email address hidden>
Date: Mon Aug 26 22:53:03 2013 +0200

    Add methods to get image metadata from instance

    This patch adds a couple of utility functions that enclose all the logic
    for getting and parsing the image metadata stored in the instance's
    system metadata.

    First, this will try to fetch the metadata from the real image and will
    prevent it from failing if it is not available. It will be then merged
    with the image metadata stored during the instance creation.

    Related to bug #1039662

    Change-Id: I2130caf19858585571b1199e27f0a98ad5f08701

commit 4389f2292a0177c8eedc0a398ceb3c5535a9ef82
Author: Xavier Queralt <email address hidden>
Date: Mon Aug 26 22:55:46 2013 +0200

    Avoid errors on some actions when image not usable

    Using the metadata saved on instance creation, we can now get all the
    image related metadata we need from the instance itself.

    This patch replace the logic for getting the image metadata on some
    actions that shouldn't fail when the image is not accessible (create
    an snapshot, resize, migrate, rescue an instance or attach an
    interface).

    Fixes bug 1039662

Unfortunately, the compute utils get_image_metadata method was designed to first fetch the instance system metadata and then fetch the current metadata from the image (if it still exists). The system metadata fields are overwritten by those from the image.
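
The merge order described above can be illustrated with plain dictionaries. This is a minimal sketch, not Nova's actual code: the function name merge_image_metadata and the sample property values are assumptions made for illustration.

```python
def merge_image_metadata(system_metadata, current_image_metadata):
    """Hypothetical stand-in for the problematic merge order.

    Start from the metadata recorded on the instance, then let whatever
    the image currently reports overwrite it.
    """
    merged = dict(system_metadata)
    if current_image_metadata is not None:  # None models a deleted image
        merged.update(current_image_metadata)
    return merged


# Metadata recorded when the instance was first booted (sample values).
recorded = {"hw_vif_model": "virtio", "os_type": "linux"}
# The image was edited after boot (sample values).
current = {"hw_vif_model": "e1000"}

after_edit = merge_image_metadata(recorded, current)
after_delete = merge_image_metadata(recorded, None)
print(after_edit["hw_vif_model"])    # value now comes from the edited image
print(after_delete["hw_vif_model"])  # recorded value, image is gone
```

In this sketch the recorded value only survives when the image has been deleted, so the answer depends on whether the image still exists rather than on how the instance was booted.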

So, there remains a problem that many operations are going to be performing actions based on the metadata currently associated with the image, and not that associated with the instance.

By good luck, this does not currently have too many serious ill effects, but with the ever-increasing use of image metadata for tuning instance hardware configuration, this is becoming a more pressing problem.

For example, if hw_disk_bus=virtio was set when the instance was first booted, and the image was later changed to use hw_disk_bus=scsi, then logic which hotplugs disks may mistakenly attempt to hotplug a SCSI disk instead of the virtio disk type which the instance was initially booted with.

The only code which should look at the image properties is the initial boot operation. Thereafter, all code should use the recorded instance system metadata, so that it makes decisions consistent with those made when the instance was first booted.

There is an exception for the rescue operation, which by its very nature should be using the metadata from the rescue image, not the original instance system metadata, since the hardware configuration needs to match the requirements of the rescue image.
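
The rule set out in the last two paragraphs could be expressed as a single selection function. The sketch below is an illustration under stated assumptions rather than Nova code: select_image_properties, its operation names, and its arguments are all hypothetical.

```python
def select_image_properties(operation, system_metadata,
                            image_properties=None):
    """Pick the metadata source for an operation (hypothetical helper).

    "boot" and "rescue" read the properties of the image being booted;
    every other operation reads the copy recorded on the instance.
    """
    if operation in ("boot", "rescue"):
        if image_properties is None:
            raise ValueError("%s requires image properties" % operation)
        return dict(image_properties)
    return dict(system_metadata)


recorded = {"hw_disk_bus": "virtio"}    # saved at first boot
edited_image = {"hw_disk_bus": "scsi"}  # image changed afterwards
rescue_image = {"hw_disk_bus": "ide"}   # sample rescue image

# A hotplug-style operation ignores the edited image entirely.
hotplug = select_image_properties("attach_volume", recorded, edited_image)
# Rescue deliberately follows the rescue image instead.
rescue = select_image_properties("rescue", recorded, rescue_image)
print(hotplug["hw_disk_bus"], rescue["hw_disk_bus"])
```

Under these assumptions, a later change to the original image cannot influence any operation on an existing instance, and rescue remains the one case that follows a different image.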

Markus Zoeller (markus_z) (mzoeller) on 2015-06-01

tags:

added: compute

Daniel Berrange (berrange) on 2015-06-01

Changed in nova:
assignee:	nobody → Daniel Berrange (berrange)

OpenStack Infra (hudson-openstack) wrote on 2015-06-01: Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/187251

Changed in nova:
status:	New → In Progress

Jay Pipes (jaypipes) on 2015-06-05

Changed in nova:
importance:	Undecided → Medium

OpenStack Infra (hudson-openstack) wrote on 2015-06-09: Fix merged to nova (master)

Reviewed: https://review.openstack.org/187251
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=77ecc3d8c5853f5498bc6ed2d22d6ff0dd75a075
Submitter: Jenkins
Branch: master

commit 77ecc3d8c5853f5498bc6ed2d22d6ff0dd75a075
Author: Daniel P. Berrange <email address hidden>
Date: Mon Jun 1 16:18:30 2015 +0100

    compute: remove get_image_metadata method

    The get_image_metadata method has some unhelpful semantics
    where it takes the image metadata from the instance's
    system metadata record, and then overwrites it with the
    current metadata associated with the original image.

    The result is that, if the image metadata in glance was
    changed after the instance was first booted, Nova will
    end up making decisions based on image metadata that does
    not correspond to that which the instance was booted with.
    Since the image metadata controls many aspects of hardware
    configuration, this could lead to incorrect behaviour when
    modifying hardware later, eg disk/vif/pci hotplug.

    What is worse, is that some of the nova operations will
    update the instance system metadata when completed, thus
    permanently overwriting the original image metadata with
    the new data.

    Almost all code which operates against an existing instance
    is updated to use nova.utils.get_image_from_system_metadata.
    The exception is the rescue codepath, which must use the
    metadata associated with the new rescue image.

    The nova.compute.utils.get_image_metadata method can thus
    be removed from use, avoiding the problematic logic.

    Closes-bug: #1460079
    Change-Id: I35c9d26e3967e93a2c368c3f9fdc807a69816dd2

Changed in nova:
status:	In Progress → Fix Committed

Thierry Carrez (ttx) on 2015-06-24

Changed in nova:
milestone:	none → liberty-1
status:	Fix Committed → Fix Released

Thierry Carrez (ttx) on 2015-10-15

Changed in nova:
milestone:	liberty-1 → 12.0.0
