Overzealous validation of images in empty ResourceGroups

Bug #1401929 reported by Tomas Sedovic on 2014-12-12
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Heat
Fix Released
High
Steven Hardy
Juno
Fix Released
Undecided
Unassigned
tripleo
High
Tomas Sedovic

Bug Description

To reproduce this, save both attached templates and do `heat stack-create -f template.yaml`.

It will error with:

ERROR: Failed to validate: Property error : myserv: image Error validating value u'unknown-image': The Image (unknown-image) could not be found.

Prior to this commit, these templates would create a stack:

https://github.com/openstack/heat/commit/c31c34f8dfd0919bf46a975701c139073115debc

This is a bit of a special case, because the validation runs on resources that will not be created at the time of stack-creation (we're creating 0 servers so it doesn't matter that the image isn't there).

This is breaking TripleO, but I am sympathetic towards both ways to think about this (validate everything to catch things quickly vs. don't validate anything that isn't actually going to be created).

I think is conceptually similar to conditionals (if/when we end up having them) so the solution there and here should be the same whatever it ends up being.

Tomas Sedovic (tsedovic) wrote :
Tomas Sedovic (tsedovic) wrote :
Steven Hardy (shardy) wrote :

The reason is the glance.image custom constraint here:

https://github.com/openstack/heat/blob/master/heat/engine/resources/server.py#L107

Probably the easiest quick fix is to skip the validation when count=0 - the behaviour introduced in the commit referenced as breaking this above is still desirable, as we'd want to fail fast on update in the event count > 0 is specified with some invalid data in the nested template.

Changed in heat:
assignee: nobody → Steven Hardy (shardy)
status: New → Confirmed
importance: Undecided → High
milestone: none → kilo-1
Tomas Sedovic (tsedovic) wrote :

That is what I would expect: running the nested stack validations is absolutely useful, but I'd like to see them skipped when the scaling group count is zero.

Fix proposed to branch: master
Review: https://review.openstack.org/141444

Changed in heat:
status: Confirmed → In Progress
James Polley (tchaypo) on 2014-12-13
Changed in tripleo:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Tomas Sedovic (tsedovic)
Clint Byrum (clint-fewbar) wrote :

IMO we should be dropping the glance image constraint. It is a massive overreach.

Steven Hardy (shardy) wrote :

> IMO we should be dropping the glance image constraint. It is a massive overreach.

Do you mean inside the resource schema? If so then I tend to agree - we should allow template authors to express those constraints on their input parameters, but having dynamic validation paths like this inside statically validated resource schema interfaces does seem like a step too far.

The patch I posted solves the issue for the specific case of count=0 for ResourceGroup, but wider discussion is probably needed before we remove all resource schema custom contraints.

Steven Hardy (shardy) wrote :

Another reason for removing the per-resource custom constraints is probably performance - e.g if you create a 1000 servers in a ResourceGroup we make at least 1000 calls to glance validating the same thing, vs (hopefully) 1 if you just validate the parameter via an explicitly defined custom contraint (or not, if you prefer not to fail fast during validation for whatever reason).

Angus Salkeld (asalkeld) on 2014-12-18
Changed in heat:
milestone: kilo-1 → kilo-2

Reviewed: https://review.openstack.org/141444
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=907c0aac79a367d84bdbc26acbf5c4475f9abab7
Submitter: Jenkins
Branch: master

commit 907c0aac79a367d84bdbc26acbf5c4475f9abab7
Author: Steven Hardy <email address hidden>
Date: Fri Dec 12 17:49:10 2014 +0000

    Disable nested validation for ResourceGroup with zero count

    Some users (TripleO specifically) want to disable features via a
    count of zero, which is a problem as we always recurse and validate
    the nested stack since c31c34f8dfd0919bf46a975701c139073115debc

    Instead, we only do validation when the count is non-zero, to
    enable, e.g default image names, to be ignored at validation
    time (as we'll never use them) instead of rejected by the nested
    schema (e.g the server.py properties schema which contains a
    custom contraint to always validate the image).

    This should still allow us to fail fast (at validation time before
    creating anything) when we're actually about to create something,
    e.g when the count is non-zero.

    Change-Id: I411ff41a9e0730e9864f5ed4ac54f1d5d0ec02d7
    Closes-Bug: #1401929

Changed in heat:
status: In Progress → Fix Committed
Zane Bitter (zaneb) on 2015-01-08
tags: added: juno-backport-potential
Zane Bitter (zaneb) wrote :

Note, the fix for this caused a regression, Bug #1404399. We need to backport *both* patches.

Clint Byrum (clint-fewbar) wrote :

I mean that it is a big overreach to force image IDs to exist at template validation time. I've said this before, but a template may actually _create_ the images for later servers, and currently we don't support that.

I do like the idea of only validating parameters quite a bit more, as a user can choose not to use the "image exists" validator and it will happily start deploying expecting that some later process will create it.

Reviewed: https://review.openstack.org/145917
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=94dec9e286a5d881bfaa18083d2406d8b18df784
Submitter: Jenkins
Branch: stable/juno

commit 94dec9e286a5d881bfaa18083d2406d8b18df784
Author: Steven Hardy <email address hidden>
Date: Fri Dec 12 17:49:10 2014 +0000

    Disable nested validation for ResourceGroup with zero count

    Some users (TripleO specifically) want to disable features via a
    count of zero, which is a problem as we always recurse and validate
    the nested stack since c31c34f8dfd0919bf46a975701c139073115debc

    Instead, we only do validation when the count is non-zero, to
    enable, e.g default image names, to be ignored at validation
    time (as we'll never use them) instead of rejected by the nested
    schema (e.g the server.py properties schema which contains a
    custom contraint to always validate the image).

    This should still allow us to fail fast (at validation time before
    creating anything) when we're actually about to create something,
    e.g when the count is non-zero.

    This patch also rolls in the fix for the regression this originally
    caused on master, Iea454f4201990f65633b644bbefe708e5a216885.

    Change-Id: I411ff41a9e0730e9864f5ed4ac54f1d5d0ec02d7
    Closes-Bug: #1401929
    Closes-Bug: #1404399
    (cherry picked from commit 907c0aac79a367d84bdbc26acbf5c4475f9abab7 and 4834443353fab57315040c9de2c7dac19e8a8b51)

Thierry Carrez (ttx) on 2015-02-05
Changed in heat:
status: Fix Committed → Fix Released
Steven Hardy (shardy) wrote :

Can the TripleO part of this be closed now that the heat patches landed?

Tomas Sedovic (tsedovic) on 2015-03-25
Changed in tripleo:
status: Confirmed → Fix Committed
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-04-30
Changed in heat:
milestone: kilo-2 → 2015.1.0
Zane Bitter (zaneb) on 2015-10-13
tags: added: in-stable-juno
removed: juno-backport-potential
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers