Make WaitCondition work with server replacement

Bug #1695541 reported by Zane Bitter
This bug affects 2 people
Affects: OpenStack Heat
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned

Bug Description

If you have a WaitCondition resource to check that a server has booted before continuing, it does not run again if you replace the server. This is because it already has the previous signal(s), and it can't know that the server is being replaced (there's no direct data flow relationship between them).
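For reference, a minimal sketch of the kind of template this affects. The resource names (wait_handle, wait_condition, server), the parameters and the timeout are placeholders chosen for illustration, and networking is omitted for brevity:

```yaml
heat_template_version: 2016-10-14

parameters:
  image:
    type: string
  flavor:
    type: string

resources:
  wait_handle:
    type: OS::Heat::WaitConditionHandle

  wait_condition:
    type: OS::Heat::WaitCondition
    properties:
      # Depends only on the handle, not on the server.
      handle: {get_resource: wait_handle}
      count: 1
      timeout: 600

  server:
    type: OS::Nova::Server
    properties:
      image: {get_param: image}
      flavor: {get_param: flavor}
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/sh
            # ... boot-time setup goes here ...
            wc_notify --data-binary '{"status": "SUCCESS"}'
          params:
            # Expands to a curl command that signals the handle.
            wc_notify: {get_attr: [wait_handle, curl_cli]}
```

Both wait_condition and server reference wait_handle, but nothing links wait_condition to server. When the server is replaced, the handle still holds the old signal, so the condition is already satisfied.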

The OS::Heat::UpdateWaitConditionHandle resource exists supposedly to solve this problem. However, it replaces the handle (which is where the data from incoming signals is stored) on *every* stack update. If the handle is referenced in the server's user_data, which would be typical, then replacement of the handle would also trigger replacement of the server - on every stack update.

The true solution to this problem is to use Heat SoftwareDeployments instead. But not all users are set up to do this.

An ideal alternative might be to add a 'server' property to OS::Heat::WaitCondition, and upon changes to that property start counting again from scratch. The user could then use get_resource to reference the server UUID there, with the added bonus that it creates the dependency relationship you want anyway without having to add a depends_on. There are two difficulties with this approach, however: one is that we'd have to implement ways of clearing the existing data for all of the different signal types that WaitConditionHandle supports. The other is that it doesn't really work well when waiting on multiple servers (which is supported via the 'count' property), if not all of them have been replaced - though this doesn't seem to be a common use case.
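A sketch of how that might look in a template. The 'server' property below is hypothetical: it is only the proposal from this report and does not exist in Heat. The resource names wait_handle and server are likewise assumed for illustration:

```yaml
resources:
  wait_condition:
    type: OS::Heat::WaitCondition
    properties:
      handle: {get_resource: wait_handle}
      count: 1
      timeout: 600
      # HYPOTHETICAL property proposed in this report, not a real one.
      # A change in the referenced server's ID would reset the signal
      # count, and get_resource would also make the condition depend
      # on the server without an explicit depends_on.
      server: {get_resource: server}
```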

An easier solution would be to add a 'uniqueness_key' (or something) property to OS::Heat::WaitConditionHandle. If the value of the property ever changed, then the handle would be replaced. Note that this key *cannot* be a reference to the server, as that would create a circular dependency. However, it would allow the user to force a replacement of the handle (and hence the server, if it is referenced in the user_data) by e.g. modifying an input parameter value. On the other hand, it wouldn't handle other reasons for the server's replacement, such as it being in a CHECK_FAILED state.
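A sketch of this second option. The 'uniqueness_key' property is hypothetical (it is only the name floated above and does not exist in Heat), and the redeploy_token parameter is an assumed name used for illustration:

```yaml
parameters:
  redeploy_token:
    type: string
    default: "1"

resources:
  wait_handle:
    type: OS::Heat::WaitConditionHandle
    properties:
      # HYPOTHETICAL property proposed in this report, not a real one.
      # Changing the redeploy_token parameter on a stack update would
      # replace the handle, and with it any server whose user_data
      # references the handle.
      uniqueness_key: {get_param: redeploy_token}
```

The key has to come from something like a parameter rather than from the server itself, since the server already depends on the handle.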

Tags: spec-lite
Rico Lin (rico-lin)
Changed in heat:
milestone: none → pike-3
milestone: pike-3 → next
ioggstream (rpolli) wrote :

Adding `user_data_update_policy: IGNORE` to servers will avoid replacement, though it's not what people may generally want.
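A sketch of where that setting goes, assuming a server resource named server with image and flavor parameters (placeholder names; the user_data content is elided):

```yaml
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: {get_param: image}
      flavor: {get_param: flavor}
      # With IGNORE, a change to user_data does not replace the server.
      user_data_update_policy: IGNORE
```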

Thomas Herve (therve)
Changed in heat:
milestone: next → queens-1
Zane Bitter (zaneb) wrote :

A change to the user_data is only one of many reasons that a server resource may be replaced.

Rico Lin (rico-lin)
Changed in heat:
milestone: queens-1 → queens-2
Rico Lin (rico-lin)
Changed in heat:
status: New → Confirmed
Rico Lin (rico-lin)
Changed in heat:
milestone: queens-2 → queens-3
Rico Lin (rico-lin)
Changed in heat:
milestone: queens-3 → rocky-1
Zane Bitter (zaneb)
Changed in heat:
milestone: rocky-1 → next