OpenStack Heat

Update/delete hangs if previous update times out

Bug #1721654 reported by Zane Bitter on 2017-10-05

Affects         Status        Importance  Assigned to  Milestone
OpenStack Heat  Fix Released  High        Zane Bitter  OpenStack Heat queens-1

Bug Description

If convergence is enabled and the following sequence of events occurs:

1) The user initiates a stack update (or create), and one or more resources take a long time to complete.
2) The user initiates a second stack update before those resources have completed.
3) Any of those resources eventually times out because it was still IN_PROGRESS when the stack timeout from the original update expired.

then the *second* update never completes; it hangs IN_PROGRESS forever.

When the initial update releases the lock on the resource, it should retrigger the latest traversal if that resource is ready, but it does not do so when the resource times out.
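
The sequence above can be modelled with a small toy sketch. This is not Heat's code: `ToyResource`, `ToyStack`, `finish_check` and the `retrigger_on_timeout` flag are hypothetical names invented here, and the sketch only assumes the behaviour described in this report (a lock per resource, and a check for a newer traversal when the lock is released).

```python
class ToyResource:
    """Toy stand-in for a locked resource; not Heat's real data model."""

    def __init__(self, name):
        self.name = name
        self.locked_by = None   # ID of the traversal holding the lock, if any
        self.status = "INIT"


class ToyStack:
    """Tracks the latest traversal and records which resources were retriggered."""

    def __init__(self, current_traversal):
        self.current_traversal = current_traversal
        self.retriggered = []

    def retrigger(self, resource):
        self.retriggered.append((self.current_traversal, resource.name))


def finish_check(stack, resource, traversal, timed_out, retrigger_on_timeout):
    """Finish processing `resource` for `traversal` and release its lock."""
    resource.locked_by = traversal
    try:
        if timed_out:
            resource.status = "FAILED"
            if not retrigger_on_timeout:
                # Behaviour described in the report: the timeout path returns
                # without checking for a newer traversal.
                return
        else:
            resource.status = "COMPLETE"
        if stack.current_traversal != traversal:
            # A newer update started while we held the lock; hand over to it.
            stack.retrigger(resource)
    finally:
        resource.locked_by = None


# The second update ("update-2") is already current when "update-1" times out.
stack = ToyStack(current_traversal="update-2")
server = ToyResource("server")

finish_check(stack, server, "update-1", timed_out=True, retrigger_on_timeout=False)
hung = list(stack.retriggered)     # nothing recorded: update-2 is never notified

finish_check(stack, server, "update-1", timed_out=True, retrigger_on_timeout=True)
fixed = list(stack.retriggered)    # update-2 is told to process the resource
```

In the first call nothing notifies the newer traversal, which corresponds to the hang; in the second call the timeout path falls through to the same check as the success path.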

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2017-10-05: Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/509921

Changed in heat:
status:	Triaged → In Progress

Zane Bitter (zaneb) on 2017-10-06

summary:

- Update/delete hangs if previous update times out or is cancelled
+ Update/delete hangs if previous update times out

OpenStack Infra (hudson-openstack) wrote on 2017-10-18: Related fix proposed to heat (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/513181

OpenStack Infra (hudson-openstack) wrote on 2017-10-23: Related fix merged to heat (master)

Reviewed: https://review.openstack.org/513181
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=6a9672a26443c901f4a465c86992ecece3f73bbd
Submitter: Zuul
Branch: master

commit 6a9672a26443c901f4a465c86992ecece3f73bbd
Author: Zane Bitter <email address hidden>
Date: Wed Oct 18 16:46:39 2017 -0400

    Make scheduler.Timeout exception hashable

    The python standard library in Python 3.6.3 and earlier has a bug with
    handling unhashable exceptions: https://bugs.python.org/issue28603

    Although oslo_log will catch the error, make scheduler.Timeout hashable so
    that all exceptions will be printable.

    Prior to 640abe0c12e63c207fcf67592838f112a29f5b43 we just used __cmp__(),
    but that isn't used in Python 3. Defining __eq__(), which is required for
    the total_ordering decorator, makes the class unhashable in Python 3.

    Change-Id: Idde65b2d41490ab8318b5a8b95ea74e9b96b4e5c
    Related-Bug: #1724366
    Related-Bug: #1721654
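
The Python 3 behaviour mentioned in the commit message above can be reproduced with a short sketch. The class names and the `deadline` attribute are hypothetical and do not mirror `scheduler.Timeout`; the `__hash__` shown is just one way to restore hashability consistently with `__eq__`, and the merged change may do it differently.

```python
import functools


@functools.total_ordering
class UnhashableTimeout(Exception):
    """Defines __eq__ but not __hash__, so Python 3 sets __hash__ to None."""

    def __init__(self, deadline):
        super().__init__(deadline)
        self.deadline = deadline

    def __eq__(self, other):
        if not isinstance(other, UnhashableTimeout):
            return NotImplemented
        return self.deadline == other.deadline

    def __lt__(self, other):
        if not isinstance(other, UnhashableTimeout):
            return NotImplemented
        return self.deadline < other.deadline


class HashableTimeout(UnhashableTimeout):
    """Restores hashability with a hash that agrees with __eq__."""

    def __hash__(self):
        return hash(self.deadline)


try:
    hash(UnhashableTimeout(1.0))
    unhashable_raised = False
except TypeError:
    # Raised because the class's __hash__ was implicitly set to None.
    unhashable_raised = True

hashable_ok = hash(HashableTimeout(1.0)) == hash(HashableTimeout(1.0))
```

Instances of the first class cannot be placed in a set or used as dict keys, while instances of the second can and still compare by value.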

Changed in heat:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2017-10-23: Fix merged to heat (master)

Reviewed: https://review.openstack.org/509921
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=bb330ae1a6c907ec2a8b8c198b7268d0abec3b43
Submitter: Zuul
Branch: master

commit bb330ae1a6c907ec2a8b8c198b7268d0abec3b43
Author: Zane Bitter <email address hidden>
Date: Wed Oct 18 16:46:39 2017 -0400

    Retrigger new traversals after resource timeout

    If a resource times out, we still need to check whether there is a new
    traversal underway that we need to retrigger, otherwise the new traversal
    will never complete.

    Change-Id: I4ac7ac88797b7fb14046b5668649b2277ee55517
    Closes-Bug: #1721654

OpenStack Infra (hudson-openstack) wrote on 2017-10-27: Fix included in openstack/heat 10.0.0.0b1

This issue was fixed in the openstack/heat 10.0.0.0b1 development milestone.
