Convergence: resource is not completed when worker dies

Bug #1501161 reported by Anant Patil
This bug affects 2 people
Affects: OpenStack Heat
Status: Fix Released
Importance: High
Assigned to: Anant Patil

Bug Description

When a worker provisioning a resource dies and another worker picks the resource up, the new worker should run check_{action}_complete for the in-progress resource. Currently, only update is called on the existing in-progress resource, and it is a no-op: the needs-update check finds no changes in the resource definition and returns. This needs to be handled by running check_{action}_complete again.

The change will be in addition to https://review.openstack.org/#/c/224402/.
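To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not Heat's code: the `Resource` class, its `polls_left` field, and the `resume` helper are hypothetical stand-ins, assumed only to show why an update with an unchanged definition leaves the resource IN_PROGRESS, and how re-running check_{action}_complete could finish it.

```python
from dataclasses import dataclass

IN_PROGRESS, COMPLETE = "IN_PROGRESS", "COMPLETE"


@dataclass
class Resource:
    """Toy stand-in for a resource; not Heat's real class."""
    action: str
    status: str
    definition: dict
    polls_left: int = 1  # hypothetical: polls the backend still needs

    def needs_update(self, new_definition):
        # True only when the definition actually changed.
        return new_definition != self.definition

    def update(self, new_definition):
        # Mirrors the reported problem: with an unchanged definition
        # this returns early and the status stays IN_PROGRESS.
        if not self.needs_update(new_definition):
            return
        self.definition = new_definition

    def check_create_complete(self):
        # Pretend to poll the backend; done once the counter runs out.
        self.polls_left -= 1
        return self.polls_left <= 0


def resume(resource, new_definition):
    """Hypothetical handling when a new worker picks up a resource."""
    if resource.status == IN_PROGRESS:
        # Look up check_{action}_complete for the interrupted action.
        check = getattr(resource, "check_%s_complete" % resource.action.lower())
        while not check():
            pass  # a real worker would sleep or yield between polls
        resource.status = COMPLETE
    else:
        resource.update(new_definition)
    return resource.status
```

In this sketch, calling `update` with the same definition changes nothing, while `resume` polls the completion check until the resource can be marked COMPLETE.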

Anant Patil (ananta)
Changed in heat:
assignee: nobody → Anant Patil (ananta)
Anant Patil (ananta)
tags: added: convergence-bugs
Angus Salkeld (asalkeld)
Changed in heat:
status: New → Confirmed
importance: Undecided → High
milestone: none → next
Anant Patil (ananta) wrote :

If a user issues an update on a stack that has either failed because of a failed worker or is stuck waiting for a resource to complete (from a failed worker), then heat should mark those resources as FAILED and continue to provision them. The fix is to mark the IN_PROGRESS resources from non-responding workers as FAILED and start check_resource.

Since concurrent updates are allowed in heat with convergence, when a worker fails, the user can issue an update on the in-progress stack with the same template, and this fix should complete the stack.
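The proposed fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged change: the `Resource` fields, the `alive_engines` set, and the `check_resource` callback are hypothetical names standing in for however heat tracks which engine owns a resource and whether that engine still responds.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Toy record; field names are assumptions, not Heat's schema."""
    name: str
    status: str
    engine_id: str  # hypothetical: the worker that last held the resource


def recover_orphaned(resources, alive_engines, check_resource):
    """Mark IN_PROGRESS resources of dead workers FAILED and re-trigger them.

    alive_engines: set of engine ids assumed to be responding.
    check_resource: callable invoked once per recovered resource.
    Returns the names of the resources that were re-triggered.
    """
    retriggered = []
    for res in resources:
        if res.status == "IN_PROGRESS" and res.engine_id not in alive_engines:
            res.status = "FAILED"   # release the stale in-progress state
            check_resource(res)     # let a live worker pick it up again
            retriggered.append(res.name)
    return retriggered
```

Resources held by responding workers are left alone, so an update issued on an in-progress stack only restarts work that nobody is doing any more.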

OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/264115

Changed in heat:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/264115
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=b84417b6cea85229f78a8b371ee6eb7e00e3411c
Submitter: Jenkins
Branch: master

commit b84417b6cea85229f78a8b371ee6eb7e00e3411c
Author: Anant Patil <email address hidden>
Date: Wed Jan 6 14:24:01 2016 +0530

    Convergence: Pick resource from dead engine worker

    When an engine worker crashes or is restarted, the resources being
    provisioned in it remain in IN_PROGRESS state. Next stack update should
    pick these resources and work on them. The implementation is to set the
    status of resource as FAILED and re-trigger check_resource.

    Change-Id: Ib7fd73eadd0127f8fae47881b59388b31131daf4
    Closes-Bug: #1501161

Changed in heat:
status: In Progress → Fix Released
Thierry Carrez (ttx) wrote : Fix included in openstack/heat 6.0.0.0b2

This issue was fixed in the openstack/heat 6.0.0.0b2 development milestone.

Doug Hellmann (doug-hellmann) wrote :

This issue was fixed in the openstack/heat 6.0.0.0b2 development milestone.
