Bug #1625073 “Convergence: worker fails to re-trigger new traver...” : Bugs : OpenStack Heat

Anant Patil (ananta) on 2016-09-19

Changed in heat:
assignee:	nobody → Anant Patil (ananta)

OpenStack Infra (hudson-openstack) on 2016-09-19

Changed in heat:
status:	New → In Progress

Zane Bitter (zaneb) on 2016-09-19

tags:	added: newton-rc-potential
Changed in heat:
importance:	Undecided → High
milestone:	none → newton-rc2

OpenStack Infra (hudson-openstack) wrote on 2016-09-20: Fix merged to heat (master)

#1

Reviewed: https://review.openstack.org/371572
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=99b055b42357e2fae6006fe150c3c47c30dab1c0
Submitter: Jenkins
Branch: master

commit 99b055b42357e2fae6006fe150c3c47c30dab1c0
Author: Anant Patil <email address hidden>
Date: Fri Sep 16 14:13:57 2016 +0000

Re-trigger on update-replace

It was found that the interleaving of locks when an update-replace of a
resource is needed is the reason the new traversal is not triggered.

    Consider the order of events below:
    1. A server is being updated. The worker locks the server resource.
    2. A rollback is triggered because someone cancelled the stack.
    3. As part of the rollback, a new update using the old template is
    started.
    4. The new update tries to take the lock, but it was already acquired
    in (1). The new update now expects that when the old resource is done,
    it will re-trigger the new traversal.
    5. The old update decides to create a new resource for replacement.
    Creation of the replacement resource is initiated, and a check_resource
    RPC call is made for the new resource.
    6. A worker, possibly in another engine, receives the call and bails
    out when it finds that a new traversal has been initiated (from 2). No
    progress is made from here, because it is expected (from 4) that there
    will be a re-trigger when the old resource is done.

    This change re-triggers the new traversal from the worker when it finds
    that there is a new traversal and an update-replace. Note that this
    issue is not seen when there is no update-replace, because the old
    resource will finish (either fail or complete) and, in the same thread,
    find the new traversal and trigger it.

Closes-Bug: #1625073
Change-Id: Icea5ba498ef8ca45cd85a9721937da2f4ac304e0
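
The decision described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not Heat's actual worker code: the Stack, Resource and Worker classes, the replaces field and the string return values are all hypothetical names made up for this sketch. It shows a worker that receives a check_resource call for a stale traversal and re-triggers the new traversal only when the resource is a replacement.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Stack:
    # ID of the traversal that is currently valid for this stack (hypothetical).
    current_traversal: str


@dataclass
class Resource:
    name: str
    # Name of the resource this one replaces; None if it is not a replacement.
    replaces: Optional[str] = None


@dataclass
class Worker:
    # (resource name, traversal ID) pairs handed over to the new traversal.
    retriggered: List[Tuple[str, str]] = field(default_factory=list)

    def check_resource(self, stack: Stack, rsrc: Resource, traversal_id: str) -> str:
        if traversal_id == stack.current_traversal:
            return "checked"  # normal path: the traversal is still current
        if rsrc.replaces is not None:
            # Stale traversal and an update-replace: nobody else will
            # re-trigger for this resource, so hand it to the new traversal.
            self.retriggered.append((rsrc.name, stack.current_traversal))
            return "retriggered"
        # Stale traversal, no replacement: the thread holding the lock
        # triggers the new traversal itself when the old resource finishes.
        return "bailed-out"
```

For example, with a stack whose current traversal is "t2", a call for a replacement resource that still carries the old traversal "t1" returns "retriggered", while the same stale call for an ordinary resource returns "bailed-out".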

Changed in heat:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2016-09-21: Fix proposed to heat (stable/newton)

#2

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/373614

OpenStack Infra (hudson-openstack) wrote on 2016-09-21: Fix merged to heat (stable/newton)

#3

Reviewed: https://review.openstack.org/373614
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=c6bc3fef71a9c46a4109d6abe4d4d4923c7bdae9
Submitter: Jenkins
Branch: stable/newton

commit c6bc3fef71a9c46a4109d6abe4d4d4923c7bdae9
Author: Anant Patil <email address hidden>
Date: Fri Sep 16 14:13:57 2016 +0000

Re-trigger on update-replace

It was found that the interleaving of locks when an update-replace of a
resource is needed is the reason the new traversal is not triggered.

    Consider the order of events below:
    1. A server is being updated. The worker locks the server resource.
    2. A rollback is triggered because someone cancelled the stack.
    3. As part of the rollback, a new update using the old template is
    started.
    4. The new update tries to take the lock, but it was already acquired
    in (1). The new update now expects that when the old resource is done,
    it will re-trigger the new traversal.
    5. The old update decides to create a new resource for replacement.
    Creation of the replacement resource is initiated, and a check_resource
    RPC call is made for the new resource.
    6. A worker, possibly in another engine, receives the call and bails
    out when it finds that a new traversal has been initiated (from 2). No
    progress is made from here, because it is expected (from 4) that there
    will be a re-trigger when the old resource is done.

    This change re-triggers the new traversal from the worker when it finds
    that there is a new traversal and an update-replace. Note that this
    issue is not seen when there is no update-replace, because the old
    resource will finish (either fail or complete) and, in the same thread,
    find the new traversal and trigger it.

    Closes-Bug: #1625073
    Change-Id: Icea5ba498ef8ca45cd85a9721937da2f4ac304e0
    (cherry picked from commit 99b055b42357e2fae6006fe150c3c47c30dab1c0)

tags:	added: in-stable-newton

OpenStack Infra (hudson-openstack) wrote on 2016-09-27: Fix included in openstack/heat 7.0.0.0rc2

#4

This issue was fixed in the openstack/heat 7.0.0.0rc2 release candidate.

OpenStack Infra (hudson-openstack) wrote on 2016-11-10: Fix included in openstack/heat 7.0.0

#5

This issue was fixed in the openstack/heat 7.0.0 release.

OpenStack Infra (hudson-openstack) wrote on 2016-11-17: Fix included in openstack/heat 8.0.0.0b1

#6

This issue was fixed in the openstack/heat 8.0.0.0b1 development milestone.

OpenStack Heat

Convergence: worker fails to re-trigger new traversal on update-replace
