Node stop or node delete operations stay in WAITING_LIFECYCLE_COMPLETION after engine restart

Bug #1811294 reported by Duc Truong
Affects: senlin
Status: Fix Released
Importance: Undecided
Assigned to: Duc Truong
Milestone: (none listed)

Bug Description

If a cluster whose deletion policy defines lifecycle hooks is scaled in, the node stop or node delete action is created in the WAITING_LIFECYCLE_COMPLETION state. If the engine is restarted while an action is in that state, the action is not cleaned up by the dead engine garbage collection. The cause is that a node stop / node delete action created in WAITING_LIFECYCLE_COMPLETION has no owner assigned to it. The dead engine garbage collection logic therefore cannot find the action and never sets it to FAILED.
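The failure mode can be illustrated with a small toy model. This is only a sketch and not Senlin's actual code: the `Action` class, the `gc_dead_engine` function, and the engine ID `engine-1` are hypothetical names, and the sketch assumes that garbage collection selects actions by matching their owner against the ID of the dead engine.

```python
from dataclasses import dataclass
from typing import List, Optional

WAITING = "WAITING_LIFECYCLE_COMPLETION"
FAILED = "FAILED"


@dataclass
class Action:
    """Hypothetical stand-in for an action record."""
    name: str
    status: str
    owner: Optional[str] = None  # ID of the engine that owns the action


def gc_dead_engine(actions: List[Action], dead_engine_id: str) -> int:
    """Mark every action owned by the dead engine as FAILED.

    Assumption: cleanup finds actions by owner, so an action with
    owner=None is never matched.
    """
    cleaned = 0
    for action in actions:
        if action.owner == dead_engine_id:
            action.status = FAILED
            cleaned += 1
    return cleaned


actions = [
    Action("node_delete_owned", WAITING, owner="engine-1"),
    Action("node_delete_orphan", WAITING, owner=None),  # the buggy case
]
cleaned = gc_dead_engine(actions, "engine-1")
for action in actions:
    print(action.name, action.status)
```

In this model only the action that carries an owner is moved to FAILED; the ownerless one keeps its WAITING_LIFECYCLE_COMPLETION status, which matches the behaviour described in this report.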

Duc Truong (dtruong)
Changed in senlin:
status: New → In Progress
assignee: nobody → Duc Truong (dtruong)
OpenStack Infra (hudson-openstack) wrote : Fix merged to senlin (master)

Reviewed: https://review.openstack.org/629947
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=719c1bc32e6cdd0efe9e9c676f286c27cef0178b
Submitter: Zuul
Branch: master

commit 719c1bc32e6cdd0efe9e9c676f286c27cef0178b
Author: Duc Truong <email address hidden>
Date: Thu Jan 10 23:04:44 2019 +0000

    Set owner for actions in waiting for lifecycle

    - Set owner for the node stop or node delete actions that are created
      as part of cluster scale-in. That way those actions will be cleaned
      up by garbage collection if the engine dies.
    - Reset owner back to None if the node stop or node delete actions fail
      or the lifecycle timeout is hit. This is necessary for those actions
      to be picked up for execution by the dispatcher.

    Change-Id: I004386b069597af62da06fa88659babe91197229
    Closes-Bug: #1811294
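The two parts of the commit message can be sketched as follows. This is a hedged illustration under stated assumptions, not the merged change: `create_lifecycle_action`, `release_on_timeout`, `acquire_next`, the `READY` status, and the engine IDs are hypothetical, and the sketch assumes the dispatcher only picks up actions that have no owner.

```python
from dataclasses import dataclass
from typing import List, Optional

WAITING = "WAITING_LIFECYCLE_COMPLETION"
READY = "READY"  # assumed status for actions that may be dispatched


@dataclass
class Action:
    """Hypothetical stand-in for an action record."""
    name: str
    status: str
    owner: Optional[str] = None


def create_lifecycle_action(name: str, engine_id: str) -> Action:
    # Part 1: record the creating engine as owner, so that cleanup of a
    # dead engine can find the action by owner.
    return Action(name, WAITING, owner=engine_id)


def release_on_timeout(action: Action) -> None:
    # Part 2: clear the owner when the lifecycle timeout is hit.
    # Moving the status to READY is an assumption of this sketch.
    action.owner = None
    action.status = READY


def acquire_next(actions: List[Action], engine_id: str) -> Optional[Action]:
    # Assumed dispatcher rule: only unowned READY actions can be claimed.
    for action in actions:
        if action.status == READY and action.owner is None:
            action.owner = engine_id
            return action
    return None


action = create_lifecycle_action("node_delete", "engine-1")
before = acquire_next([action], "engine-2")  # still owned and waiting
release_on_timeout(action)
after = acquire_next([action], "engine-2")   # now claimable
print(before, after)
```

Under these assumptions the action cannot be claimed while it is owned and waiting, and becomes claimable once the owner has been reset, which is why the commit resets the owner rather than leaving it set.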

Changed in senlin:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/senlin 7.0.0.0b1

This issue was fixed in the openstack/senlin 7.0.0.0b1 development milestone.
