Detaching a health policy does not truly disable it in a multi-engine environment

Bug #1707871 reported by RUIJIE YUAN
This bug affects 1 person
Affects: senlin
Status: Fix Released
Importance: High
Assigned to: RUIJIE YUAN

Bug Description

When the health policy uses the pull model, one engine is selected to process it. We record that engine's ID in the DB and keep the runtime data on that engine (in its health manager, actually).

However, when we want to unregister the cluster, the request is dispatched with no target engine (a None engine), so it is processed by an arbitrary engine. That engine deletes the DB record but does not disable or unregister the cluster, because nothing guarantees that the runtime data is removed from the engine that actually holds it.

https://git.openstack.org/cgit/openstack/senlin/tree/senlin/policies/health_policy.py#n213
https://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/health_manager.py#n499
https://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/health_manager.py#n437
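
The failure described above can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: the `Engine` class, the `registry` dict standing in for the DB table, and the `runtime` dict standing in for the health manager's in-memory state are all hypothetical and are not senlin's actual code.

```python
class Engine:
    """Hypothetical stand-in for one senlin engine and its health manager."""

    def __init__(self, engine_id):
        self.engine_id = engine_id
        self.runtime = {}  # cluster_id -> in-memory health-check task state

    def register_cluster(self, registry, cluster_id):
        registry[cluster_id] = self.engine_id  # DB record: which engine owns it
        self.runtime[cluster_id] = "polling"   # runtime data lives on this engine

    def unregister_cluster(self, registry, cluster_id):
        registry.pop(cluster_id, None)         # DB record is removed by any engine
        self.runtime.pop(cluster_id, None)     # only effective on the owning engine


registry = {}  # assumed stand-in for the health registry table
engines = [Engine("engine-1"), Engine("engine-2")]

# engine-1 is selected to run the health check for the cluster.
engines[0].register_cluster(registry, "cluster-a")

# The unregister request has no target, so suppose engine-2 picks it up.
engines[1].unregister_cluster(registry, "cluster-a")

# The DB record is gone, but engine-1 still holds the runtime task.
leaked = "cluster-a" in engines[0].runtime
print(registry, leaked)
```

In this model the registry ends up empty while `engine-1` keeps polling, which matches the symptom reported here.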

RUIJIE YUAN (cnjie0616)
Changed in senlin:
assignee: nobody → RUIJIE YUAN (cnjie0616)
Qiming Teng (tengqim)
Changed in senlin:
status: New → Triaged
importance: Undecided → High
RUIJIE YUAN (cnjie0616) wrote :

Restarting the engine will reload the tasks. We just need to figure out how to dispatch "unregister_cluster" to the right engine.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to senlin (master)

Fix proposed to branch: master
Review: https://review.openstack.org/490789

Changed in senlin:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/490818

OpenStack Infra (hudson-openstack) wrote : Fix merged to senlin (master)

Reviewed: https://review.openstack.org/490789
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=264a869f0d66bd8d043747e639ceca95a540d41c
Submitter: Jenkins
Branch: master

commit 264a869f0d66bd8d043747e639ceca95a540d41c
Author: ruijie <email address hidden>
Date: Fri Aug 4 01:10:21 2017 -0700

    add DB api to get health registry

    this is the first patch to fix the bug that cannot disable health
    checking tasks. Adds db api will be used by later patches.

    Partial-Bug: #1707871
    Change-Id: I03b6684650f000669ade748d03e2a925d560493c

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/490818
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=a6991090c571d6d4a5136b51ba62ebdbfd4c72b0
Submitter: Jenkins
Branch: master

commit a6991090c571d6d4a5136b51ba62ebdbfd4c72b0
Author: ruijie <email address hidden>
Date: Fri Aug 4 03:13:08 2017 -0700

    dispatch 'unregister_cluster' to specified engine

    We need to dispatch the request to specified engine to erase
    the runtime, so that the health checking task could be stopped/removed
    correctly.

    Change-Id: I08a5c3332401e7101b892869639a0e045710bb5c
    Closes-Bug: #1707871
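
The approach named in the two commits (look up the health registry, then dispatch to the engine recorded there) might look roughly like the following. It is a minimal sketch, not the merged patch: `Engine`, `dispatch_unregister`, the `registry` dict, and the `engines_by_id` mapping are hypothetical names used for illustration only.

```python
class Engine:
    """Hypothetical stand-in for one senlin engine and its health manager."""

    def __init__(self, engine_id):
        self.engine_id = engine_id
        self.runtime = {}  # cluster_id -> in-memory health-check task state

    def register_cluster(self, registry, cluster_id):
        registry[cluster_id] = self.engine_id  # record the owning engine
        self.runtime[cluster_id] = "polling"

    def unregister_cluster(self, registry, cluster_id):
        self.runtime.pop(cluster_id, None)     # stop the task on this engine
        registry.pop(cluster_id, None)         # then drop the DB record


def dispatch_unregister(engines_by_id, registry, cluster_id):
    """Send the unregister request to the engine recorded in the registry."""
    engine_id = registry.get(cluster_id)       # assumed registry lookup
    if engine_id is None:
        return False                           # nothing registered for this cluster
    engines_by_id[engine_id].unregister_cluster(registry, cluster_id)
    return True


registry = {}
engines_by_id = {e.engine_id: e for e in (Engine("engine-1"), Engine("engine-2"))}
engines_by_id["engine-1"].register_cluster(registry, "cluster-a")

first = dispatch_unregister(engines_by_id, registry, "cluster-a")
second = dispatch_unregister(engines_by_id, registry, "cluster-a")
print(first, second, engines_by_id["engine-1"].runtime, registry)
```

Because the owning engine is read from the registry before the record is deleted, the runtime state and the DB record are removed together, and a repeated request becomes a no-op.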

Changed in senlin:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/senlin 4.0.0.0rc1

This issue was fixed in the openstack/senlin 4.0.0.0rc1 release candidate.
