Node blacklisting can't be reverted/reset

Bug #1741053 reported by Cédric Jeanneret deactivated
This bug affects 1 person
Affects         Status        Importance  Assigned to   Milestone
OpenStack Heat  Fix Released  Medium      Steven Hardy
tripleo         Fix Released  High        Unassigned

Bug Description

Dear Stackers,

When we follow the process for removing a node (for example a compute), its ID is blacklisted, preventing any new deployment of that compute.

It appears the blacklisting works only in an "append" mode and can't be properly reset:
https://github.com/openstack/heat/blob/master/heat/engine/resources/openstack/heat/resource_group.py#L333

It would be good to have a way to reset the blacklist, for example if we removed a node in order to replace it (and didn't use the "mark unhealthy" workflow for some reason).

Thank you!

Cheers,

C.

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → rocky-3
Steven Hardy (shardy) wrote :

I think there may be two bugs here:

1. After openstack overcloud node delete, any subsequent update appears to reset the relevant parameter (e.g. ComputeRemovalPolicies) in the heat environment. I think this is probably a bug in tripleo-common, and it makes it confusing to figure out why a node is permanently blacklisted.

2. In heat, removal_policies is sticky (which is why the above bug doesn't cause problems): we only ever append to the list, here:

https://github.com/openstack/heat/blob/master/heat/engine/resources/openstack/heat/resource_group.py#L333

We probably need two fixes. One in tripleo-common, to persist the *RemovalPolicies parameters set by tripleo on node delete, e.g. update the plan with the parameter and ensure it stays visible unless an operator explicitly overrides it. And one in heat, adding a new interface that optionally allows the removal_policies list to be interpreted explicitly (replacing the internal state) rather than added to it.
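
To make the "sticky" behaviour concrete, here is a minimal Python sketch of an append-only blacklist. It is only an illustration and not the code in resource_group.py: the function name apply_removal_policies and the variable names are hypothetical, and the input shape (a list of dicts with a resource_list key) is assumed from the documented removal_policies format.

```python
def apply_removal_policies(stored_blacklist, removal_policies):
    """Append-only merge: names already stored are never dropped.

    Hypothetical helper for illustration; not Heat's real implementation.
    """
    # Flatten every resource_list entry from the requested policies.
    requested = []
    for policy in removal_policies:
        requested.extend(policy.get("resource_list", []))

    # Start from what is already stored and only ever add to it.
    merged = list(stored_blacklist)
    for name in requested:
        if name not in merged:
            merged.append(name)
    return merged


# Removing member "0" blacklists it.
stored = apply_removal_policies([], [{"resource_list": ["0"]}])

# Passing an empty policy list later does not clear the stored entry.
after_reset_attempt = apply_removal_policies(stored, [])
```

With this kind of merge, an empty or reset parameter can never shrink the stored list, which matches the symptom described in this report.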

Changed in heat:
status: New → Triaged
importance: Undecided → Medium
milestone: none → queens-3
assignee: nobody → Steven Hardy (shardy)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/530948

Changed in heat:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/530948
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=7a42ec8657f2524265dc78f011f28236309975eb
Submitter: Zuul
Branch: master

commit 7a42ec8657f2524265dc78f011f28236309975eb
Author: Steven Hardy <email address hidden>
Date: Wed Jan 3 14:22:52 2018 +0000

    Add removal_policies_mode to ResourceGroup

    This enables choice over the current behavior which is to always append
    to the resource data blacklist, or overwrite it (which is sometimes needed
    e.g if you decide you want to reuse that group index)

    The default behavior is unchanged, but the new behavior can be selected
    via the "update" value.

    Change-Id: I1157627b07d98dd079657c320ad783a3ba5bce81
    Closes-Bug: #1741053
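
The difference between the two modes described in this commit message can be sketched roughly as follows. This is a simplified model, not the merged Heat change: the function update_blacklist is hypothetical, the value "append" for the default mode is an assumption, and only the "update" value is named in the commit message.

```python
def update_blacklist(stored, removal_policies, mode="append"):
    """Return the new blacklist for a group, given the selected mode.

    Hypothetical model of removal_policies_mode; not Heat's actual code.
    """
    # Collect every name listed under resource_list in the policies.
    requested = [
        name
        for policy in removal_policies
        for name in policy.get("resource_list", [])
    ]

    if mode == "append":
        # Default behaviour: keep the stored entries and add new ones.
        return sorted(set(stored) | set(requested))
    if mode == "update":
        # New behaviour: the requested list overwrites the stored one.
        return sorted(set(requested))
    raise ValueError("unknown removal_policies_mode: %r" % mode)


appended = update_blacklist(["0"], [{"resource_list": ["2"]}])
overwritten = update_blacklist(["0"], [{"resource_list": ["2"]}], mode="update")
cleared = update_blacklist(["0"], [], mode="update")
```

In this model, "update" with an empty list clears the blacklist, which is what makes a previously removed group index reusable.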

Changed in heat:
status: In Progress → Fix Released
Changed in tripleo:
milestone: rocky-3 → queens-3
Changed in tripleo:
milestone: queens-3 → queens-rc1
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/heat 10.0.0.0b3

This issue was fixed in the openstack/heat 10.0.0.0b3 development milestone.

Changed in tripleo:
status: Triaged → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/664444

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/664445

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/664446

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/664444
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b4779ec178d1087406ca05f75488d90d3af851c3
Submitter: Zuul
Branch: stable/stein

commit b4779ec178d1087406ca05f75488d90d3af851c3
Author: Rabi Mishra <email address hidden>
Date: Wed May 22 10:14:59 2019 +0530

    Add {{role.name}}RemovalPoliciesMode parameter

    This adds a new parameter to reset/update existing blacklisted nodes
    set prior to this update with either node delete or using parameter
    {{role.name}}RemovalPolicies.

    Related-Bug: #1741053
    Change-Id: I870d7605e35a7cda76d9de0e8830a87c898533d6
    (cherry picked from commit b3b9b44dae232447d5d3fc181a4f7922ef85b597)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/664445
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=dcb9df6ceb00be1ca156f261adb9595e7323ad91
Submitter: Zuul
Branch: stable/rocky

commit dcb9df6ceb00be1ca156f261adb9595e7323ad91
Author: Rabi Mishra <email address hidden>
Date: Wed May 22 10:14:59 2019 +0530

    Add {{role.name}}RemovalPoliciesMode parameter

    This adds a new parameter to reset/update existing blacklisted nodes
    set prior to this update with either node delete or using parameter
    {{role.name}}RemovalPolicies.

    Related-Bug: #1741053
    Change-Id: I870d7605e35a7cda76d9de0e8830a87c898533d6
    (cherry picked from commit b3b9b44dae232447d5d3fc181a4f7922ef85b597)

tags: added: in-stable-rocky
tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/664446
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=7e6b6bafa030cf9ee6bfef1ad76584655bdb2dd8
Submitter: Zuul
Branch: stable/queens

commit 7e6b6bafa030cf9ee6bfef1ad76584655bdb2dd8
Author: Rabi Mishra <email address hidden>
Date: Wed May 22 10:14:59 2019 +0530

    Add {{role.name}}RemovalPoliciesMode parameter

    This adds a new parameter to reset/update existing blacklisted nodes
    set prior to this update with either node delete or using parameter
    {{role.name}}RemovalPolicies.

    Related-Bug: #1741053
    Change-Id: I870d7605e35a7cda76d9de0e8830a87c898533d6
    (cherry picked from commit b3b9b44dae232447d5d3fc181a4f7922ef85b597)
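
As a rough illustration of how the per-role parameters might be set together, the Python sketch below builds an environment structure for a given role. The helper removal_environment is hypothetical, the list-of-dicts shape with resource_list is assumed from Heat's removal_policies format, and the accepted mode values ("append", "update") are an assumption based on the heat change above. Check the tripleo-heat-templates version in use before relying on it.

```python
import json


def removal_environment(role_name, indexes, mode="append"):
    """Build parameter_defaults for one role's removal policies.

    Hypothetical helper; parameter names follow the
    {{role.name}}RemovalPolicies / {{role.name}}RemovalPoliciesMode pattern.
    """
    if mode not in ("append", "update"):
        raise ValueError("mode must be 'append' or 'update'")
    return {
        "parameter_defaults": {
            # Group indexes are passed as strings in resource_list.
            role_name + "RemovalPolicies": [
                {"resource_list": [str(i) for i in indexes]}
            ],
            role_name + "RemovalPoliciesMode": mode,
        }
    }


# An empty list in "update" mode is intended to reset the blacklist.
env = removal_environment("Compute", [], mode="update")
print(json.dumps(env, indent=2))
```

The output is JSON, which is also valid YAML, so it could be saved as an environment file and passed to a deploy or update under the assumptions above.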
