Hiera nodes.yaml is out of date after removing a node

Bug #1491027 reported by Swann Croiset
This bug affects 2 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Committed  Undecided   Ihor Kalnytskyi

Bug Description

Steps to reproduce:

1/ deploy 1 controller and 2 compute nodes
2/ remove one compute node

Expected results:
after step #2, /etc/hiera/nodes.yaml must not contain the removed node (on both the controller node and the remaining compute node)

Actual results:
after step #2, /etc/hiera/nodes.yaml still contains the removed node

Other information:
if a new compute node is added to the cluster, nodes.yaml is correctly updated. The behavior must be the same when scaling down the cluster.
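To make the expected result checkable, here is a minimal Python sketch that flags entries left behind in nodes.yaml. It assumes the file holds a top-level `nodes` list whose items carry `uid` and `fqdn` keys; the helper name `stale_nodes` and the sample data are hypothetical and only illustrate the check.

```python
def stale_nodes(hiera_nodes, active_uids):
    """Return nodes.yaml entries whose uid is no longer part of the environment."""
    # Compare as strings so that 3 and "3" are treated as the same uid.
    active = {str(uid) for uid in active_uids}
    return [node for node in hiera_nodes if str(node.get("uid")) not in active]


# Hypothetical content of the `nodes` list after compute node 3 was removed.
hiera_nodes = [
    {"uid": "1", "fqdn": "node-1.example.test", "role": "controller"},
    {"uid": "2", "fqdn": "node-2.example.test", "role": "compute"},
    {"uid": "3", "fqdn": "node-3.example.test", "role": "compute"},
]

# Nodes 1 and 2 remain in the environment, so node 3 is reported as stale.
for node in stale_nodes(hiera_nodes, active_uids=[1, 2]):
    print("stale entry:", node["fqdn"])
```

On a real node, the list could be loaded with a YAML parser before calling the helper; with the bug present, the removed node would still show up in the result on the controller and on the remaining compute node.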

Matthew Mosesohn (raytrac3r) wrote :

This is a duplicate of https://bugs.launchpad.net/bugs/1485505 because the update_required parameter in a role definition is only applied to all relevant nodes during node addition/reprovisioning, not on node removal. There is a hardcoded exception for removing controllers, but it should apply to any role that has the update_required flag set.

Simon Pasquier (simon-pasquier) wrote :

@Matthew, this isn't exactly the same point.

To give more context, we're developing the LMA Infrastructure Alerting plugin [1], which will offer a custom role for deploying the alerting server (e.g. a Nagios server). When the OpenStack environment is modified after the initial deployment (e.g. when nodes are added or removed), the Nagios configuration needs to be updated; otherwise it will trigger alerts for removed nodes or lack alerting for new nodes. The problem is that there's no way to do this with MOS 7.0, and the update_required parameter doesn't help either.

To work around the issue, we're planning to have a cron job running on the alerting node that will monitor the nodes.yaml file. When it changes, the job will re-run the necessary Puppet manifests. AFAICT this should work when the environment scales up (the file is correctly propagated then) but not when the environment scales down. Hence this bug being filed. I hope this is clearer.

Eventually, Fuel should be improved to offer this capability out of the box, as it has been requested by other plugin developers [2].
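The cron-based workaround described above could look roughly like the following Python sketch. It is not the plugin's actual code: the function names, the state-file path, and the manifest path are all hypothetical, and change detection is done here by comparing a SHA-256 digest of the file with the digest saved on the previous run.

```python
import hashlib
import subprocess
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_if_changed(watched, state_file, command):
    """Run `command` if `watched` differs from the digest saved in `state_file`.

    Returns True when the command was run, False when nothing changed.
    """
    current = file_digest(watched)
    state = Path(state_file)
    previous = state.read_text().strip() if state.exists() else None
    if current == previous:
        return False
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True)
    # Save the digest only after success, so a failed run is retried next time.
    state.write_text(current + "\n")
    return True


# Example call from a cron-driven script (paths below are hypothetical):
# run_if_changed(
#     watched="/etc/hiera/nodes.yaml",
#     state_file="/var/lib/lma-alerting/nodes.yaml.sha256",
#     command=["puppet", "apply", "/path/to/alerting-manifest.pp"],
# )
```

Note that the first run, when no digest has been saved yet, also triggers the command, and that the directory holding the state file must already exist. Such a job only helps if nodes.yaml itself is refreshed on scale-down, which is exactly what this bug is about.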

[1] https://github.com/stackforge/fuel-plugin-lma-infrastructure-alerting
[2] https://blueprints.launchpad.net/fuel/+spec/fuel-task-notify-other-nodes

Patrick Petit (patrick-michel-petit) wrote :

And there is even a Jira task in PROD already for that well-known shortcoming :-(
https://mirantis.jira.com/browse/PROD-826

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/220731

Changed in fuel:
assignee: nobody → Igor Kalnitsky (ikalnitsky)
status: New → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/220731
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=c77efaaf1d8f29eff79d25f3f7d6bd3a496ae263
Submitter: Jenkins
Branch: master

commit c77efaaf1d8f29eff79d25f3f7d6bd3a496ae263
Author: Igor Kalnitsky <email address hidden>
Date: Sat Sep 5 11:29:08 2015 +0300

    Update nodes.yaml and /etc/hosts on node removal

    Currently Nailgun doesn't support declarative scale-down mechanism, and
    therefore there's no way to trigger some task execution if some node has
    been removed from the environment, and that's incredibly annoying for
    plugin developers.

    Plugin developers could come up with a workaround, and have a deal with
    /etc/hiera/nodes.yaml. Unfortunately, this file isn't updated on node
    removal event. This commit is going to fix it, and send two tasks on
    execution if case of node removal:

    * update /etc/hiera/nodes.yaml
    * update /etc/hosts

    Closes-Bug: #1475363
    Closes-Bug: #1491027

    Change-Id: Ie9ecc5cd1b7cc185a1fc694e728d8ecc571df23d
    Signed-off-by: Igor Kalnitsky <email address hidden>

Changed in fuel:
status: In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/221077

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/7.0)

Reviewed: https://review.openstack.org/221077
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=dab70b2098a9133145bb5c1bf088c3479d93626b
Submitter: Jenkins
Branch: stable/7.0

commit dab70b2098a9133145bb5c1bf088c3479d93626b
Author: Igor Kalnitsky <email address hidden>
Date: Sat Sep 5 11:29:08 2015 +0300

    Update nodes.yaml and /etc/hosts on node removal

    Currently Nailgun doesn't support declarative scale-down mechanism, and
    therefore there's no way to trigger some task execution if some node has
    been removed from the environment, and that's incredibly annoying for
    plugin developers.

    Plugin developers could come up with a workaround, and have a deal with
    /etc/hiera/nodes.yaml. Unfortunately, this file isn't updated on node
    removal event. This commit is going to fix it, and send two tasks on
    execution if case of node removal:

    * update /etc/hiera/nodes.yaml
    * update /etc/hosts

    Closes-Bug: #1475363
    Closes-Bug: #1491027

    Change-Id: Ie9ecc5cd1b7cc185a1fc694e728d8ecc571df23d
    Signed-off-by: Igor Kalnitsky <email address hidden>
    (cherry picked from commit c77efaaf1d8f29eff79d25f3f7d6bd3a496ae263)
