need to update rabbit_hosts on non-controller nodes after deploying controllers

Bug #1368445 reported by Andrew Woodward
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Committed
Importance: High
Assigned to: Dima Shulyak

Bug Description

rabbit_hosts is rendered as the list of controllers known at the time the role is deployed.
So if I deploy one controller in HA mode, the list will have a single entry;
if I deploy 3 controllers, it will have 3 entries.
But if I deploy computes while there is only one controller,
and then add more controllers later,
and the compute role (or rabbit_hosts itself) is never updated, the computes will still have only the first controller in the list.
This is equally true when a single controller is replaced.
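
For illustration, a compute deployed while only the first controller existed ends up with a one-entry list in its nova.conf (the section placement and IPs below are illustrative, not taken from a real environment):

    [DEFAULT]
    rabbit_hosts=192.168.0.3:5672

That single entry survives even after controllers at, say, 192.168.0.4 and 192.168.0.5 join the cluster.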

This is fine for controllers, because every controller re-runs the Puppet controller role whenever any single controller is deployed.

However, it is not OK for roles like compute, which are left with a stale list of controllers.

This could be solved by re-running the compute role, or by running a short manifest via Astute, similar to what is done for /etc/hosts.

For computes we need to ensure that we update neutron.conf and nova.conf and restart neutron-ovs-agent and nova-compute; possibly other services as well. A rough manual equivalent is sketched below.
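
As a sketch only (crudini as the editing tool, the service names, and the IP list are all assumptions; service names in particular vary between distributions):

    # rewrite the stale rabbit_hosts list in both config files
    crudini --set /etc/nova/nova.conf DEFAULT rabbit_hosts 192.168.0.3:5672,192.168.0.4:5672,192.168.0.5:5672
    crudini --set /etc/neutron/neutron.conf DEFAULT rabbit_hosts 192.168.0.3:5672,192.168.0.4:5672,192.168.0.5:5672
    # restart the agents so they reconnect to the new AMQP endpoints
    service nova-compute restart
    service neutron-plugin-openvswitch-agent restart   # or neutron-openvswitch-agent, depending on distro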

We need to check whether other roles need this as well.

More info in the IRC conversation: http://irclog.perlgeek.de/fuel-dev/2014-09-11 (xarses, mihgen)

Revision history for this message
Mike Scherbakov (mihgen) wrote:

Set High priority as it affects the operations story. I believe we should have a test for it. It can easily be checked as follows:

Lightweight test; it can be considered an extension of the HA scale-up test:
1) Run HA deployment with one controller
2) Add one or two more controllers, deploy
3) Check rabbit_hosts in the compute config file; it must have more than one IP address in the list (2 or 3, depending on how many controllers end up in the env). See the check below.
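
A quick way to perform the check in step 3 on a compute node (the path is the usual nova.conf location; the expected output is illustrative):

    grep '^rabbit_hosts' /etc/nova/nova.conf
    # expect something like: rabbit_hosts=192.168.0.3:5672,192.168.0.4:5672,192.168.0.5:5672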

Heavier test, but it can catch more issues:
1) Run HA deployment with one controller and one compute, run OSTF
2) Add two more controllers, run OSTF
3) Destroy the first (initial) controller, run OSTF
OSTF should pass at all 3 steps. At the current moment it will not pass at step 3 unless the compute nodes are redeployed on every controller addition.

Changed in fuel:
importance: Undecided → High
Mike Scherbakov (mihgen)
description: updated
Andrew Woodward (xarses)
description: updated
Andrew Woodward (xarses)
description: updated
Revision history for this message
Dima Shulyak (dshulyak) wrote:

We can introduce this as an additional post-deployment step in Astute.
That would be quite easy right now, but it would add extra complexity to Astute.
Alternatively, we can add it as additional tasks in our granular deployment feature,
to be executed on computes when additional controllers are installed.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote: Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/132549

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Dima Shulyak (dshulyak)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote: Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/132549
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=1d3982f606b5d0d0cba78b52a5806abee6a97918
Submitter: Jenkins
Branch: master

commit 1d3982f606b5d0d0cba78b52a5806abee6a97918
Author: Dima Shulyak <email address hidden>
Date: Mon Nov 3 13:37:37 2014 +0200

    Redeploy cinder/compute nodes if new controller added

    Adding a new controller affects cluster messaging,
    particularly cinder/compute nodes, which rely on
    the rabbit_hosts setting

    In the current orchestration model, the only way to
    fix configuration on nodes is to redeploy them.
    Configuration is applied from /etc/astute.yaml, so
    it is not enough to simply add another PostDeployment
    method in astute

    Introduced an additional setting for roles_metadata:
    - update_required
      stores the list of roles that depend on this role

    At the deployment stage:
    - build the update_required list for the whole cluster
    - select ready nodes without pending_roles and deploy them

    No migration is added, so behaviour on old clusters stays as it is

    DocImpact
    Closes-Bug: 1368445

    Change-Id: I1735a8b06531018b1240726f5faa4f7ce6e6a631
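
For illustration, the new setting would look something like this in the roles_metadata section of Nailgun's release definition (the role list and file placement are assumptions based on the commit message, not quoted from the patch):

    roles_metadata:
      controller:
        name: "Controller"
        update_required:    # roles to redeploy whenever a controller is added
          - compute
          - cinder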

Changed in fuel:
status: In Progress → Fix Committed