Multiple points of mcollective configuration on nodes

Bug #1585671 reported by Georgy Kibardin on 2016-05-25
This bug affects 7 people
Affects: Fuel for OpenStack
Importance: High
Assigned to: Fuel Sustaining

Bug Description

On a node, mcollective is configured by both the nailgun agent and cloud-init. This approach is race-prone and has already led to a series of dirty fixes; see https://bugs.launchpad.net/fuel/+bug/1455489
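
To make the race concrete, here is a minimal Python sketch. It is not Fuel code: it assumes that both writers do a read-modify-write of the whole config file, the function names and config values are hypothetical, and the bad interleaving is forced by hand so that the result is deterministic.

```python
# Illustrative only: two uncoordinated writers updating one config "file".
# The values and writer names are hypothetical, not taken from Fuel.


def read_config(store):
    """Return a private copy of the config, as a writer would after reading the file."""
    return dict(store["server.cfg"])


def write_config(store, config):
    """Overwrite the whole config, as a writer would when rewriting the file."""
    store["server.cfg"] = dict(config)


def interleaved_update(store):
    """Force the bad interleaving: both writers read before either one writes."""
    agent_view = read_config(store)        # writer A reads
    cloud_init_view = read_config(store)   # writer B reads the same old state

    agent_view["identity"] = "42"          # writer A changes one key
    write_config(store, agent_view)

    cloud_init_view["host"] = "10.20.0.2"  # writer B changes another key
    write_config(store, cloud_init_view)   # B's write drops A's change
    return store["server.cfg"]


store = {"server.cfg": {"identity": "bootstrap", "host": "localhost"}}
result = interleaved_update(store)
print(result)
```

After the second write the `identity` set by the first writer is gone, which is the kind of lost update that a single owner of the configuration avoids.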

Changed in fuel:
importance: Undecided → Medium
tags: added: tech-debt
Dmitry Pyzhov (dpyzhov) on 2016-05-25
Changed in fuel:
milestone: none → 10.0
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
status: New → Confirmed
tags: added: area-python
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Georgy Kibardin (gkibardin)
status: Confirmed → In Progress
Changed in fuel:
status: In Progress → Confirmed
Changed in fuel:
status: Confirmed → In Progress
Michael Polenchuk (mpolenchuk) wrote :

Raised to "High" because this bug affects the bare-metal setup in the Fuel@OPNFV project.

Changed in fuel:
importance: Medium → High
Luca Cervigni (cervigni) wrote :

Trying to deploy on bare metal:
After trying to provision 3 nodes for the environment, I got the "mcollective not able to communicate" error on all three boxes, which is weird. After getting the error I rebooted them all and re-bootstrapped; one of them, as if by miracle, passed the network check and I was able to provision it correctly, while the others failed with the same RPC error.
I made another environment and put the two non-working nodes in it. As before, one of the two started working and provisioned correctly; the other failed.
With mco ping I see only the working node. If I go into the bootstrapped nodes and bring up mcollective manually, I can ping them, but as soon as I launch provisioning, the service goes down again.
Restarting mcollective on the nodes as a workaround does not solve anything for me.

Reviewed: https://review.openstack.org/376530
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=5877315c7fd20f603a35d1ff79a3b2c1407a54fc
Submitter: Jenkins
Branch: master

commit 5877315c7fd20f603a35d1ff79a3b2c1407a54fc
Author: Georgy Kibardin <email address hidden>
Date: Mon Sep 26 17:10:53 2016 +0300

    Configure mcollective identity in startup scripts

    As a part of handing over control over mcollective from nailgun agent to
    startup scripts, let's get rid of the nodiscover file flag and perform
    mcollective configuration and start

    Change-Id: Ia2f984570b38642b1090f6483ed3fa78958550c5
    Partial-Bug: #1585671
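
The commit message does not show how the startup scripts set the identity. As a rough sketch only, assuming the config uses plain `key = value` lines with an `identity` key, the update could look like the Python below. The function name `set_identity` and the sample values are hypothetical.

```python
def set_identity(config_text, identity):
    """Return config_text with exactly one 'identity = <identity>' line.

    An existing identity line is replaced in place, duplicates are dropped,
    and the line is appended if the key is missing. Commented-out lines are
    left untouched.
    """
    lines = []
    replaced = False
    for line in config_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key == "identity":
            if not replaced:
                lines.append(f"identity = {identity}")
                replaced = True
            continue  # skip the old value and any duplicate identity lines
        lines.append(line)
    if not replaced:
        lines.append(f"identity = {identity}")
    return "\n".join(lines) + "\n"


# Hypothetical sample config and node id, for illustration only.
sample = "main_collective = mcollective\nidentity = old\nconnector = rabbitmq\n"
updated = set_identity(sample, "7")
print(updated, end="")
```

The function is idempotent, so running it again with the same identity leaves the text unchanged. That matters if a startup script can run more than once.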

Reviewed: https://review.openstack.org/432896
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=b419f9a21e5e0d53b58213b9eb1ad685f4f49927
Submitter: Jenkins
Branch: master

commit b419f9a21e5e0d53b58213b9eb1ad685f4f49927
Author: Georgy Kibardin <email address hidden>
Date: Mon Feb 13 10:53:20 2017 +0300

    Remove nodiscover file

    Until the new nailgun agent is committed, we need to explicitly remove
    /etc/nailgun-agent/nodiscover to unblock it from starting

    Change-Id: I3199591e7da95bac221be3554f97b00757db160a
    Partial-Bug: #1585671
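
The flag-file gate described in this commit can be sketched in Python as follows. This is an illustration under assumptions, not the agent's actual logic: the helper names are hypothetical, and a temporary directory stands in for `/etc/nailgun-agent/nodiscover`.

```python
import os
import tempfile


def agent_may_start(flag_path):
    """The gate: starting is blocked for as long as the flag file exists."""
    return not os.path.exists(flag_path)


def remove_flag(flag_path):
    """Remove the flag file. Return True if it was there, False if already gone."""
    try:
        os.remove(flag_path)
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for /etc/nailgun-agent/nodiscover so the sketch runs anywhere.
    flag = os.path.join(tmp, "nodiscover")
    open(flag, "w").close()

    blocked_before = not agent_may_start(flag)
    removed_first = remove_flag(flag)
    removed_again = remove_flag(flag)   # a second removal is a harmless no-op
    allowed_after = agent_may_start(flag)

print(blocked_before, removed_first, removed_again, allowed_after)
```

Treating a missing file as success keeps the removal safe to repeat, for example when both an old and a new startup path try to clear the flag.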

Reviewed: https://review.openstack.org/431601
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=16318a98ebf70d40f6e224678197106b90d862dd
Submitter: Jenkins
Branch: stable/newton

commit 16318a98ebf70d40f6e224678197106b90d862dd
Author: Georgy Kibardin <email address hidden>
Date: Mon Sep 26 17:10:53 2016 +0300

    Configure mcollective identity in startup scripts

    As a part of handing over control over mcollective from nailgun agent to
    startup scripts, let's get rid of the nodiscover file flag and perform
    mcollective configuration and start

    Change-Id: Ia2f984570b38642b1090f6483ed3fa78958550c5
    Partial-Bug: #1585671
    (cherry picked from commit 5877315c7fd20f603a35d1ff79a3b2c1407a54fc)

tags: added: in-stable-newton

Reviewed: https://review.openstack.org/434211
Committed: https://git.openstack.org/cgit/openstack/fuel-nailgun-agent/commit/?id=b2d45830c009d13f29df8660918e25167164011f
Submitter: Jenkins
Branch: stable/newton

commit b2d45830c009d13f29df8660918e25167164011f
Author: Georgy Kibardin <email address hidden>
Date: Mon Sep 26 16:19:47 2016 +0300

    Do not change mcollective config

    The idea is to stop changing mcollective config and restarting it from
    nailgun agent. This causes a lot of problems related to mcollective
    service restarted at wrong time. From now at the bootstrap stage
    mcollective is expected to be configured and started by startup scripts
    while in bootstrap and by cloud-init at first boot into provisioned
    node.

    Change-Id: Ic8e31d6381d8ffb8f7fdfd1aa8ebc655bb4535ec
    Partial-Bug: #1585671
    Depends-On: Ia2f984570b38642b1090f6483ed3fa78958550c5

Reviewed: https://review.openstack.org/376472
Committed: https://git.openstack.org/cgit/openstack/fuel-nailgun-agent/commit/?id=b33cfab5c3594b06cfe3e647672e243303041ad8
Submitter: Jenkins
Branch: master

commit b33cfab5c3594b06cfe3e647672e243303041ad8
Author: Georgy Kibardin <email address hidden>
Date: Mon Sep 26 16:19:47 2016 +0300

    Do not change mcollective config

    The idea is to stop changing mcollective config and restarting it from
    nailgun agent. This causes a lot of problems related to mcollective
    config restarted at wrong time. From now at the bootstrap stage
    mcollective is expected to be configured and started by startup scripts
    while in bootstrap and by cloud-init at first boot into provisioned
    node.

    Change-Id: Ic8e31d6381d8ffb8f7fdfd1aa8ebc655bb4535ec
    Partial-Bug: #1585671
    Depends-On: Ia2f984570b38642b1090f6483ed3fa78958550c5

Reviewed: https://review.openstack.org/436840
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=fe2a888abd833cdc2bb07b24001c26cfe91b1b76
Submitter: Jenkins
Branch: master

commit fe2a888abd833cdc2bb07b24001c26cfe91b1b76
Author: Georgy Kibardin <email address hidden>
Date: Wed Feb 22 12:28:40 2017 +0300

    Remove nodiscover file from bootstrap

    It is no longer used during the bootstrap boot phase to synchronize mcollective
    service startup.

    Change-Id: I4997e3cb8aa57f9ffac22f8d7c01de8447b640be
    Partial-Bug: #1585671
    Depends-On: Ic8e31d6381d8ffb8f7fdfd1aa8ebc655bb4535ec

Reviewed: https://review.openstack.org/442428
Committed: https://git.openstack.org/cgit/openstack/fuel-nailgun-agent/commit/?id=71313a990ff9060e83562e6f336b2ae9d24b2880
Submitter: Jenkins
Branch: stable/ocata

commit 71313a990ff9060e83562e6f336b2ae9d24b2880
Author: Georgy Kibardin <email address hidden>
Date: Mon Sep 26 16:19:47 2016 +0300

    Do not change mcollective config

    The idea is to stop changing mcollective config and restarting it from
    nailgun agent. This causes a lot of problems related to mcollective
    config restarted at wrong time. From now at the bootstrap stage
    mcollective is expected to be configured and started by startup scripts
    while in bootstrap and by cloud-init at first boot into provisioned
    node.

    Change-Id: Ic8e31d6381d8ffb8f7fdfd1aa8ebc655bb4535ec
    Partial-Bug: #1585671
    (cherry picked from commit b33cfab5c3594b06cfe3e647672e243303041ad8)

tags: added: in-stable-ocata

Reviewed: https://review.openstack.org/442424
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=7069b68ff80e8eb2f8c9396d1056cac11edb4e64
Submitter: Jenkins
Branch: stable/ocata

commit 7069b68ff80e8eb2f8c9396d1056cac11edb4e64
Author: Georgy Kibardin <email address hidden>
Date: Wed Feb 22 12:28:40 2017 +0300

    Remove nodiscover file from bootstrap

    It is no longer used during the bootstrap boot phase to synchronize mcollective
    service startup.

    Change-Id: I4997e3cb8aa57f9ffac22f8d7c01de8447b640be
    Partial-Bug: #1585671
    Depends-On: Ic8e31d6381d8ffb8f7fdfd1aa8ebc655bb4535ec
    (cherry picked from commit fe2a888abd833cdc2bb07b24001c26cfe91b1b76)

Changed in fuel:
assignee: Georgy Kibardin (gkibardin) → Fuel Sustaining (fuel-sustaining-team)