Deployment fails, mcollective agents didn't respond within the allotted time
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Invalid | High | Matthew Mosesohn |
6.0.x | Invalid | High | Matthew Mosesohn |
6.1.x | Invalid | High | Matthew Mosesohn |
Bug Description
I've seen this bug reported elsewhere, but with different setups, and I cannot find a published fix. I see some references to 6.0.x releases, but the only download option I can find is directly from software.
Env:
* 6.0-iso from software.
* HA deployment on Ubuntu with Neutron vlan, Ceph, Murano and Ceilometer
* Deployment run with 3 controllers, 3 compute nodes and 3 MongoDB nodes
Deployment halts regularly at 35-40% on the progress bar, when one of the nodes being deployed is set to "offline" in the Fuel admin console. Astute logs state:
2015-04-16 13:07:14 ERR [416] 676976f6-
ruby -r 'yaml' -e 'y = YAML.load_
y["nodes"] = YAML.load_
File.open(
puppet apply --logdest syslog --debug -e '$settings=
mcollective error: 676976f6-
2015-04-16 13:07:14 ERR [416] MCollective agents '5' didn't respond within the allotted time.
2015-04-16 13:05:13 ERR [416] 676976f6-
2015-04-16 13:05:13 ERR [416] MCollective agents '5' didn't respond within the allotted time.
Deployment fails after a while with "Error: Deployment has failed. Check these nodes: $node"
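To see which nodes keep timing out, the "didn't respond" lines in the Astute log can be tallied per node ID. The following is a minimal sketch, not part of Fuel: the function name `unresponsive_nodes` is made up for illustration, the message format is taken from the excerpt above, and the comma-separated handling of several IDs in one message is an assumption.

```python
import re
from collections import Counter

# Message format copied from the Astute log excerpt above.
PATTERN = re.compile(
    r"MCollective agents '([^']*)' didn't respond within the allotted time"
)


def unresponsive_nodes(log_lines):
    """Count how often each node ID appears in a "didn't respond" message.

    Assumption: when several nodes time out at once, their IDs are
    comma-separated inside the quotes.
    """
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if not match:
            continue
        for node_id in match.group(1).split(","):
            node_id = node_id.strip()
            if node_id:
                counts[node_id] += 1
    return counts


sample = [
    "2015-04-16 13:07:14 ERR [416] MCollective agents '5' didn't respond within the allotted time.",
    "2015-04-16 13:05:13 ERR [416] MCollective agents '5' didn't respond within the allotted time.",
]
print(unresponsive_nodes(sample))
```

If one node ID dominates the count, that node is the first candidate to inspect.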
Changed in fuel: | |
importance: | Undecided → High |
milestone: | none → 6.0.2 |
status: | New → Confirmed |
Most likely the clocks have drifted out of sync. RabbitMQ, the transport backend for mcollective, requires the time to be properly synchronized. Compare the output of the "date" command on the Fuel Master and on node-5. That will give you some pointers.
You should make sure you have an Internet connection on your Fuel Master and enable NTP in Fuel Setup to avoid issues like this. It becomes a bigger problem later on in deployment if there is no time synchronization.
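The clock comparison suggested above can be sketched as a small helper. This is only an illustration under stated assumptions: the epoch values are presumed to have been collected by hand (for example by running `date +%s` on each host at roughly the same moment), the name `skewed_nodes` is hypothetical, and the 5-second tolerance is an arbitrary placeholder, not a documented limit.

```python
def skewed_nodes(master_epoch, node_epochs, tolerance_s=5.0):
    """Return {node: offset_in_seconds} for nodes whose clock differs from
    the Fuel Master by more than tolerance_s.

    master_epoch: seconds since the epoch as reported by the Fuel Master.
    node_epochs:  mapping of node name to seconds since the epoch on that node.
    tolerance_s:  arbitrary threshold chosen for this sketch (assumption).
    """
    offsets = {}
    for name, epoch in node_epochs.items():
        offset = epoch - master_epoch
        if abs(offset) > tolerance_s:
            offsets[name] = offset
    return offsets


# Made-up readings: node-5 is five minutes ahead of the master.
readings = {"node-4": 1429189635, "node-5": 1429189934}
print(skewed_nodes(1429189634, readings))
```

Readings taken by hand are only as precise as the delay between the commands, so small offsets mean little; an offset of minutes points to missing NTP synchronization.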