Fuel for OpenStack

Comment 15 for bug 1653737

Vladimir Sharshov (vsharshov) wrote on 2017-01-30:

#15

I've checked the log and see a serious connection problem between the master node and the node. Astute tries to connect to the node and check its status, starting from:

2017-01-27 14:55:55 INFO [29965] Start puppet with timeout 5400 sec. Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp
2017-01-27 14:56:06 DEBUG [29965] Retry #1 to run mcollective agent on nodes: '1'
2017-01-27 14:56:17 DEBUG [29965] Retry #2 to run mcollective agent on nodes: '1'
2017-01-27 14:56:28 DEBUG [29965] Retry #3 to run mcollective agent on nodes: '1'
2017-01-27 14:56:38 DEBUG [29965] Retry #4 to run mcollective agent on nodes: '1'
2017-01-27 14:56:48 DEBUG [29965] Retry #5 to run mcollective agent on nodes: '1'
2017-01-27 14:56:58 DEBUG [29965] Retry #6 to run mcollective agent on nodes: '1'

It repeats this cycle 3 times, with 6 tries in each cycle.

2017-01-27 15:02:09 DEBUG [29965] Puppet on node has undefined status. 2 retries remained. Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp
2017-01-27 15:07:03 DEBUG [29965] Puppet on node has undefined status. 1 retries remained. Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp
2017-01-27 15:11:57 DEBUG [29965] Puppet on node has undefined status. 0 retries remained. Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp
2017-01-27 15:16:48 ERROR [29965] Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp, status: undefined

Same behavior for all 4 nodes.

Conclusion: Astute made 12 tries for each of the 4 nodes, which took 21 minutes. The nodes did not answer.
It looks like we have a serious problem with the network, the provision client, or mcollective.
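
The 21-minute figure can be checked against the first and last Astute log lines quoted above. A minimal Python sketch, assuming the timestamp always occupies the first 19 characters of a line (the helper names `parse_ts` and `elapsed` are made up for this illustration):

```python
from datetime import datetime

# First and last Astute log lines for node 1, copied from the log above.
LOG = """\
2017-01-27 14:55:55 INFO [29965] Start puppet with timeout 5400 sec. Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp
2017-01-27 15:16:48 ERROR [29965] Node 1, task provision_1, manifest /etc/puppet/shell_manifests/provision_1_manifest.pp, status: undefined
"""


def parse_ts(line):
    # Assumption: every line starts with "YYYY-MM-DD HH:MM:SS" (19 characters).
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def elapsed(log_text):
    # Time between the first and the last non-empty log line.
    lines = [line for line in log_text.splitlines() if line.strip()]
    return parse_ts(lines[-1]) - parse_ts(lines[0])


print(elapsed(LOG))  # 0:20:53
```

So from the start of the task to the final ERROR is 20 minutes 53 seconds, which matches the roughly 21 minutes mentioned above.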

Interesting detail: there is an error in the mcollective log on the node:
2017-01-27T15:15:36.255295+00:00 debug: 15:15:35.807222 #1693] DEBUG -- : runner.rb:54:in `block in run' PLMC6: Message does not pass filters, ignoring

Looking at the mcollective code (https://github.com/puppetlabs/marionette-collective/blob/master/lib/mcollective/runner.rb#L196), it looks like we can get this error if the node ID somehow changed to an unexpected value.
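
To illustrate the suspected mechanism, here is a simplified Python sketch of an identity filter. It is not mcollective's actual implementation (which is Ruby and supports more filter types). The function names, the message layout, and the identity "unexpected-id" are all hypothetical and exist only for this example:

```python
def passes_identity_filter(node_identity, identity_filter):
    # Hypothetical, simplified rule: an empty filter matches every node,
    # otherwise the node's own identity must be listed in the filter.
    if not identity_filter:
        return True
    return node_identity in identity_filter


def handle(node_identity, message):
    # A node silently ignores requests that are not addressed to it.
    if passes_identity_filter(node_identity, message["filter"]["identity"]):
        return "processed"
    return "ignored"


# The master addresses the request to node '1', as in the Astute log.
request = {"filter": {"identity": ["1"]}}

print(handle("1", request))              # processed
print(handle("unexpected-id", request))  # ignored
```

If the node's identity no longer matches what the master node sends, every request is dropped on the node side and the master only sees timeouts, which would fit the retries in the Astute log.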

Can you provide access to the environment where the error was reproduced?
