Comment 11 for bug 1356954

Vladimir Sharshov (vsharshov) wrote :

The host node was not heavily loaded in the last case.

We still believe that this problem is caused by the node being overloaded.

Why?

If we look at the mcollective log on the problem node, we can see a timeout in https://github.com/stackforge/fuel-astute/blob/master/mcagents/puppetd.rb#L261

2014-08-20T15:34:14.490127+01:00 debug: E, [2014-08-20T14:34:09.497417 #1143] ERROR -- : agent.rb:112:in `handlemsg' /usr/lib/ruby/1.8/timeout.rb:64:in `puppet_pid'
2014-08-20T15:34:14.490127+01:00 debug: /usr/share/mcollective/plugins/mcollective/agent/puppetd.rb:122:in `puppet_daemon_status'
2014-08-20T15:34:14.490127+01:00 debug: /usr/share/mcollective/plugins/mcollective/agent/puppetd.rb:100:in `set_status'
2014-08-20T15:34:14.490127+01:00 debug: /usr/share/mcollective/plugins/mcollective/agent/puppetd.rb:53:in `last_run_summary_action'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/rpc/agent.rb:88:in `send'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/rpc/agent.rb:88:in `handlemsg'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/agents.rb:126:in `dispatch'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/agents.rb:125:in `dispatch'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/agents.rb:121:in `initialize'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/agents.rb:121:in `new'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/agents.rb:121:in `dispatch'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/runner.rb:84:in `agentmsg'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/runner.rb:58:in `run'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/runner.rb:53:in `loop'
2014-08-20T15:34:14.490127+01:00 debug: /usr/lib/ruby/vendor_ruby/mcollective/runner.rb:53:in `run'
2014-08-20T15:34:14.490127+01:00 debug: /usr/sbin/mcollectived:52

What can we do?

- increase mcollective retries for puppet queries (Astute);
- decrease the frequency of puppet state reports via mcollective (Astute);
- increase Ubuntu node memory limits in CI (DevOps).
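
To illustrate the first option, here is a minimal sketch of retrying a query that may time out on a slow node. It is not Astute's actual code: the helper name with_retries, its parameters, and the simulated status query are all hypothetical, and only Ruby's standard Timeout module is used.

```ruby
require 'timeout'

# Hypothetical helper (not part of Astute or mcollective):
# re-run the block when it raises Timeout::Error, up to `attempts` times.
def with_retries(attempts, delay = 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Timeout::Error
    raise if tries >= attempts   # give up and re-raise after the last attempt
    sleep(delay)
    retry
  end
end

# Stand-in for a puppet status query: it times out on the first two
# attempts (simulating an overloaded node) and answers on the third.
calls = 0
status = with_retries(3) do |attempt|
  calls += 1
  Timeout.timeout(0.05) do
    sleep(0.2) if attempt < 3
    'stopped'
  end
end
```

With more retries a briefly overloaded node gets another chance to answer before the deployment is marked as failed, at the cost of a longer wait when the node is really down.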