Hi,
We deployed Fuel 7.0 along with LMA-Toolchain. Everything worked properly for about a month, then suddenly data stopped appearing on both the Grafana and Kibana dashboards.
The lma_collector service is running on all the nodes. Below are the logs from the monitoring node and the primary controller node:
1. Monitoring node:
root@node-112:~# tail /var/log/lma_collector.log
2016/02/14 07:17:55 Plugin 'aggregator_tcpoutput' error: writing to 192.168.0.2:5565: write tcp 192.168.0.2:5565: broken pipe
2016/02/14 07:17:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60036,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:18:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60001,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:18:55 Plugin 'aggregator_tcpoutput' error: writing to 192.168.0.2:5565: write tcp 192.168.0.2:5565: broken pipe
2016/02/14 07:19:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60002,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:20:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60001,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:20:55 Plugin 'aggregator_tcpoutput' error: writing to 192.168.0.2:5565: write tcp 192.168.0.2:5565: broken pipe
2016/02/14 07:21:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60001,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:22:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60001,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:23:55 Plugin 'elasticsearch_output' error: ElasticSearch server reported error within JSON: {"took":60001,"errors":true,"items":[{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 
1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}},{"index":{"_index":"log-2016.02.14","_type":"message","_id":null,"status":503,"error":"ProcessClusterEventTimeoutException[failed to process cluster event (create-index [log-2016.02.14], cause [auto(bulk api)]) within 1m]"}}]}
2016/02/14 07:23:55 Plugin 'aggregator_tcpoutput' error: writing to 192.168.0.2:5565: write tcp 192.168.0.2:5565: broken pipe
2. Primary controller node:
root@node-110:~# crm resource status lma_collector
resource lma_collector is running on: node-110.cdta.net
root@node-110:~# tail /var/log/lma_collector.log
2016/02/14 08:44:26
2016/02/14 08:44:56 Diagnostics: 9 packs have been idle more than 120 seconds.
2016/02/14 08:44:56 Diagnostics: (inject) Plugin names and quantities found on idle packs:
2016/02/14 08:44:56 Diagnostics: 39 packs have been idle more than 120 seconds.
2016/02/14 08:44:56 Diagnostics: (input) Plugin names and quantities found on idle packs:
2016/02/14 08:44:56 Diagnostics: influxdb_accumulator_filter: 9
2016/02/14 08:44:56
2016/02/14 08:44:56 Diagnostics: http_metrics_filter: 20
2016/02/14 08:44:56 Diagnostics: elasticsearch_output: 32
2016/02/14 08:44:56
Please help.
Thank you,
Hamza
Hi,
I managed to get the data back on both dashboards by restarting the hekad processes on all the nodes:
/etc/init.d/heka stop
/etc/init.d/heka start
But I still don't know the root cause of the problem.
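One way to narrow this down might be to check which plugin started reporting errors first and how often. Below is a minimal sketch in Python, assuming the log line format seen in the excerpts above (`<date> <time> Plugin '<name>' error: ...`). The function name `summarize_plugin_errors` is hypothetical and not part of Heka or the LMA toolchain.

```python
import re
from collections import Counter

# Matches lines such as:
#   2016/02/14 07:17:55 Plugin 'elasticsearch_output' error: ...
# The format is assumed from the log excerpts in this thread.
PLUGIN_ERROR = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Plugin '([^']+)' error: "
)


def summarize_plugin_errors(lines):
    """Return {plugin: (error_count, first_timestamp)} for plugin error lines."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = PLUGIN_ERROR.match(line)
        if match is None:
            # Wrapped continuation lines and "Diagnostics:" lines are skipped.
            continue
        timestamp, plugin = match.groups()
        counts[plugin] += 1
        # Keep only the earliest timestamp seen for each plugin.
        first_seen.setdefault(plugin, timestamp)
    return {plugin: (counts[plugin], first_seen[plugin]) for plugin in counts}


# Example usage on a node (path taken from the tail commands above):
#
#   with open("/var/log/lma_collector.log", errors="replace") as log:
#       for plugin, (count, first) in summarize_plugin_errors(log).items():
#           print(f"{plugin}: {count} errors, first at {first}")
```

If `elasticsearch_output` turns out to fail before `aggregator_tcpoutput`, that would point toward Elasticsearch rather than the aggregator connection, though this is only a hint and not a diagnosis.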
Regards,
Hamza