Write a diagnostics script

Bug #1547084 reported by Simon Pasquier
This bug affects 1 person
Affects      Status         Importance   Assigned to     Milestone
StackLight   Fix Released   Wishlist     Swann Croiset

Bug Description

It would be useful to have a utility script that reports the status of the various LMA services and collects the logs, the system metrics and the hardware characteristics of the node. This script should be installed on every deployed node during the deployment of the LMA collector, and it should be generic enough to support all StackLight roles.

In addition, a second script should be written. Executed from the Fuel node, it would run the first script on every node and gather all the data on the Fuel node.
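
As a rough illustration of the Fuel-side script, the sketch below loops over node names given as arguments, runs the per-node script over SSH and copies one archive back per node. The names REMOTE_SCRIPT, REMOTE_ARCHIVE and gather_from_nodes, along with both default paths, are assumptions made for this example; the report does not say how nodes are discovered or where the per-node script writes its archive.

```shell
#!/bin/sh
# Hypothetical paths: the report does not name the per-node script or its archive.
REMOTE_SCRIPT="${REMOTE_SCRIPT:-/usr/local/bin/lma_diagnostics}"
REMOTE_ARCHIVE="${REMOTE_ARCHIVE:-/tmp/lma_diagnostics.tar.gz}"

# gather_from_nodes DEST_DIR NODE...
# Runs the per-node script over SSH on each node, then copies its archive to
# DEST_DIR/NODE.tar.gz. A failing node is reported and the loop continues, so
# one unreachable node does not block the others.
# Returns non-zero if at least one node failed.
gather_from_nodes() {
    dest_dir="$1"; shift
    mkdir -p "$dest_dir" || return 1
    failed=0
    for node in "$@"; do
        if ssh -o BatchMode=yes "$node" "$REMOTE_SCRIPT" &&
           scp -q "$node:$REMOTE_ARCHIVE" "$dest_dir/$node.tar.gz"; then
            echo "collected: $node"
        else
            echo "failed: $node" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

It could be invoked as `gather_from_nodes /var/tmp/diag node-1 node-2`, assuming key-based SSH access from the Fuel node to each deployed node. BatchMode makes SSH fail instead of prompting for a password, which suits an unattended run.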

List of things to collect (more can be added of course):
- log files from the various StackLight services: /var/log/lma_collector.log, /var/log/collectd.log, /var/log/elasticsearch/*, /var/log/influxdb/* and /var/nagios/nagios.log.
- configuration directories for the various StackLight services.
- contents of /proc/cpuinfo
- output of the following commands (*):
  - uptime
  - dmesg | tail -100
  - vmstat 1 10
  - mpstat -P ALL 1 10
  - pidstat 1 10
  - iostat -xz 1 10
  - sar -n DEV 1 10
  - sar -n TCP,ETCP 1 10
  - lshw
  - df -h
  - pcs status
  - status of the StackLight services

(*) List inspired by http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html. The sysstat package would need to be installed first.
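
The command list above could be driven by a small wrapper that writes each command's output to its own file and keeps going when a tool is missing (for example when sysstat is not installed). This is only a sketch: OUT_DIR, run_cmd and collect_all are names invented for the example, and the default output directory is an assumption.

```shell
#!/bin/sh
# Hypothetical output directory; the report does not prescribe one.
OUT_DIR="${OUT_DIR:-/tmp/lma_diagnostics}"

# run_cmd NAME COMMAND...
# Stores stdout and stderr of COMMAND in $OUT_DIR/NAME.txt. A missing tool or
# a non-zero exit status is recorded in that file instead of aborting the run.
run_cmd() {
    name="$1"; shift
    mkdir -p "$OUT_DIR" || return 1
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$OUT_DIR/$name.txt" 2>&1 ||
            echo "exit status: $?" >>"$OUT_DIR/$name.txt"
    else
        echo "command not found: $1" >"$OUT_DIR/$name.txt"
    fi
}

# collect_all: run the commands from the list and copy /proc/cpuinfo.
collect_all() {
    run_cmd uptime uptime
    run_cmd dmesg sh -c 'dmesg | tail -100'
    run_cmd vmstat vmstat 1 10
    run_cmd mpstat mpstat -P ALL 1 10
    run_cmd pidstat pidstat 1 10
    run_cmd iostat iostat -xz 1 10
    run_cmd sar_dev sar -n DEV 1 10
    run_cmd sar_tcp sar -n TCP,ETCP 1 10
    run_cmd lshw lshw
    run_cmd df df -h
    run_cmd pcs_status pcs status
    cp /proc/cpuinfo "$OUT_DIR/cpuinfo.txt" 2>/dev/null || true
}
```

Calling `collect_all` runs the sampling commands one after another, so a full run takes roughly a minute. Log files, configuration directories and the status of the StackLight services are left out of this sketch.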

All the data should be put into an archive that the Fuel node (or something else) can retrieve later on.
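
One possible way to produce that archive is a single tar call over the directory holding the collected files. The function name make_archive and the archive naming scheme (node name plus timestamp) are assumptions made for this sketch.

```shell
#!/bin/sh
# make_archive SRC_DIR DEST_DIR
# Packs SRC_DIR into DEST_DIR/diagnostics-<nodename>-<timestamp>.tar.gz and
# prints the archive path on success. DEST_DIR should be outside SRC_DIR so
# the archive does not end up inside itself.
make_archive() {
    src_dir="$1"
    dest_dir="$2"
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="$dest_dir/diagnostics-$(uname -n)-$stamp.tar.gz"
    # -C keeps only the last path component of SRC_DIR inside the archive.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" &&
        echo "$archive"
}
```

Including the node name in the file name keeps archives from different nodes distinct once they are gathered in one place.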

Note that there's already a blueprint for tracking this feature [1] but this feature request is specifically for LMA.

[1] https://blueprints.launchpad.net/fuel/+spec/testing-utility-for-plugins

summary: - Write a utility script reporting the status of the LMA services
+ Write a utility script for helping with diagnosis
Éric Lemoine (elemoine) wrote : Re: Write a utility script for helping with diagnosis

I am not sure it is relevant but a diagnostics tool was recently developed by a Mirantis employee: https://github.com/adobdin/timmy.

description: updated
Changed in lma-toolchain:
milestone: none → 1.0.0
summary: - Write a utility script for helping with diagnosis
+ Write a diagnostics script
Changed in lma-toolchain:
milestone: 1.0.0 → 0.10.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (master)

Fix proposed to branch: master
Review: https://review.openstack.org/318079

Changed in lma-toolchain:
assignee: LMA-Toolchain Fuel Plugins (mos-lma-toolchain) → Swann Croiset (swann-w)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (master)

Reviewed: https://review.openstack.org/318079
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=373640672dbc5c145fb1539f6fbf2ff18c3f099f
Submitter: Jenkins
Branch: master

commit 373640672dbc5c145fb1539f6fbf2ff18c3f099f
Author: Swann Croiset <email address hidden>
Date: Tue May 17 17:35:08 2016 +0200

    Add a simple diagnostic script

    A script is installed on all nodes to collect various information and perform
    basic tests regarding LMA components.
    All information gathered are stored locally in /var/lma_diagnostics.

    From the Fuel master node, the contrib/tools/diagnostic.sh script launches the
    diagnostic script on all nodes and downloads all data into /var/lma_diagnostics.

    Fixes-bug: #1547084
    Change-Id: I37e36df23bc98109b7a86db63e5243cc264d2f95

Changed in lma-toolchain:
status: In Progress → Fix Committed
Changed in lma-toolchain:
status: Fix Committed → Fix Released