Add retracer health check metrics script

Bug #1799563 reported by David Lawson
This bug affects 1 person
Affects: Daisy
Status: Fix Released
Importance: Medium
Assigned to: Brian Murray
Milestone: (none)

Bug Description

When we did the initial deploy, I added a small script to all the retracers that telegraf calls to get health metrics for the various retracer processes. It would be great to get that rolled into either the retracer code itself or the charm. The script is very simple:

#!/bin/bash
# Emit up=1 if a retracer process for the given architecture is running, up=0 otherwise.
arch=$1
if ps aux | grep -v grep | grep -q "retracer-${arch}"; then
  echo "check=retracer,architecture=${arch} up=1"
else
  echo "check=retracer,architecture=${arch} up=0"
fi

It expects to be called with an architecture, e.g. retracer_check.sh amd64. I've also added a telegraf config, which it would be nice to have the charm drop in place when it gets a telegraf relation. Right now this is /etc/telegraf/telegraf.d/retracer.conf:

[[inputs.exec]]
  commands = [
    "/home/ubuntu/retracer_check.sh amd64",
    "/home/ubuntu/retracer_check.sh i386",
    "/home/ubuntu/retracer_check.sh arm64",
    "/home/ubuntu/retracer_check.sh armhf",
  ]
  timeout = "5s"
  data_format = "influx"

Obviously the paths to the scripts would change to reflect wherever that script ends up in the codebase.

David Lawson (deej)
tags: added: canonical-is
tags: added: id-5bd74d034d44ca24e6ca5510
Changed in daisy:
status: New → In Progress
assignee: nobody → Brian Murray (brian-murray)
importance: Undecided → Medium
Brian Murray (brian-murray) wrote :

I've added the following hook to the daisy retracer charm:

 $ ls -lh hooks/juju-info-relation-joined
-rwxrwxr-x 1 bdmurray bdmurray 369 Nov 8 14:08 hooks/juju-info-relation-joined
[ 3:33PM 10881 ] [ bdmurray@impulse:~/source-trees/daisy-plucker-charms/xenial/daisy-retracer ]
 $ cat hooks/juju-info-relation-joined
#!/bin/bash

. $(dirname $0)/common

CONF=/etc/telegraf/telegraf.d/retracer.conf
echo "[[inputs.exec]]\n" > $CONF
echo " commands = [\n" >> $CONF
for arch in ${ARCHITECTURES}; do
  echo "\"${CODE_LOCATION}/tools/retracer_check.sh $arch\",\n" >> $CONF
done
echo " ]\n" >> $CONF
echo " timeout = \"5s\"\n" >> $CONF
echo " data_format = \"influx\"\n" >> $CONF

However, it is never run on the unit despite the juju-info relation being joined.

ubuntu@juju-3a2dd5-stg-error-tracker-9:~$ sudo grep juju-info /var/log/juju/unit-retracer-app-0.log
2018-11-14 18:23:44 INFO juju.worker.uniter.relation relations.go:495 joining relation "telegraf-retracer-app:juju-info retracer-app:juju-info"
2018-11-14 18:23:45 INFO juju.worker.uniter.relation relations.go:531 joined relation "telegraf-retracer-app:juju-info retracer-app:juju-info"
...

Do you have any ideas about how I can get the telegraf configuration file set up the way you'd like?

David Lawson (deej) wrote :

Honestly, I'd suggest checking whether /etc/telegraf/ exists and, if it does, whether /etc/telegraf/telegraf.d/retracer.conf exists. If the file doesn't exist, write it and restart telegraf; if it does, just pass. Messing around with relations in bash is no fun. Really, we should sit down at some point and try to get this charm rewritten in reactive Python; I think you'll have an overall better experience with it.
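
A minimal sketch of that suggestion, not the code that landed in daisy or the charm. It assumes ARCHITECTURES and CODE_LOCATION are set by the charm's common file, as in the hook quoted earlier. The function name write_retracer_conf and the TELEGRAF_DIR and RESTART_CMD overrides are hypothetical and exist only to make the sketch easy to try out; restarting with `systemctl restart telegraf` assumes telegraf runs as a systemd unit of that name. It uses printf rather than echo because bash's builtin echo does not expand `\n` by default.

```shell
#!/bin/bash
# Sketch only: write the telegraf exec config once, if telegraf is installed.
# write_retracer_conf, TELEGRAF_DIR and RESTART_CMD are illustrative names.

write_retracer_conf() {
    local telegraf_dir="${TELEGRAF_DIR:-/etc/telegraf}"
    local conf="${telegraf_dir}/telegraf.d/retracer.conf"
    local arch

    # No telegraf on this unit: nothing to do.
    [ -d "${telegraf_dir}" ] || return 0
    # Config already in place: just pass.
    [ -e "${conf}" ] && return 0

    mkdir -p "${telegraf_dir}/telegraf.d"
    {
        printf '[[inputs.exec]]\n'
        printf '  commands = [\n'
        # One check command per architecture handled by this unit.
        for arch in ${ARCHITECTURES}; do
            printf '    "%s/tools/retracer_check.sh %s",\n' "${CODE_LOCATION}" "${arch}"
        done
        printf '  ]\n'
        printf '  timeout = "5s"\n'
        printf '  data_format = "influx"\n'
    } > "${conf}"

    # Restart telegraf so it picks up the new file (assumed systemd unit name).
    ${RESTART_CMD:-systemctl restart telegraf}
}
```

An existing hook that already runs on the unit could source the charm's common file and then call write_retracer_conf; because the function returns early when the file is already present, calling it repeatedly is harmless.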

Junien F (axino) wrote :

You may just need to add juju-info to retracer-app's metadata.yaml?

Brian Murray (brian-murray) wrote :

I've gone ahead and followed deej's suggestion; the check can be found in daisy revision number 808 and retracer-charm version 192.

Brian Murray (brian-murray) wrote :

Given that production is on daisy revision number 809, I'm going to set this to Fix Released.

Changed in daisy:
status: In Progress → Fix Released