check_status_file nagios check marks services as not running when they are

Bug #1631170 reported by Dustin
This bug affects 3 people
Affects: Charm Helpers
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

charmhelpers installs /usr/local/lib/nagios/plugins/check_status_file.py, which looks for the string "is running" in the service status output. On a brand new juju-deployed system, the status output looks like this:

sudo service apache2 status
● apache2.service - LSB: Apache2 web server
   Loaded: loaded (/etc/init.d/apache2; bad; vendor preset: enabled)
  Drop-In: /lib/systemd/system/apache2.service.d
           └─apache2-systemd.conf
   Active: active (running) since Thu 2016-10-06 04:04:22 UTC; 17h ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 54
   Memory: 6.3M
      CPU: 25.849s
   CGroup: /system.slice/apache2.service
           ├─ 442 /usr/sbin/apache2 -k start
           ├─10073 /usr/sbin/apache2 -k start
           └─10074 /usr/sbin/apache2 -k start

Oct 06 04:04:33 juju-82d66b-0-lxd-0 apache2[2530]: *
Oct 06 04:04:33 juju-82d66b-0-lxd-0 systemd[1]: Reloaded LSB: Apache2 web server.
Oct 06 04:20:47 juju-82d66b-0-lxd-0 systemd[1]: Reloading LSB: Apache2 web server.
Oct 06 04:20:47 juju-82d66b-0-lxd-0 apache2[8415]: * Reloading Apache httpd web server apache2
Oct 06 04:20:47 juju-82d66b-0-lxd-0 apache2[8415]: *
Oct 06 04:20:47 juju-82d66b-0-lxd-0 systemd[1]: Reloaded LSB: Apache2 web server.
Oct 06 04:20:56 juju-82d66b-0-lxd-0 systemd[1]: Reloading LSB: Apache2 web server.
Oct 06 04:20:56 juju-82d66b-0-lxd-0 apache2[10057]: * Reloading Apache httpd web server apache2
Oct 06 04:20:56 juju-82d66b-0-lxd-0 apache2[10057]: *
Oct 06 04:20:56 juju-82d66b-0-lxd-0 systemd[1]: Reloaded LSB: Apache2 web server.

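The systemd output reports "active (running)" and never contains the string "is running", so the nagios check fails even though the service is up. Affected charms are glance, nova, keystone, horizon, neutron, and basically anything running haproxy or apache2.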

The output of a juju deployed nagios is:
CRITICAL 2016-10-06 21:33:22 0d 17h 38m 57s 4/4 /etc/init.d/apache2 CRITICAL - ● apache2.service - LSB: Apache2 web server
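
To illustrate why the pattern match misses, here is a minimal Python sketch. It is not the plugin's actual code: the function name looks_running and the trimmed status text are assumptions made for the example. It only shows that a plain substring search for "is running" does not match the systemd output above, while "active (running)" does.

```python
# Hypothetical illustration only; this is not the code shipped by charmhelpers.
# The status text is a trimmed copy of the systemd output quoted in this report.
SYSTEMD_STATUS = """\
● apache2.service - LSB: Apache2 web server
   Loaded: loaded (/etc/init.d/apache2; bad; vendor preset: enabled)
   Active: active (running) since Thu 2016-10-06 04:04:22 UTC; 17h ago
"""


def looks_running(status_output, pattern="is running"):
    """Return True if the pattern occurs anywhere in the status output."""
    return pattern in status_output


# The sysvinit-era pattern is absent from systemd-style output.
pattern_hit = looks_running(SYSTEMD_STATUS)
# The systemd wording is present, but matching on wording stays fragile.
systemd_hit = looks_running(SYSTEMD_STATUS, pattern="active (running)")

print(pattern_hit, systemd_hit)
```

Matching on output wording breaks whenever the init system changes its phrasing, which is the failure described in this report.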

Benjamin Kaehne (ben-kaehne) wrote :

The cron entry is missing the -e flag:
*/5 * * * * root /usr/local/lib/nagios/plugins/check_exit_status.pl -s /etc/init.d/mysql status > /var/lib/nagios/service-check-mysql.txt

should be:
*/5 * * * * root /usr/local/lib/nagios/plugins/check_exit_status.pl -e -s /etc/init.d/mysql status > /var/lib/nagios/service-check-mysql.txt

tags: added: canonical-bootstack
Benjamin Kaehne (ben-kaehne) wrote :

As per the script:

-e
 This is the "exitstatus" flag, it means check the exit status
 code instead of looking for a pattern in the output of the script.

Therefore we should use "-e". A merge proposal has been submitted:
https://code.launchpad.net/~ben-kaehne/charm-helpers/nagios-status-fix/+merge/313918
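
For comparison, a rough Python sketch of what an exit-status-based check does. This is not the Perl script itself: the function name check_by_exit_status and the message strings are made up for the example. It assumes the status command follows the usual init script convention of exiting 0 when the service is running and non-zero otherwise. The demo calls the Python interpreter as a stand-in command so the sketch runs anywhere.

```python
import subprocess
import sys


def check_by_exit_status(command):
    """Return a Nagios-style (code, message) pair from the exit status alone.

    Hypothetical helper: the output text of the command is ignored entirely.
    """
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    if result.returncode == 0:
        return 0, "OK - exit status 0"
    return 2, "CRITICAL - exit status %d" % result.returncode


# Stand-in commands: one exits 0 with arbitrary output, one exits 3
# (the status an init script conventionally returns for a stopped service).
ok_result = check_by_exit_status(
    [sys.executable, "-c", "print('active (running)')"]
)
bad_result = check_by_exit_status(
    [sys.executable, "-c", "import sys; sys.exit(3)"]
)

print(ok_result)
print(bad_result)
```

On a real unit the command would be something like ["/etc/init.d/mysql", "status"], as in the cron entry above.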

Stuart Bishop (stub)
Changed in charm-helpers:
status: New → Fix Committed
Changed in charm-helpers:
status: Fix Committed → Fix Released