Charm keeps restarting nagios-nrpe-server

Bug #1727084 reported by Haw Loeung
This bug affects 2 people
Affects         Status         Importance   Assigned to   Milestone
Charm Helpers   Fix Released   Undecided    Haw Loeung
NRPE Charm      Invalid        Undecided    Unassigned

Bug Description

Hi,

As seen by a few others and myself, the charm keeps restarting nagios-nrpe-server. This is mostly triggered by the frequently firing 'update-status' hook, but it also happens when any of the other hooks fire. /var/log/syslog looks like this:

| Oct 24 03:42:14 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[19505]: Caught SIGTERM - shutting down...
| Oct 24 03:42:14 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[20260]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 03:42:14 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[20268]: Starting up daemon
| Oct 24 03:47:22 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[20268]: Caught SIGTERM - shutting down...
| Oct 24 03:47:22 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[21056]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 03:47:22 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[21062]: Starting up daemon
| Oct 24 03:51:56 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[21062]: Caught SIGTERM - shutting down...
| Oct 24 03:51:56 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[21771]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 03:51:56 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[21777]: Starting up daemon
| Oct 24 03:57:42 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[21777]: Caught SIGTERM - shutting down...
| Oct 24 03:57:42 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[22953]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 03:57:42 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[22959]: Starting up daemon
| Oct 24 04:02:45 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[22959]: Caught SIGTERM - shutting down...
| Oct 24 04:02:45 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[23927]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 04:02:45 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[23933]: Starting up daemon
| Oct 24 04:07:45 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[23933]: Caught SIGTERM - shutting down...
| Oct 24 04:07:45 juju-b4bda0-prod-is-prometheus-XXX-4 nagios-nrpe-server[24831]: * Starting nagios-nrpe nagios-nrpe
| Oct 24 04:07:45 juju-b4bda0-prod-is-prometheus-XXX-4 nrpe[24837]: Starting up daemon

This causes occasional monitoring alerts because during restarts, nrpe checks usually return "CRITICAL;HARD;3;(Return code of 255 is out of bounds)".

We've fixed this in the older nrpe-external-master charm, but it should be fixed here too. The service should only restart when there are changes to nrpe.cfg (/etc/nagios).
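
To illustrate the behaviour being asked for, here is a minimal sketch of "restart only when the config changed". It is not the actual charm-helpers change: the function names (config_digest, restart_if_changed) and the write_config/restart callbacks are hypothetical, and it assumes a hook can hash the files under the config directory before and after it rewrites them.

```python
import hashlib
from pathlib import Path


def config_digest(paths):
    """Return a SHA-256 digest over the names and contents of the given files."""
    h = hashlib.sha256()
    for name in sorted(str(p) for p in paths):
        h.update(name.encode())
        h.update(b"\0")
        p = Path(name)
        if p.is_file():
            h.update(p.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def restart_if_changed(config_dir, write_config, restart):
    """Run write_config(), then call restart() only if files under config_dir changed.

    write_config and restart are hypothetical callbacks supplied by the hook;
    in a real charm, restart would restart nagios-nrpe-server.
    Returns True if a restart was triggered, False otherwise.
    """
    cfg = Path(config_dir)
    before = config_digest(cfg.rglob("*"))
    write_config()
    after = config_digest(cfg.rglob("*"))
    if before != after:
        restart()
        return True
    return False
```

With something like this, an 'update-status' hook that rewrites identical content would leave the digest unchanged and skip the restart, so the SIGTERM/startup pairs in the syslog above should only appear after a real config change.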

Haw Loeung (hloeung)
description: updated
Haw Loeung (hloeung) wrote :

This has been brought up in charms.reactive - https://github.com/juju-solutions/charms.reactive/issues/139

Haw Loeung (hloeung)
Changed in charm-helpers:
status: New → In Progress
assignee: nobody → Haw Loeung (hloeung)
Changed in nrpe-charm:
status: New → Invalid
Haw Loeung (hloeung) wrote :

There is also a workaround in charm-helpers for now - https://github.com/juju/charm-helpers/pull/36

Haw Loeung (hloeung)
Changed in charm-helpers:
status: In Progress → Fix Committed
Haw Loeung (hloeung)
Changed in charm-helpers:
status: Fix Committed → Fix Released