monitors config option not applied with external nagios master

Bug #1687116 reported by Billy Olsen
This bug affects 4 people
Affects: NRPE Charm
Status: Fix Released
Importance: High
Assigned to: Matt Rae

Bug Description

It appears that the 'monitors' config option is only applied when the monitors relation is present, making it impossible to combine an externally defined nagios master with user-specified monitors.

Moving get_user_defined_monitors from nrpe_helpers.MonitorsRelation to nrpe_helpers.PrincipleRelation should make this a more generally applicable config option.


Paul Gear (paulgear)
Changed in nrpe-charm:
importance: Undecided → Medium
importance: Medium → High
Paul Gear (paulgear) wrote :

The next step is to reproduce the symptoms. I discussed this with @billy-olsen and he's going to look for a test case or example deployment.

Changed in nrpe-charm:
status: New → Incomplete
Doug Parrish (dparrish) wrote :

I have been working with Billy on this request. The idea is that YAML like the following could be supplied in the "monitors" config option, or in a similar new option:

nrpe:
    ctl_check_procs:
        command: /usr/local/nagios/libexec/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$
    ctl_check_users:
        command: /usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
    ctl_check_load:
        command: /usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
    ctl_check_disk:
        command: /usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$

 .. which could yield

/etc/nagios/nrpe.d/ctl_check_procs.cfg:
command[ctl_check_procs]=/usr/local/nagios/libexec/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$

/etc/nagios/nrpe.d/ctl_check_users.cfg:
command[ctl_check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$

/etc/nagios/nrpe.d/ctl_check_load.cfg:
command[ctl_check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$

/etc/nagios/nrpe.d/ctl_check_disk.cfg:
command[ctl_check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$

 .. or, could yield

/etc/nagios/nrpe.d/charm-monitors-option.cfg:
command[ctl_check_procs]=/usr/local/nagios/libexec/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$
command[ctl_check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
command[ctl_check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
command[ctl_check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$

Note that in the above sample, /usr/local/nagios/libexec/ is not normally a directory that exists on an Ubuntu 16.04 system. It would seem reasonable for the charm to create the config entries regardless, possibly logging a warning (in the unit agent log?) if the command file doesn't exist. The charm could empower the user to supply not only custom entries but also custom command files, e.g. in a tarball. That would entail a new option to specify the tarball, which the charm would then distribute to each unit and untar.
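As a rough illustration of the single-file variant above, here is a minimal Python sketch. It is not the charm's code: the function name render_nrpe_commands and the CHECKS dict are hypothetical, and the dict stands in for whatever the charm would get after parsing the YAML from the config option. It writes nothing to disk; it only builds the file text and collects a warning for each plugin path that does not exist.

```python
import os
import shlex

# Hypothetical parsed form of the YAML under the "nrpe:" key above.
CHECKS = {
    "ctl_check_users": {
        "command": "/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$",
    },
    "ctl_check_load": {
        "command": "/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$",
    },
}


def render_nrpe_commands(checks):
    """Return (config_text, warnings) for one combined nrpe.d file."""
    lines = []
    warnings = []
    for name in sorted(checks):
        command = checks[name]["command"]
        lines.append("command[{}]={}".format(name, command))
        # The first token of the command line is the plugin executable.
        plugin = shlex.split(command)[0]
        if not os.path.isfile(plugin):
            # Warn instead of failing, as suggested in the comment above.
            warnings.append("plugin not found: {}".format(plugin))
    return "\n".join(lines) + "\n", warnings
```

The returned text could then be written to a file such as /etc/nagios/nrpe.d/charm-monitors-option.cfg, and the warnings sent to the unit log.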

Matt Rae (mattrae) wrote :

I was able to investigate and reproduce this issue. When configuring monitors=<monitor syntax> and also setting export_nagios_definitions=True, I expect the nrpe charm to write nagios definitions for those monitors to /var/lib/nagios/export. So far I am not seeing any nagios definitions written to /var/lib/nagios/export for the monitors, even when there is a relation with nagios and I can see the monitors in the nagios dashboard.

It appears you're requesting a way to define custom nrpe hooks via a config option. I agree that this should be a new config option and a new feature.

So there are two issues in this bug:
1. When using monitors=<monitor syntax> and export_nagios_definitions=True, the charm does not write nagios definitions to /var/lib/nagios/export as expected.

2. A request for a new feature to configure custom nrpe hooks via a charm config option.

Changed in nrpe-charm:
status: Incomplete → Confirmed
Doug Parrish (dparrish) wrote :

Just to make a distinction, since nrpe is both a charm and a sub-system of Nagios: the customer is seeking custom nrpe "checks", as opposed to "hooks", which are a construct of the charm. A check is called by the Nagios server. Letting Juju manage customer-specific NRPE checks will discourage the user from managing NRPE via both Juju and something else like Ansible.

Seyeong Kim (seyeongkim) wrote :

I'm experimenting with the charm code, but I couldn't find any code that writes service (check) definitions to /var/lib/nagios/export/.

I'm adding code that collects the monitor info and writes it to /var/lib/nagios/export/ as in [0] below (this is only a draft), with an rsync on the nagios server as in [1].

I think I can make this work somehow, but I'm not sure this idea is right. Also, this looks like adding a feature rather than fixing a bug.

[0]
define service {
        check_command {{ command }}
        service_description {{ nagios_service_description }}
        use generic-service
        host_name {{ nagios_hostname }}
}

[1]
rsync -avzh root@10.0.11.111::external-nagios /etc/nagios3/conf.d/
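
For illustration only, the draft template in [0] could be filled in along these lines. This is a minimal sketch using plain str.format instead of the charm's own templating; the function name render_service_definition and the example values in the usage note are hypothetical.

```python
# Literal braces are doubled so that str.format leaves them in the output.
SERVICE_TEMPLATE = """define service {{
        check_command {command}
        service_description {description}
        use generic-service
        host_name {hostname}
}}
"""


def render_service_definition(command, description, hostname):
    """Return one nagios service definition as text (hypothetical helper)."""
    return SERVICE_TEMPLATE.format(
        command=command,
        description=description,
        hostname=hostname,
    )
```

For example, render_service_definition("check_nrpe!check_load", "system load", "juju-unit-0") returns a block that could be written to a file under /var/lib/nagios/export/ for the rsync in [1] to pick up. The command, description, and host name in that call are made-up values.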

Paul Gear (paulgear) wrote :

@xtrusia: Better to use the NRPE module in charmhelpers than implement it yourself: http://bazaar.launchpad.net/~charm-helpers/charm-helpers/devel/view/head:/charmhelpers/contrib/charmsupport/nrpe.py

Seyeong Kim (seyeongkim) wrote :

@paulgear, I found out why I couldn't find the "define service" part in the nrpe charm: the contrib part of charmhelpers is not included in the nrpe charm.

I tried add-relation between nrpe-external-master and ubuntu, for example, but it failed. (I only tried this because the docs say it is going to be merged into nrpe.)

Do I need to use nrpe-external-master, or should I enhance nrpe to do the external-master job?

Matt Rae (mattrae) wrote :

In investigating this issue I've found that local monitors were not getting added in /etc/nagios/nrpe.d on each nrpe host.

See the code I'm working on here:
https://git.launchpad.net/~mattrae/nrpe-charm/diff/

The nrpe charm converts local monitors in the monitors dict to remote monitors, but only local monitors are added as checks in /etc/nagios/nrpe.d. The result is that nagios's nrpe check fails, because the check isn't configured in /etc/nagios/nrpe.d on the nrpe host.

As a solution I added the local monitors back to the monitors dict, which results in the checks being configured in /etc/nagios/nrpe.d, but also adds a duplicate check in the nagios dashboard.

I'm not sure of the best way to solve this problem.

I'm also toying with adding support for a custom check type that can be used in the monitors config option. For example, a monitors YAML like the following could be used:

juju config nrpe monitors="
monitors:
  local:
    custom:
      check_load:
        check: check_load
        params: '-w 4 -c 10'
        desc: 'system load'
        plugin_path: '/usr/local/nagios/libexec'
"
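
One way such a custom check type might be turned into nrpe command lines is sketched below. This is an assumption about how the proposed feature could behave, not code from the branch linked above: the function name custom_check_lines and the MONITORS dict (a hand-written equivalent of the YAML above) are hypothetical.

```python
import posixpath

# Hypothetical parsed form of the monitors YAML shown above.
MONITORS = {
    "monitors": {
        "local": {
            "custom": {
                "check_load": {
                    "check": "check_load",
                    "params": "-w 4 -c 10",
                    "desc": "system load",
                    "plugin_path": "/usr/local/nagios/libexec",
                },
            },
        },
    },
}


def custom_check_lines(monitors):
    """Build one nrpe command line per entry under monitors/local/custom."""
    custom = monitors.get("monitors", {}).get("local", {}).get("custom", {})
    lines = []
    for name, spec in sorted(custom.items()):
        # Join the plugin directory and the check executable name.
        plugin = posixpath.join(spec["plugin_path"], spec["check"])
        lines.append("command[{}]={} {}".format(name, plugin, spec["params"]))
    return lines
```

With the MONITORS dict above, this yields a single line, command[check_load]=/usr/local/nagios/libexec/check_load -w 4 -c 10, which would belong in a file under /etc/nagios/nrpe.d on the nrpe host.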

If anybody has an idea about why the local monitors are not being configured properly, let me know. To reproduce the bug, add a local monitor in the monitors config option, then verify that the check is failing in the nagios dashboard and that no check is configured in /etc/nagios/nrpe.d on the nrpe node. An example monitor to reproduce the bug:

juju config nrpe monitors="
monitors:
  local:
    procrunning:
      rsyslogd:
        min: 1
        max: 1
        executable: rsyslogd
"

Matt Rae (mattrae)
Changed in nrpe-charm:
assignee: nobody → Matt Rae (mattrae)
Felipe Reyes (freyes) wrote :
Xav Paice (xavpaice)
Changed in charm-nrpe:
status: Confirmed → Fix Released