Nagios not reloading due to config errors after dedupe patch

Bug #1956541 reported by Paul Goins
This bug affects 1 person
Affects: Nagios Charm
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

I've found on one particular environment that the new nagios charm's dedupe logic isn't quite working as intended.

Environment before upgrade was:
* Dupes observed across ceph-osd and nova-compute-kvm. Both are deployed to the same metals, and each app has its own nrpe subordinate.
* Upgraded nagios first to cs:nagios-46, waited for it to settle, then upgraded the nrpe apps to cs:nrpe-75. Not sure re: original revisions, but they were before dedupe logic was added to both.

Expected behavior: multiple entries for the same host, each with a unique prefix.

Observed behavior:
* Some entries with the old host entry, some additional entries with the prefix.
* Some records list a parent with the prefixed ID, but the parent record doesn't appear to exist on disk, so nagios refuses to reload. (As a side effect, hooks take a while to rerun because they wait for a nagios reload that never succeeds.)
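
To confirm the second symptom on an affected unit, one option is to scan the generated host definitions for `parents` values that name no defined host. The following is a minimal sketch, not part of the charm: the function names and the sample host names are hypothetical, and the parser only handles simple `define host { ... }` blocks. Nagios's own config verification (`-v`) remains the authoritative check.

```python
import re

# Matches "define host { ... }" blocks in Nagios object config text.
# "define hostgroup" and similar are not matched.
HOST_BLOCK = re.compile(r"define\s+host\s*\{(.*?)\}", re.DOTALL)


def parse_hosts(config_text):
    """Return {host_name: [parent, ...]} for every host block in the text."""
    hosts = {}
    for block in HOST_BLOCK.findall(config_text):
        directives = {}
        for line in block.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        name = directives.get("host_name")
        if name is None:
            continue  # blocks without host_name (e.g. templates) are skipped
        raw_parents = directives.get("parents", "")
        hosts[name] = [p.strip() for p in raw_parents.split(",") if p.strip()]
    return hosts


def find_dangling_parents(config_text):
    """Return sorted (host, missing_parent) pairs for undefined parents."""
    hosts = parse_hosts(config_text)
    return sorted(
        (host, parent)
        for host, parents in hosts.items()
        for parent in parents
        if parent not in hosts
    )


if __name__ == "__main__":
    # Hypothetical sample: the parent carries a prefix no host defines.
    sample = """
    define host {
        host_name   juju-host-1
        parents     prefix-a-juju-host-1
    }
    define host {
        host_name   prefix-b-juju-host-1
    }
    """
    for host, parent in find_dangling_parents(sample):
        print(f"{host}: parent {parent!r} is not defined")
```

To use it against a real unit, concatenate the contents of the generated host config files and pass the text to `find_dangling_parents`. The location of those files depends on the Nagios version and charm revision, so it is left out here.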

Workaround: wait for all the hooks to run (even if they take a while), then run the rewrite-peer-config action to do a clean rewrite of the config.

Paul Goins (vultaire) wrote :

Note: after running rewrite-peer-config (and without checking the nagios status beforehand, unfortunately), the duplicate records disappeared - instead, I have merged host records with no duplicate prefixes. Thus, whatever happened was likely a side effect of in-between state. Definitely a bug, but the rewrite-peer-config appears to have got things to a good state in the end.

Paul Goins (vultaire) wrote :

Also worth noting: trying to remove the same nagios unit (needed to redeploy it for proper tracking in MAAS) seems to trigger this bug as well. So, I don't think it's purely a transitional bug.

Eric Chen (eric-chen) wrote :

This charm is no longer being actively maintained, and there has been no further update on this issue for over a year. Therefore, we will close it.
Please consider using the new Canonical Observability Stack instead (https://charmhub.io/topics/canonical-observability-stack).

Changed in charm-nagios:
status: New → Won't Fix