NRPE check for ceph-osd fails with: File '/var/lib/nagios/ceph-osd-checks' doesn't exist
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceph OSD Charm | Fix Committed | Medium | Unassigned |
Bug Description
The NRPE check removes the temporary output file created by the collector cronjob, which introduces a race condition that will make the check fail if the file is not present. While the commit where the check was added (faefe90ce6beb5
[0] https:/
[1] https:/
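One way to avoid the race described above is for the check to read the collector's output without removing it, and to use the file's modification time to detect a collector that has stopped running. The following is only a minimal sketch under stated assumptions, not the charm's actual fix: the function name `check_status_file`, the `max_age_seconds` threshold, and the file format (one problem per non-empty line) are all hypothetical.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_status_file(path, max_age_seconds=600, now=None):
    """Return (exit_code, message) without deleting the collector's file.

    Hypothetical helper: the name, threshold, and file format are
    assumptions made for illustration only.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
        with open(path) as handle:
            lines = [line.strip() for line in handle if line.strip()]
    except OSError as exc:
        # Missing or unreadable file: report UNKNOWN rather than crash.
        return UNKNOWN, "UNKNOWN: cannot read {}: {}".format(path, exc)

    if age > max_age_seconds:
        # The collector has not refreshed the file recently.
        return WARNING, "WARNING: {} is {:.0f}s old".format(path, age)
    if lines:
        # Assumed format: each non-empty line describes one problem.
        return CRITICAL, "CRITICAL: " + "; ".join(lines)
    return OK, "OK: all OSD services reported healthy"


if __name__ == "__main__":
    code, message = check_status_file("/var/lib/nagios/ceph-osd-checks")
    print(message)
    raise SystemExit(code)
```

Because the file is left in place, two NRPE runs between collector runs see the same data, and a stale file is reported explicitly instead of surfacing as a missing-file error.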
Changed in charm-ceph-osd:
status: Triaged → In Progress
Agreed that this race can lead to false positives, which cause alert fatigue and can ultimately render a check useless. (As a side note, I can't help but wonder whether there's an underlying environment issue if the check is failing for hours on end.)