local nagios disk monitor doesn't ignore /snap mounts with 0 inodes free

Bug #1807457 reported by Drew Freiberger
This bug affects 1 person
Affects       Status   Importance  Assigned to  Milestone
NRPE Charm    Invalid  Undecided   Unassigned
Nagios Charm  Invalid  Undecided   Unassigned

Bug Description

Nagios plugins on bionic shouldn't detect disk full on /snap mounts.

Drew Freiberger (afreiberger) wrote :

Error state from check_disk on nagios unit.

DISK CRITICAL - free space: /snap/core/5742 0 MB (0% inode=0%): /snap/amazon-ssm-agent/784 0 MB (0% inode=0%)

Drew Freiberger (afreiberger) wrote :

It appears that this is actually a result of having amazon-ssm-agent snap installed, which I believe means this is specific to AWS cloud monitoring.

Drew Freiberger (afreiberger) wrote :

aws-integrator is the charm that brings that snap into play; however, this is related to a prior bug on charm-nrpe about ignoring snaps by default. I believe it's an upstream nagios package bug.

Drew Freiberger (afreiberger) wrote :

Also happens with canonical-livepatch charm which uses a snap.

Drew Freiberger (afreiberger) wrote :

Ultimately, we should update the upstream nagios plugin code for check_disk to always ignore loop mounts of the specific type that snaps use (we don't want to ignore /snap upstream in case a user mounts a non-snapd filesystem under /snap). This would help the wider community with snap adoption.

Haw Loeung (hloeung) wrote :

That's shipped out by the nrpe charm, isn't it?

Changed in nagios-charm:
status: New → Invalid
Drew Freiberger (afreiberger) wrote :

Haw, this is specifically the nagios-charm local monitoring config.

/etc/nagios3/localhost_nagios2.cfg

define service{
        use generic-service ; Name of service template to use
        host_name bootstack-jagex-nagios/0
        service_description Disk Space
        check_command check_all_disks!20%!10%
        }

That in turn calls the command defined in /etc/nagios-plugins/config/disk.cfg:

# 'check_all_disks' command definition
define command{
        command_name check_all_disks
        command_line /usr/lib/nagios/plugins/check_disk -w '$ARG1$' -c '$ARG2$' -e
        }

I've found that adding -X squashfs to this check skips snap squashfs mounts, but the command definition in the default nagios package doesn't provide a $ARG3$ for additional options.

So we'd need the nagios charm to write out a new disk.cfg that defines a new command, check_all_disks_with_added_args, and then update the charm's template for localhost_nagios2.cfg to call that command with -X squashfs as the third argument.
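A minimal sketch of what that proposal could look like, not what the charm actually ships. The command name check_all_disks_with_added_args comes from the comment above; leaving $ARG3$ unquoted is an assumption, chosen so that "-X squashfs" reaches check_disk as two separate words.

```
# disk.cfg (sketch): same as check_all_disks, plus a third macro for extra options
define command{
        command_name check_all_disks_with_added_args
        command_line /usr/lib/nagios/plugins/check_disk -w '$ARG1$' -c '$ARG2$' -e $ARG3$
        }

# localhost_nagios2.cfg (sketch): pass the filesystem-type exclusion as the third argument
define service{
        use generic-service ; Name of service template to use
        host_name bootstack-jagex-nagios/0
        service_description Disk Space
        check_command check_all_disks_with_added_args!20%!10%!-X squashfs
        }
```

Keeping the stock check_all_disks definition untouched means a package upgrade that rewrites the default disk.cfg would not silently drop the exclusion, provided the new command lives in a file the charm owns.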

Drew Freiberger (afreiberger) wrote :

I was able to put a temporary fix in place by adding -X squashfs to the check_all_disks command line in /etc/nagios-plugins/config/disk.cfg.
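For reference, the edited definition would look roughly like this. It is a sketch based on the stock definition quoted earlier in this thread; the exact position of the added option on the line is an assumption.

```
# 'check_all_disks' command definition, with squashfs (snap) mounts excluded
define command{
        command_name check_all_disks
        command_line /usr/lib/nagios/plugins/check_disk -w '$ARG1$' -c '$ARG2$' -e -X squashfs
        }
```

Because this edits a file shipped by the nagios plugins package, the change is likely to be lost or to raise a conffile prompt on upgrade, which is why it is only a temporary fix.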

Xav Paice (xavpaice) wrote :

Current versions of the charm are OK now without needing the extra switch.

Changed in charm-nrpe:
status: New → Invalid