aggregated AFD are sent to Nagios but not displayed

Bug #1664604 reported by Swann Croiset
This bug affects 2 people
Affects     Status     Importance  Assigned to     Milestone
StackLight  Confirmed  High        Simon Pasquier

Bug Description

All AFD that are evaluated against already aggregated metrics are sent to Nagios, but the corresponding services are not configured in Nagios.

In the Nagios log, entries like the following appear continuously:

[2017-02-14 14:22:12] Warning: Passive check result was received for service 'keystone-public-api-check.vip' on host 'node-7', but the service could not be found!

This concerns all AFD related to metrics listed in metrics.yaml (https://github.com/openstack/fuel-plugin-lma-collector/blob/master/deployment_scripts/puppet/modules/fuel_lma_collector/templates/metrics.yaml.erb):

cinder-api-check.vip
cinder-scheduler.workers
cinder-v2-api-check.vip
cinder-volume.workers
glance-api-check.vip
heat-api-check.vip
heat-cfn-api-check.vip
influxdb-api-check.vip
keystone-public-api-check.vip
neutron-api-check.vip
neutron-dhcp.workers
neutron-l3.workers
neutron-metadata.workers
neutron-openvswitch.workers
nova-api-check.vip
nova-cert.workers
nova-compute.workers
nova-conductor.workers
nova-consoleauth.workers
nova-free-memory.nova-free-memory
nova-free-vcpu.nova-free-vcpu
nova-scheduler.workers
rabbitmq-cluster.pacemaker
swift-api-check.vip
swift-s3-api-check.vip
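
To check which of these services are affected on a given deployment, the warnings can be reduced to a list of unique service names. The following is a minimal sketch, not part of the original report: the function name missing_services is hypothetical, and it assumes the warning text matches the log entry quoted above.

```shell
# Print the unique service names found in Nagios
# "service could not be found" warnings read from stdin.
# Lines that do not match the warning format are ignored.
missing_services() {
  sed -n "s/.*Passive check result was received for service '\([^']*\)' on host '[^']*', but the service could not be found.*/\1/p" \
    | sort -u
}
```

It could be run as `missing_services < /var/log/nagios3/nagios.log`, where the log path is an assumption and may differ on your nodes.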

Workaround
==========

On the infrastructure-alerting nodes, create /etc/rsyslog.d/00-discard-nagios-passive-checks.conf with the following content:

  # Discard Nagios messages about unsolicited passive check results
  :msg, contains, "Warning: Passive check result was received for service" ~

Then restart rsyslogd:

 /etc/init.d/rsyslogd restart

Denis Klepikov (dklepikov) wrote :
tags: added: customer-found support
Denis Klepikov (dklepikov) wrote :

Reproduced on Fuel 9.2 with StackLight 1.0.

Deploy Fuel 9.0, upgrade it to 9.2, install the StackLight (SL) plugins, and deploy the environment.
Then check that alarming.yaml and clusters.yaml are identical on the controllers and the SL nodes:

for i in $(fuel nodes | grep controller | awk '{print $1}'); do ssh node-$i 'md5sum /etc/hiera/override/alarming.yaml; md5sum /etc/hiera/override/clusters.yaml'; done

for i in $(fuel nodes | grep 'infrastructure_alerting' | awk '{print $1}'); do ssh node-$i 'md5sum /etc/hiera/override/alarming.yaml; md5sum /etc/hiera/override/clusters.yaml'; done
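
Comparing the checksums by eye gets error-prone with many nodes. As a rough sketch (the helper name inconsistent_files is hypothetical, and it assumes file paths contain no spaces), the combined md5sum output of both loops can be piped through a filter that prints only the files whose checksum differs between nodes:

```shell
# Read md5sum output lines ("<checksum>  <path>") collected from several
# nodes on stdin and print each path that has more than one distinct checksum.
inconsistent_files() {
  # sort -u collapses identical checksum/path pairs, so any path that
  # still appears more than once has diverging content somewhere.
  sort -u \
    | awk '{count[$2]++} END {for (p in count) if (count[p] > 1) print p}' \
    | sort
}
```

Empty output means every node reported the same checksum for each file.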

cp /etc/nagios3/conf.d/lma_* .

rm -f /etc/nagios3/conf.d/lma_*

Re-apply the Nagios manifest:
puppet apply --modulepath=/etc/fuel/plugins/lma_infrastructure_alerting-1.0/puppet/modules:/etc/puppet/modules /etc/fuel/plugins/lma_infrastructure_alerting-1.0/puppet/manifests/nagios.pp

Open the Kibana UI, type "service could not be found" into the search field, and press search:

May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'keystone-public-api-check.vip' on host 'node-1', but the service could not be found! env-1
May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'glance-api-check.vip' on host 'node-1', but the service could not be found! env-1
May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'rabbitmq-cluster.pacemaker' on host 'node-1', but the service could not be found! env-1
May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'nova-consoleauth.workers' on host 'node-1', but the service could not be found! env-1
May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'nova-conductor.workers' on host 'node-1', but the service could not be found! env-1
May 8th 2017, 17:50:42.000 node-6 system.messages nagios3 INFO Warning: Passive check result was received for service 'neutron-openvswitch.workers' on host 'node-1', but the service could not be found! env-1

no longer affects: lma-toolchain/1.0
Changed in lma-toolchain:
status: New → Confirmed
Changed in lma-toolchain:
importance: Undecided → Low
tags: added: ct1
Andrii Petrenko (aplsms)
Changed in lma-toolchain:
importance: Low → High
description: updated
Changed in lma-toolchain:
assignee: nobody → Simon Pasquier (simon-pasquier)