snmpAgentDown blocks BoxDown event

Bug #1421126 reported by Ingeborg Hellemo
This bug affects 1 person
Affects: Network Administration Visualized
Status: Fix Released
Importance: High
Assigned to: Morten Brekkevold

Bug Description

NAV 4.2.2

We have had a couple of cases where boxes were down, but NAV believed they were still up.

Symptoms:

The box is down and does not respond to ping.

NAV status shows "Uptime Up for X days Y hours", while Availability counts down.

Both times we see an snmpAgentDown immediately before the box is marked as down.

eventengine.log:2015-02-11 00:01:35,892 [INFO nav.eventengine.plugins.snmpagentstate.SnmpAgentStateHandler] snmpAgentState start event for examplebox; declaring down in 240 seconds (if still unresolved)
eventengine.log:2015-02-11 00:01:53,792 [INFO nav.eventengine.plugins.boxstate.BoxStateHandler] examplebox is already down, ignoring duplicate start event
eventengine.log:2015-02-11 00:05:35,897 [INFO nav.eventengine.plugins.snmpagentstate.SnmpAgentStateHandler] examplebox: Posting snmpAgentDown alert

pping.log:[2015-02-11 00:01:53] pping.py:generateEvents:145 [Notice] examplebox (10.255.42.8) marked as down.

ipdevpoll.log:2015-02-11 00:01:35,464 [10887] [WARNING plugins.snmpcheck.snmpcheck] SNMP agent down on examplebox
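
For reference, the gaps between the eventengine.log entries above can be worked out with a few lines of Python. This is only arithmetic on the timestamps quoted in this report; the variable names are made up for illustration and nothing here is NAV code.

```python
from datetime import datetime

# Timestamp layout used in eventengine.log (comma before the milliseconds)
FMT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps copied from the log lines quoted above
snmp_start = datetime.strptime("2015-02-11 00:01:35,892", FMT)  # snmpAgentState start event
box_down = datetime.strptime("2015-02-11 00:01:53,792", FMT)    # BoxStateHandler "already down"
snmp_alert = datetime.strptime("2015-02-11 00:05:35,897", FMT)  # snmpAgentDown alert posted

ping_gap = (box_down - snmp_start).total_seconds()
alert_gap = (snmp_alert - snmp_start).total_seconds()

print(f"boxstate entry after {ping_gap:.1f} s, snmpAgentDown alert after {alert_gap:.1f} s")
```

The alert goes out about 240 seconds after the start event, which matches the grace period announced in the first log line, and the BoxStateHandler entry follows roughly 18 seconds after that same start event.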

Changed in nav:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Morten Brekkevold (mbrekkevold)
status: Confirmed → In Progress
Morten Brekkevold (mbrekkevold) wrote :

A fix consisting of multiple changesets was merged here: https://nav.uninett.no/hg/stable/rev/b462c09dbbb9

This includes changes to snmpAgentState event handling: snmpAgentDown alerts are now deliberately held back if the netbox itself has stopped responding to ping while we were waiting for an snmpAgentUp resolve. It is a given that a box stops responding to SNMP when it is down, so we don't need an extra alert for that.
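
As a rough illustration of the behaviour described above, here is a toy model in Python. It is not the code from the changeset: the class `SnmpAgentStateSketch`, its methods, and the `box_is_down` callback are all hypothetical names chosen for this sketch, and the real eventengine plugin is structured differently.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SnmpAgentStateSketch:
    """Toy model of a delayed snmpAgentDown alert (hypothetical, not NAV's code)."""

    netbox: str
    # Hypothetical lookup: returns True if the box currently fails to answer ping
    box_is_down: Callable[[str], bool]
    posted: List[str] = field(default_factory=list)
    waiting: bool = False

    def handle_start(self) -> None:
        # snmpAgentState start event: begin waiting for an snmpAgentUp resolve
        self.waiting = True

    def handle_end(self) -> None:
        # The SNMP agent answered again before the grace period ran out
        self.waiting = False

    def on_timeout(self) -> None:
        # Called once the grace period expires
        if not self.waiting:
            return  # already resolved, nothing to report
        self.waiting = False
        if self.box_is_down(self.netbox):
            return  # box stopped answering ping meanwhile: hold the alert back
        self.posted.append(f"{self.netbox}: snmpAgentDown")


# Case 1: the box itself went down while we were waiting, so no SNMP alert
down_boxes = {"examplebox"}
held_back = SnmpAgentStateSketch("examplebox", lambda name: name in down_boxes)
held_back.handle_start()
held_back.on_timeout()
print(held_back.posted)  # []

# Case 2: the box still answers ping, so the SNMP alert is posted
alerting = SnmpAgentStateSketch("examplebox", lambda name: False)
alerting.handle_start()
alerting.on_timeout()
print(alerting.posted)  # ['examplebox: snmpAgentDown']
```

The point of the sketch is only the check at timeout: the ping status is consulted before the snmpAgentDown alert is posted, so a box that is fully down produces a single down alert rather than two.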

Changed in nav:
status: In Progress → Fix Committed
milestone: none → 4.2.3
Changed in nav:
status: Fix Committed → Fix Released