NAV spams users with false coldStart alerts

Bug #258262 reported by Morten Brekkevold
Affects: Network Administration Visualized
Status: Won't Fix
Importance: Medium
Assigned to: Morten Brekkevold
Milestone: none

Bug Description

Several users have reported large amounts of repeated, false coldStart
alerts from their NAV installations.

The affected devices are reported to have their "upsince" time switch back
and forth between the real date and a date in 1970 with each alert.

According to reports, this may have started about two weeks back, which
might indicate some sort of problem with timestamp calculations in
getDeviceData. This has not been confirmed yet; it may also just be an
intermittent bug related to dropped SNMP communication.

[http://sourceforge.net/tracker/index.php?func=detail&aid=1784604&group_id=107608&atid=648170]

Asbjornp (asbjornp) wrote :

Originator: NO

While NAV is reporting an "insane uptime", the actual uptime has been
confirmed using snmpwalk (sysUpTime).

Example:
--- snip ---

Date: Wed, 29 Aug 2007 10:13:10 +0200
From: <email address hidden>
To: <email address hidden>
Subject: coldStart of xxx.uio.no (129.240.xxx.yyy)

This is an automatically generated message from NAV:

Boks xxx.uio.no (129.240.xxx.yyy) has performed a coldstart:

Old up since: 1970-01-01 01:00:00
New up since: 2007-08-28 06:18:34.448+02
-------------------------------------------------------------
Date: Wed, 29 Aug 2007 11:12:58 +0200
From: <email address hidden>
To: <email address hidden>
Subject: coldStart of xxx.uio.no (129.240.xxx.yyy)

This is an automatically generated message from NAV:

Boks xxx.uio.no (129.240.xxx.yyy) has performed a coldstart:

Old up since: 2007-08-28 06:18:34.44
New up since: 1970-01-01 01:00:00+01
-------------------------------------------------------------
navbox# snmpwalk -v 2c -c supersecret 129.240.xxx.yyy sysUpTime
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (18535455) 2 days, 3:29:14.55

--- /snip ---
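
For reference, the Timeticks value in the snmpwalk output above can be converted to a duration by hand. This is only a minimal sketch of the arithmetic, not code from NAV; the function name `ticks_to_uptime` is made up for illustration.

```python
from datetime import timedelta

TICKS_PER_SECOND = 100  # sysUpTime counts hundredths of a second


def ticks_to_uptime(ticks: int) -> timedelta:
    """Convert an SNMP TimeTicks value to a timedelta (hypothetical helper)."""
    seconds, hundredths = divmod(ticks, TICKS_PER_SECOND)
    return timedelta(seconds=seconds, milliseconds=hundredths * 10)


# The value reported by snmpwalk in the example above.
uptime = ticks_to_uptime(18535455)
print(uptime)  # 2 days, 3:29:14.550000
```

The result agrees with the "2 days, 3:29:14.55" that snmpwalk printed, so the device itself is reporting a sane uptime.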

Morten Brekkevold (mbrekkevold) wrote :

Originator: YES

Whatever the problem is, something is definitely wrong with the
calculation of this uptime. AFAIK, the sysUpTime value from SNMP is a
32-bit unsigned integer, which means it cannot represent more than 2^32
ticks. At 100 ticks per second, sysUpTime wraps around after about 16
months. getDeviceData does have some measures in place to detect
wraparounds in the counter, but an up-since date of 1970-01-01 (the Unix
timestamp epoch, mind you) is of course impossible.
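
The two numbers in this comment can be checked with a few lines of Python. This is a rough sketch of the arithmetic only, not getDeviceData's actual logic; the UTC+1 offset is an assumption taken from the "+01" suffix in the alerts.

```python
from datetime import datetime, timedelta, timezone

TICKS_PER_SECOND = 100
MAX_TICKS = 2 ** 32  # a 32-bit unsigned counter wraps after this many ticks

# Longest uptime sysUpTime can represent before wrapping around.
max_uptime = timedelta(seconds=MAX_TICKS / TICKS_PER_SECOND)
print(max_uptime.days)  # 497 days, i.e. a little over 16 months

# A Unix timestamp of 0, rendered in UTC+1 (assumed from the alerts' "+01").
cet = timezone(timedelta(hours=1))
epoch_local = datetime(1970, 1, 1, tzinfo=timezone.utc).astimezone(cet)
print(epoch_local)  # 1970-01-01 01:00:00+01:00
```

The second value matches the "1970-01-01 01:00:00+01" shown in the alerts, which suggests the bogus up-since time is a timestamp of zero rather than the result of a counter wraparound.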

Arne-sf (arne-sf) wrote :

Originator: NO

We also get these false coldStart alerts from our NAV installation. It
seems to be related to DNS: every time we've had issues with our DNS
servers, where NAV is unable to reverse-lookup the IP of devices, we get
these false alerts. Here is an example (in Norwegian):

First we get an alert like this (notice the missing hostname):

Boks xxx.xxx.xxx.2 (xxx.xxx.xxx.2) har gjennomført en coldstart:
Gammel oppe siden: 2008-06-04 06:45:17.196
Ny oppe siden : 1970-01-01 01:00:00+01

After some time (after the DNS server issues have been resolved) we get
the following alert:

Boks xxxxx.domain.no (xxx.xxx.xxx.2) har gjennomført en coldstart:
Gammel oppe siden: 1970-01-01 01:00:00
Ny oppe siden : 2008-06-04 06:42:22.741999+02
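
In every false alert quoted in this report, one of the two up-since values is the Unix epoch. A check along the following lines would recognize that pattern. It is purely a hypothetical sketch: the function name and the idea of filtering on the epoch are illustrative assumptions, not how NAV or getDeviceData works.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def looks_like_false_coldstart(old_upsince: datetime, new_upsince: datetime) -> bool:
    """Return True if either up-since value is the Unix epoch.

    Hypothetical helper: expects timezone-aware datetimes, which compare
    equal when they denote the same instant regardless of UTC offset.
    """
    return EPOCH in (old_upsince, new_upsince)


# The values from the second Norwegian alert above.
old = datetime(1970, 1, 1, 1, 0, 0, tzinfo=timezone(timedelta(hours=1)))
new = datetime(2008, 6, 4, 6, 42, 22, 741999, tzinfo=timezone(timedelta(hours=2)))
print(looks_like_false_coldstart(old, new))  # True
```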

Changed in nav:
milestone: v3.2 → none
Changed in nav:
status: Confirmed → Won't Fix