offMaintenance alerts for same device every 5 minutes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Network Administration Visualized | Fix Released | High | Morten Brekkevold | | |
Bug Description
NAV 4.2.1
A maintenance task ended 2014-12-12 20:44. Since then, an alert has been generated every 5 minutes reporting that the device is no longer on maintenance.
The maintenance task is not listed as Active, but is found in the archive with:
Start 2012-12-05 20:44
End 2014-12-12 20:44
...
State Passed
I tried cancelling the task to see whether that would trigger something, but the only change is that the state is now Canceled.
Under "Recent alerts" for the box, I find a maintenanceState alert with an "Unresolved" end time.
After turning on debug logging for maintengine, we get this in the log every 5 minutes:
[2014-12-17 08:30:00,607] [DEBUG] [pid=10229 nav.maintengine] -------
[2014-12-17 08:30:00,641] [DEBUG] [pid=10229 nav.maintengine] Endless maintenance task 347: Things that haven't been up longer than the threshold: [<Netbox: kulthLscene-
[2014-12-17 08:30:00,647] [DEBUG] [pid=10229 nav.maintengine] Endless maintenance task 525: Things that haven't been up longer than the threshold: [<Netbox: sval-afscon3-
[2014-12-17 08:30:00,652] [DEBUG] [pid=10229 nav.maintengine] Endless maintenance task 560: Things that haven't been up longer than the threshold: [<Netbox: m2m-host-
[2014-12-17 08:30:00,655] [DEBUG] [pid=10229 nav.maintengine] Tasks transitioned to passed state: []
[2014-12-17 08:30:00,658] [DEBUG] [pid=10229 nav.maintengine] Tasks transitioned to active state: []
[2014-12-17 08:30:00,681] [DEBUG] [pid=10229 nav.maintengine] Subjects that should be on maintenance but wasn't: set([])
[2014-12-17 08:30:00,681] [DEBUG] [pid=10229 nav.maintengine] Subjects that should not be on maintenance but was: set([<Netbox: kulthverksted-
[2014-12-17 08:30:00,696] [DEBUG] [pid=10229 nav.maintengine] Event posted: <EventQueue: event_type_
[2014-12-17 08:30:00,719] [DEBUG] [pid=10229 nav.maintengine] Finished in 0.062s
[2014-12-17 08:30:00,719] [DEBUG] [pid=10229 nav.maintengine] -------
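The two "Subjects that should ... be on maintenance" lines in the log read like a pair of set differences between the desired and the recorded maintenance state. The following is only a rough sketch of that idea, not maintengine's actual code; the function name `maintenance_diff` and the string netbox names are made up for illustration.

```python
def maintenance_diff(should_be, currently_on):
    """Compare desired and recorded maintenance state (hypothetical helper).

    should_be:    subjects covered by an active maintenance task
    currently_on: subjects that still have an open maintenance alert
    """
    # Subjects that should be on maintenance but are not yet
    to_start = should_be - currently_on
    # Subjects that should not be on maintenance but still are
    to_end = currently_on - should_be
    return to_start, to_end


# No task is active any more, but one box still has an open alert.
should_be = set()
currently_on = {"netbox-a"}  # placeholder name, not taken from the log
to_start, to_end = maintenance_diff(should_be, currently_on)
```

If the recorded state is derived from unresolved alerts, and the end event posted for `to_end` never closes the alert, the same subject would show up again on every run, which would match the 5-minute repetition described above.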
Changed in nav: | |
importance: | Undecided → High |
status: | Confirmed → In Progress |
Changed in nav: | |
status: | Fix Committed → Fix Released |
Changed in nav: | |
milestone: | 4.2.4 → 4.2.5 |
Has the kulthverksted- sw.infra device been physically replaced during the maintenance period? Maintengine seems to be working as it should, but the eventengine may be unable to match the offMaintenance alert to the onMaintenance alert because the netbox has switched device IDs. Check the eventengine logs...
Also, you can forcibly close the maintenance alert from the new Status page (check the "on maintenance" filter checkbox to see these alerts).
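To illustrate the suspected mismatch, here is a toy sketch of how an end event can fail to close an open alert when the device ID has changed. It is not the eventengine's real matching logic; the `Alert` class, the `resolve` function, and the assumption that matching uses both netbox ID and device ID are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """A minimal stand-in for an open maintenance alert (hypothetical)."""
    netbox_id: int
    device_id: int
    open: bool = True


def resolve(open_alerts, netbox_id, device_id):
    """Close the first open alert matching both IDs; return it, or None."""
    for alert in open_alerts:
        if (alert.open
                and alert.netbox_id == netbox_id
                and alert.device_id == device_id):
            alert.open = False
            return alert
    return None


# The alert was opened while the netbox had device 100.
alerts = [Alert(netbox_id=1, device_id=100)]

# After a hardware swap the end event carries device 200: no match.
unmatched = resolve(alerts, netbox_id=1, device_id=200)
still_open = alerts[0].open
```

Under these assumptions the alert stays unresolved, so each subsequent run would find the netbox still marked as on maintenance and post another end event.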