Alarmgen stops processing UVE changes

Bug #1592950 reported by Anish Mehta
This bug affects 1 person
Affects            Status                    Importance  Assigned to    Milestone
Juniper Openstack  Status tracked in Trunk
  R3.0             Fix Committed             Undecided   Nikhil Bansal
  Trunk            Fix Committed             Undecided   Nikhil Bansal
OpenContrail       Fix Committed             Undecided   Nikhil Bansal

Bug Description

The following exception is seen:
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/dist-packages/opserver/alarmgen.py", line 1043, in run_uve_processing
    self.run_alarm_timers(int(curr))
  File "/usr/lib/python2.7/dist-packages/opserver/alarmgen.py", line 915, in run_alarm_timers
    (curr_time, self.tab_alarms)
  File "/usr/lib/python2.7/dist-packages/opserver/alarmgen.py", line 491, in run_timers
    asm = tab_alarms[tab][uv][nm]
KeyError: 'ObjectConfigNode:nodea35'
<Greenlet at 0x7f21cf267190: <bound method Controller.run_uve_processing of <opserver.alarmgen.Controller object at 0x7f21d4b96910>>> failed with KeyError

After this, alarmgen is unable to process any UVE changes.
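
The failing lookup in the traceback can be illustrated with a minimal sketch. It is not the alarmgen code: the table layout (table, then UVE key, then alarm name), the `pending_timers` list, and the state values are assumptions made for illustration. It only shows how a timer entry that refers to a UVE key missing from the alarm table produces the same kind of KeyError.

```python
# Hypothetical, simplified alarm table: table -> UVE key -> alarm name -> state.
tab_alarms = {
    "ObjectConfigNode": {
        "ObjectConfigNode:nodea34": {"ProcessStatus": "active"},
    }
}

# Hypothetical timer index that still holds an entry for a UVE key
# that is not present in tab_alarms.
pending_timers = [("ObjectConfigNode", "ObjectConfigNode:nodea35", "ProcessStatus")]


def run_timers(pending, alarms):
    """Look up the alarm state for every pending timer entry."""
    states = []
    for tab, uv, nm in pending:
        # Raises KeyError when the timer index and the alarm table disagree.
        states.append(alarms[tab][uv][nm])
    return states


try:
    run_timers(pending_timers, tab_alarms)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print("stale timer entry for:", missing_key)
```

If nothing catches the error, it propagates out of whatever loop called the timer step, which matches the greenlet failure reported above.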

Tags: analytics
Anish Mehta (amehta00) wrote :

SandeshTraceTextResponse
traces
2016-06-15 00:05:22.167 AlarmStateChangeTrace: State change info for alarm: ObjectDatabaseInfo ObjectDatabaseInfo:nodec53 ConfIncorrectDatabase 0 2
2016-06-15 00:05:22.226 AlarmStateChangeTrace: State change info for alarm: ObjectCollectorInfo ObjectCollectorInfo:nodea35 ProcessConnectivity 0 2
2016-06-15 00:05:22.226 AlarmStateChangeTrace: State change info for alarm: ObjectCollectorInfo ObjectCollectorInfo:nodea35 ConfIncorrectAnalytics 0 2
2016-06-15 00:05:22.236 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodec53 ProcessStatus 0 1
2016-06-15 00:05:22.236 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodec53 ProcessConnectivity 0 2
2016-06-15 00:05:22.236 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodec53 ConfIncorrectControl 0 2
2016-06-15 00:05:22.267 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea35 ProcessStatus 0 1
2016-06-15 00:05:22.267 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea35 ConfIncorrectConfig 0 2
2016-06-15 00:05:22.296 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea34 PartialSysinfoConfig 0 2
2016-06-15 00:05:22.296 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea34 ProcessStatus 0 1
2016-06-15 00:05:22.296 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea34 ProcessConnectivity 0 2
2016-06-15 00:05:22.296 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea34 ConfIncorrectConfig 0 2
2016-06-15 00:05:22.323 AlarmStateChangeTrace: State change info for alarm: ObjectCollectorInfo ObjectCollectorInfo:nodec53 ProcessStatus 0 1
2016-06-15 00:05:22.323 AlarmStateChangeTrace: State change info for alarm: ObjectCollectorInfo ObjectCollectorInfo:nodec53 ProcessConnectivity 0 2
2016-06-15 00:05:22.323 AlarmStateChangeTrace: State change info for alarm: ObjectCollectorInfo ObjectCollectorInfo:nodec53 ConfIncorrectAnalytics 0 2
2016-06-15 00:05:22.333 AlarmStateChangeTrace: State change info for alarm: ObjectDatabaseInfo ObjectDatabaseInfo:nodea35 ConfIncorrectDatabase 0 2
2016-06-15 00:05:22.357 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodea34 ProcessConnectivity 0 2
2016-06-15 00:05:22.357 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodea34 ConfIncorrectControl 0 2
2016-06-15 00:05:22.375 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodea35 ProcessConnectivity 0 2
2016-06-15 00:05:22.375 AlarmStateChangeTrace: State change info for alarm: ObjectBgpRouter ObjectBgpRouter:nodea35 ConfIncorrectControl 0 2
2016-06-15 00:05:22.683 AlarmStateChangeTrace: State change info for alarm: ObjectConfigNode ObjectConfigNode:nodea34 ProcessConnectivity 2 0
2016-06-15 00:05:22.688 AlarmStateChangeTrace: State change info for alarm: ObjectVRouter ObjectVRouter:nodec55 PartialSysinfoCompute 0 2
2016-06-15 00:05:22.688 AlarmState...

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/21199
Submitter: Anish Mehta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/21200
Submitter: Anish Mehta (<email address hidden>)

Anish Mehta (amehta00) wrote :

The alarm timer code raised an exception that was not handled, which killed the run_uve_processing greenlet.

1. We need to exit when such an exception occurs.
2. This exception should not have happened in the first place. (Nikhil, please take a look.)
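
Point 1 can be sketched roughly as follows. This is not the change under review: `run_guarded`, `on_fatal`, and `flaky_step` are hypothetical names, and the sketch uses plain Python rather than gevent. The idea is that an unexpected error in the processing loop is logged and then turned into a process exit, instead of leaving a process that is alive but no longer doing work.

```python
import logging
import sys


def run_guarded(step, items, on_fatal=sys.exit):
    """Run step(item) for each item; on an unexpected error, log it and call on_fatal(1)."""
    processed = 0
    for item in items:
        try:
            step(item)
        except Exception:
            # Record the full traceback before giving up.
            logging.exception("unhandled error while processing %r", item)
            on_fatal(1)
            break
        processed += 1
    return processed


# Demonstration with a recording callback in place of sys.exit.
exit_codes = []


def flaky_step(item):
    if item == "bad":
        raise KeyError(item)


done = run_guarded(flaky_step, ["ok", "bad", "never-reached"], on_fatal=exit_codes.append)
print(done, exit_codes)
```

With the default `on_fatal=sys.exit`, the process would terminate on the first failure, so an external supervisor (if one is configured) could restart it.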

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21200
Committed: http://github.org/Juniper/contrail-controller/commit/4f1f910d5b282baf4ff507303f09c59ac0342373
Submitter: Zuul
Branch: R3.0

commit 4f1f910d5b282baf4ff507303f09c59ac0342373
Author: Anish Mehta <email address hidden>
Date: Wed Jun 15 12:33:34 2016 -0700

Better exception handling in Alarmgen during uve processing and timer processing.
Partial-Bug:1592950

Change-Id: I3a27cc5f906113c641ce015500dea03e7a013382

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21199
Committed: http://github.org/Juniper/contrail-controller/commit/75209fbf7738cbb4b41b4f6af20982e347a10b77
Submitter: Zuul
Branch: master

commit 75209fbf7738cbb4b41b4f6af20982e347a10b77
Author: Anish Mehta <email address hidden>
Date: Wed Jun 15 12:33:34 2016 -0700

Better exception handling in Alarmgen during uve processing and timer processing.
Partial-Bug:1592950

Change-Id: I3a27cc5f906113c641ce015500dea03e7a013382

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/21337
Submitter: Nikhil Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/21338
Submitter: Nikhil Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21338
Committed: http://github.org/Juniper/contrail-controller/commit/8ad8beda9406f9f7841d66f513d680dbbcf8f2d6
Submitter: Zuul
Branch: master

commit 8ad8beda9406f9f7841d66f513d680dbbcf8f2d6
Author: Nikhil B <email address hidden>
Date: Wed Jun 22 08:34:39 2016 +0530

Need to check for timer in set_alarm for Idle state

When set_alarm event happens in Idle state, we were not checking if there is
an existing delete timer. This leads to mismatch in timers and alarms.
Closes-Bug: 1592950
Closes-Bug: 1587737

Change-Id: I5c313c06feaba4bfe31ebae08b0563ebe83b1db0
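
The timer/alarm mismatch described in this commit message can be sketched with a toy model. This is an illustration under stated assumptions, not the merged code: the `AlarmTable` and `AlarmEntry` classes, the "Active" state name, and the deadline-keyed timer index are all hypothetical. Only the idea that `set_alarm` in the Idle state must check for an existing delete timer comes from the commit message.

```python
class AlarmEntry:
    """Toy alarm state holder; only 'Idle' is named in the commit message."""

    def __init__(self):
        self.state = "Idle"
        self.delete_deadline = None  # set while a delete timer is pending


class AlarmTable:
    def __init__(self):
        self.alarms = {}  # key -> AlarmEntry
        self.timers = {}  # deadline -> set of keys

    def _cancel_delete_timer(self, key):
        entry = self.alarms[key]
        if entry.delete_deadline is not None:
            keys = self.timers[entry.delete_deadline]
            keys.discard(key)
            if not keys:
                del self.timers[entry.delete_deadline]
            entry.delete_deadline = None

    def clear_alarm(self, key, now, idle_timeout):
        """Move the alarm to Idle and schedule its deletion."""
        entry = self.alarms[key]
        self._cancel_delete_timer(key)
        entry.state = "Idle"
        entry.delete_deadline = now + idle_timeout
        self.timers.setdefault(entry.delete_deadline, set()).add(key)

    def set_alarm(self, key):
        entry = self.alarms.setdefault(key, AlarmEntry())
        # The check the commit message describes: drop any pending delete timer.
        self._cancel_delete_timer(key)
        entry.state = "Active"

    def run_timers(self, now):
        """Delete every alarm whose delete timer has expired."""
        for deadline in sorted(d for d in self.timers if d <= now):
            for key in self.timers.pop(deadline):
                del self.alarms[key]


table = AlarmTable()
table.set_alarm("ObjectConfigNode:nodea35")
table.clear_alarm("ObjectConfigNode:nodea35", now=100, idle_timeout=30)
table.set_alarm("ObjectConfigNode:nodea35")  # raised again before the timer fires
table.run_timers(now=200)
print(table.alarms["ObjectConfigNode:nodea35"].state, table.timers)
```

Without the cancel step in `set_alarm`, the stale delete timer in this model would later remove an alarm that is active again, leaving the two structures out of sync.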

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/21337
Committed: http://github.org/Juniper/contrail-controller/commit/296d2241a0a3b56226b1b6fcdf553038e25e9f0c
Submitter: Zuul
Branch: R3.0

commit 296d2241a0a3b56226b1b6fcdf553038e25e9f0c
Author: Nikhil B <email address hidden>
Date: Wed Jun 22 08:25:34 2016 +0530

Need to check for timer in set_alarm for Idle state

When set_alarm event happens in Idle state, we were not checking if there is
an existing delete timer. This leads to mismatch in timers and alarms.
Closes-Bug: 1592950
Closes-Bug: 1587737

Change-Id: Icabb317f52591cb6b74f484f00aaa2dc9307bcdf

Changed in opencontrail:
status: New → Won't Fix
Changed in opencontrail:
status: Won't Fix → Fix Committed