Contrail database node manager service failing to start

Bug #1588156 reported by venu kolli
This bug affects 3 people
Affects            Status         Importance  Assigned to  Milestone
Juniper Openstack  Fix Committed  Critical    Arvind
R3.0               Fix Committed  Critical    Arvind

Bug Description

Build: R3.0.2, build 46

The contrail-database-nodemgr service exits shortly after startup with the AttributeError shown at the end of the log below.

 = 8 num_core_per_socket = 4 num_thread_per_core = 2 >> >>
06/01/2016 09:41:39 PM [contrail-database-nodemgr]: Discarding event[EvSandeshUVESend] in state[Idle]
06/01/2016 09:41:39 PM [contrail-database-nodemgr]: Processing event[EvSandeshUVESend] in state[Idle]
06/01/2016 09:41:39 PM [contrail-database-nodemgr]: SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = b4s342 process_status = [ << module_id = contrail-database-nodemgr instance_id = 0 state = Functional description = >>, ] build_info = {"build-info" : [{"build-version" : "3.0.2.0", "build-time" : "2016-06-01 23:34:31.817829", "build-user" : "contrail-builder", "build-hostname" : "contrail-ec-build20.juniper.net", "build-id" : "3.0.2.0-46", "build-number" : "46"}]} >>
06/01/2016 09:41:39 PM [contrail-database-nodemgr]: Discarding event[EvSandeshUVESend] in state[Idle]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Starting Introspect on HTTP Port 8103
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Cannot write http_port 8103 to /tmp/contrail-database-nodemgr.9270.http_port
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Received discovery update [{u'partcount': u'{ "1":[0,1], "2":[1,2], "3":[3,4], "4":[7,4], "5":[11,4]}', u'@publisher-id': u'b4s342', u'pid': u'21351', u'ip-address': u'10.84.24.32', u'redis-gen': u'1', u'port': u'8086'}, {u'partcount': u'{ "1":[0,1], "2":[1,2], "3":[3,4], "4":[7,4], "5":[11,4]}', u'@publisher-id': u'b4s343', u'pid': u'28180', u'ip-address': u'10.84.24.33', u'redis-gen': u'1', u'port': u'8086'}] for collector service
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Processing event[EvCollectorChange] in state[Idle]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Session Event: TCP Connected
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Sandesh Client: Event[EvCollectorChange] => State[Idle] -> State[Connect]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Processing event[EvTcpConnected] in state[Connect]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Sandesh Client: Event[EvTcpConnected] => State[Connect] -> State[ClientInit]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Processing event[EvSandeshUVESend] in state[ClientInit]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Processing event[EvSandeshCtrlMessageRecv] in state[ClientInit]
06/01/2016 09:41:40 PM [contrail-database-nodemgr]: Sandesh Client: Event[EvSandeshCtrlMessageRecv] => State[ClientInit] -> State[Established]
wokeup and found a line
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/dist-packages/nodemgr/database_nodemgr/database_event_manager.py", line 341, in runforever
    self.database_periodic()
  File "/usr/lib/python2.7/dist-packages/nodemgr/database_nodemgr/database_event_manager.py", line 246, in database_periodic
    self.send_database_status()
  File "/usr/lib/python2.7/dist-packages/nodemgr/database_nodemgr/database_event_manager.py", line 265, in send_database_status
    self.get_pending_compaction_count(op)
AttributeError: 'NoneType' object has no attribute 'pending_compaction_tasks'
<Greenlet at 0x7fa3c5859d70: <bound method DatabaseEventManager.runforever of <nodemgr.database_nodemgr.database_event_manager.DatabaseEventManager object at 0x7fa3c637cf10>>> failed with AttributeError
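
The traceback above ends in an attribute access on an object that is still None. A minimal sketch of that failure mode follows. The names `DatabaseStatus`, `compaction_task`, and `record_pending_compactions` are hypothetical stand-ins, not the actual nodemgr types, and the real call site may differ.

```python
class DatabaseStatus(object):
    """Hypothetical stand-in for the status structure sent by nodemgr."""

    def __init__(self):
        # The nested structure starts out unset, as assumed for the failing build.
        self.compaction_task = None


def record_pending_compactions(status, count):
    # Fails when status.compaction_task has never been initialized.
    status.compaction_task.pending_compaction_tasks = count


def reproduce():
    """Return the error text raised when the nested field is still None."""
    try:
        record_pending_compactions(DatabaseStatus(), 3)
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling `reproduce()` returns the text of an AttributeError that names `NoneType` and `pending_compaction_tasks`, matching the shape of the error in the log.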

venu kolli (vkolli)
Changed in juniperopenstack:
assignee: nobody → Arvind (arvindv)
importance: Undecided → Critical
Jeba Paulaiyan (jebap)
tags: added: analytics blocker
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20894
Submitter: Arvind (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20894
Committed: http://github.org/Juniper/contrail-controller/commit/49f6637149803e8020084dff9b47359eb0e6f6f0
Submitter: Zuul
Branch: R3.0

commit 49f6637149803e8020084dff9b47359eb0e6f6f0
Author: arvindvis <email address hidden>
Date: Thu Jun 2 11:38:33 2016 -0700

send_database_status uses the inner structure
cassandra_compaction_task without initializing,
the fix takes care of that.
Closes-Bug: #1588156

Change-Id: Ied936db6ffa6b23c86d4f6a76680d9a550fee89f
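
One way to read the commit message is that the nested structure is now created before any field on it is written. A hedged sketch of that pattern is shown here. The names `CompactionTask`, `DatabaseStatus`, and `record_pending_compactions` are hypothetical and are not taken from the actual change.

```python
class CompactionTask(object):
    """Hypothetical inner structure holding compaction counters."""

    def __init__(self):
        self.pending_compaction_tasks = 0


class DatabaseStatus(object):
    """Hypothetical outer structure; the inner field is unset by default."""

    def __init__(self):
        self.compaction_task = None


def record_pending_compactions(status, count):
    # Create the inner structure on first use instead of assuming it exists.
    if status.compaction_task is None:
        status.compaction_task = CompactionTask()
    status.compaction_task.pending_compaction_tasks = count
    return status
```

Initializing the inner object in the outer constructor would work equally well; the lazy check is used here only to keep the sketch short.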

Raj Reddy (rajreddy) wrote :

Reviewed: https://review.opencontrail.org/20999
Committed: http://github.org/Juniper/contrail-controller/commit/214d61eb4e4eda19ba3bbd336e6f3c801c6f4e5d
Submitter: Zuul
Branch: master

commit 214d61eb4e4eda19ba3bbd336e6f3c801c6f4e5d
Author: arvindvis <email address hidden>
Date: Thu May 26 18:17:08 2016 -0700

This commit adds reporting of cassandra thread pool stats
for monitoring. It collects the nodestatus output and
reports it every min.
Closes-Bug: 1583733,1588156,1589039

(cherry picked from commit eda32030333f80247e3195d23aeb8449325d6658)

Conflicts:
 src/nodemgr/database_nodemgr/database_event_manager.py

send_database_status uses the inner structure
cassandra_compaction_task without initializing,
the fix takes care of that.

(cherry picked from commit 49f6637149803e8020084dff9b47359eb0e6f6f0)

The output of 'nodetool compactiontasks' can contain multiple lines
and hence we should grep for line having 'pending tasks' before
getting the value.

(cherry picked from commit f652617414b37534f9329bf0e0c94ad37262d6af)

Change-Id: I78993df71a32295e9697768e4b64592e6c4c9405
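
The last cherry-picked change describes selecting the 'pending tasks' line before extracting the value. A rough sketch of that idea in Python follows, assuming the output contains a line of the form `pending tasks: N`. The function name `parse_pending_tasks` and the sample text are illustrative and are not taken from the fix.

```python
import re

# Matches a line such as "pending tasks: 3"; the exact format is an assumption.
_PENDING_RE = re.compile(r"^\s*pending tasks:\s*(\d+)\s*$", re.IGNORECASE)


def parse_pending_tasks(output):
    """Return the pending task count, or None if no matching line is found."""
    for line in output.splitlines():
        match = _PENDING_RE.match(line)
        if match:
            return int(match.group(1))
    return None


# Illustrative multi-line output; real nodetool output may differ.
SAMPLE = """pending tasks: 2
compaction type   keyspace   table   completed   total
Active compaction remaining time :   0h00m01s
"""
```

Returning None when no line matches lets the caller skip the update for that interval instead of failing on unexpected output.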

Changed in juniperopenstack:
status: New → Fix Committed
milestone: none → r3.1.0.0-fcs
information type: Proprietary → Public