[ Build : 4.1.1.0-104 ][contrail-snmp-collector]: Exception NotFoundException connecting to Config DB

Bug #1755382 reported by Ankit Jain
Affects            Status                    Importance  Assigned to   Milestone
Juniper Openstack  Status tracked in Trunk
  R4.1             Fix Committed             High        Zhiqiang Cui
  Trunk            Fix Committed             High        Zhiqiang Cui

Bug Description

The issue was seen in the sanity run:
[Build "Ubuntu 16.04.2 LTS" 4.1.1.0-104~ocata] SMLite-Contrail-HA-Ocata-Sanity-Report

Topology :
DISTRO : "Ubuntu 16.04.2 LTS"
SKU : ocata
Config Nodes : [u'nodem16', u'nodem17', u'nodem18']
Control Nodes : [u'nodem16', u'nodem17', u'nodem18']
Compute Nodes : [u'nodem19', u'nodem20', u'nodem5']
Openstack Node : [u'nodem16', u'nodem17', u'nodem18']
WebUI Node : [u'nodem16', u'nodem17', u'nodem18']
Analytics Nodes : [u'nodem16', u'nodem17', u'nodem18']
Database Nodes : [u'nodem16', u'nodem17', u'nodem18']
Physical Devices : [u'blr-mx2', u"'blr-mx2'"]
LB Nodes : [u'nodea10']

contrail-snmp-collector could not come up on one of the analytics nodes.
contrail-topology failed to come up on the other analytics node due to the same issue.

root@nodem18(analytics):/# contrail-status
== Contrail Analytics ==
contrail-collector: active
contrail-analytics-api: active
contrail-query-engine: active
contrail-alarm-gen: active
contrail-snmp-collector: active
contrail-topology: inactive
contrail-analytics-nodemgr: active

root@nodem17:~# docker exec -it analytics contrail-status
== Contrail Analytics ==
contrail-collector: active
contrail-analytics-api: active
contrail-query-engine: active
contrail-alarm-gen: active
contrail-snmp-collector: inactive
contrail-topology: active
contrail-analytics-nodemgr: active
root@nodem17:~#

Contrail-snmp-collector logs:

03/12/2018 01:16:41 PM [contrail-topology]: SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = nodem18 process_status = [ << module_id = contrail-topology instance_id = 0 state = Non-Functional connection_infos = [ << type = Zookeeper name = Zookeeper server_addrs = [ 10.204.216.105:2181, 10.204.216.106:2181, 10.204.216.107:2181, ] status = Up description = >>, << type = Collector name = server_addrs = [ , ] status = Down description = none to Idle on EvStart >>, << type = Database name = Cassandra server_addrs = [ 10.204.216.105:9161, 10.204.216.106:9161, 10.204.216.107:9161, ] status = Initializing description = >>, << type = Database name = RabbitMQ server_addrs = [ 10.204.216.105:5672, 10.204.216.106:5672, 10.204.216.107:5672, ] status = Up description = >>, ] description = Collector, Database:Cassandra[] connection down >>, ] >>
03/12/2018 01:16:41 PM [contrail-topology]: Exception NotFoundException connecting to Config DB. Arguments:
(None,): Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/contrail_topology/config_handler.py", line 47, in start
    credential=cassandra_credential)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_object_db.py", line 20, in __init__
    obj_cache_exclude_types)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_cassandra.py", line 148, in __init__
    self._cassandra_init(server_list)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_cassandra.py", line 553, in _cassandra_init
    self._cassandra_init_conn_pools()
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_cassandra.py", line 638, in _cassandra_init_conn_pools
    **cf_kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/columnfamily.py", line 284, in __init__
    self.load_schema()
  File "/usr/lib/python2.7/dist-packages/pycassa/columnfamily.py", line 312, in load_schema
    raise nfe
NotFoundException: NotFoundException(_message=None, why='Column family obj_uuid_table not found.')

Logs copied here:

/cs-shared/bugs/<bug-id>

Ankit Jain (ankitja)
tags: added: sanity
tags: added: sanityblocker
removed: sanity
Sundaresan Rajangam (srajanga) wrote :

This traceback would be seen if contrail-snmp-collector tries to connect to config db before contrail-api creates obj_uuid_table. This is a timing issue and should recover upon creation of obj_uuid_table.

https://github.com/Juniper/contrail-controller/blob/R4.1/src/opserver/config_handler.py#L52

Replacing exit() with exit(2) should fix the issue.
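A minimal sketch of the suggested change, not the actual contrail code: the names `connect_to_config_db`, `start`, and the local `NotFoundException` class are hypothetical stand-ins for the real connection logic in config_handler.py. It only illustrates the difference in exit status: a bare exit() terminates with status 0, while exit(2) reports a non-zero status to whatever launched the process.

```python
import sys


class NotFoundException(Exception):
    """Hypothetical stand-in for the exception seen in the traceback."""


def connect_to_config_db(table_exists):
    # Hypothetical placeholder for the real connection call. It fails
    # while obj_uuid_table has not been created yet.
    if not table_exists:
        raise NotFoundException("Column family obj_uuid_table not found.")
    return "connected"


def start(table_exists):
    try:
        return connect_to_config_db(table_exists)
    except NotFoundException as e:
        print("Exception %s connecting to Config DB" % type(e).__name__)
        # A bare exit() would end the process with status 0.
        # exit(2) makes the failure visible as a non-zero status.
        sys.exit(2)
```

Calling `start(False)` raises `SystemExit` with code 2, and `start(True)` returns normally once the table exists.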

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/40852
Submitter: Zhiqiang Cui (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/40852
Committed: http://github.com/Juniper/contrail-controller/commit/71cabfdf04d03082d81d9a2b4d00f613305967af
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 71cabfdf04d03082d81d9a2b4d00f613305967af
Author: zcui <email address hidden>
Date: Mon Mar 19 23:00:55 2018 -0700

Exception NotFoundException connecting to Config DB.

Timing issue, contrail-snmp-collector tries to connect
config db before contrail-api creates obj_uuid_table.

Change-Id: I4fe1ed805910dab855bb2bb1ccebeb2bee32011d
Closes-Bug: 1755382

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/40883
Submitter: Zhiqiang Cui (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/40883
Committed: http://github.com/Juniper/contrail-analytics/commit/83e25aaa90659152551d5e0c3ab682d160380136
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 83e25aaa90659152551d5e0c3ab682d160380136
Author: zcui <email address hidden>
Date: Tue Mar 20 16:15:41 2018 -0700

Exception NotFoundException connecting to Config DB.

Timing issue, contrail-snmp-collector tries to connect
config db before contrail-api creates obj_uuid_table.
Closes-Bug: 1755382

Change-Id: I67260ca667eb35a148a0c0c148cbf3af1461090f
