Build 2697: On a multi-node setup, when zookeeper on one of the db nodes goes down, contrail-topology and contrail-snmp-collector are not able to connect to a remote zookeeper

Bug #1533597 reported by Ankit Jain
This bug affects 1 person
Affects                                      Status         Importance  Assigned to  Milestone
Juniper Openstack (status tracked in Trunk)
  Trunk                                      Fix Committed  High        ted ghose

Bug Description

Local node: zookeeper stopped/down as shown below:

root@nodeg20:/etc/contrail# service zookeeper status
zookeeper stop/waiting <----------Local zookeeper down
root@nodeg20:/etc/contrail#

root@nodeg20:/etc/contrail# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-vrouter-agent active
contrail-vrouter-nodemgr active

== Contrail Control ==
supervisor-control: active
contrail-control active
contrail-control-nodemgr active
contrail-dns active
contrail-named active

== Contrail Analytics ==
supervisor-analytics: active
contrail-alarm-gen active
contrail-analytics-api active
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector initializing (Zookeeper:Zookeeper connection down) <----not able to connect to remote zookeeper
contrail-topology initializing (Zookeeper:Zookeeper connection down) <---- not able to connect to remote zookeeper

== Contrail Config ==
supervisor-config: active
contrail-api:0 active
contrail-config-nodemgr active
contrail-device-manager backup
contrail-discovery:0 active
contrail-schema backup
contrail-svc-monitor EXITED
ifmap active

== Contrail Database ==
contrail-database: active
supervisor-database: active
contrail-database-nodemgr active
kafka active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

Remote node:

root@nodeg13:~# contrail-status
== Contrail Analytics ==
supervisor-analytics: active
contrail-alarm-gen active
contrail-analytics-api active
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

== Contrail Config ==
supervisor-config: active
contrail-api:0 active
contrail-config-nodemgr active
contrail-device-manager active
contrail-discovery:0 active
contrail-schema active
contrail-svc-monitor active
ifmap active

== Contrail Web UI ==
supervisor-webui: active
contrail-webui active
contrail-webui-middleware active

== Contrail Database ==
contrail-database: active
supervisor-database: active
contrail-database-nodemgr active
kafka active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

root@nodeg13:~# service zookeeper status
zookeeper start/running, process 15108 <----remote zookeeper is up
root@nodeg13:~#

testbed:

host1 = 'root@10.204.217.53'
host2 = 'root@10.204.217.60'
host3 = 'root@10.204.216.17'
    'cfgm': [host1,host2,host3],
    'webui': [host1],
    'openstack': [host1],
    'control': [host2, host3],
    'collector': [host1, host2, host3],
    'database': [host1, host2, host3],
    'compute': [host2, host3],

Tags: analytics
Ankit Jain (ankitja)
Changed in juniperopenstack:
milestone: none → r3.0-fcs
Sundaresan Rajangam (srajanga) wrote :

    def fixup_contrail_topology(self):
        conf_fl = '/etc/contrail/contrail-topology.conf'
        with settings(warn_only=True):
            local("[ -f %s ] || > %s" % (conf_fl, conf_fl))
        # <<<< Only the first node in the cassandra_server_list
        # [database-nodes] is provisioned for zookeeper in
        # contrail-topology and contrail-snmp-collector
        self.set_config(conf_fl, 'DEFAULTS', 'zookeeper',
                        self.cassandra_server_list[0][0] + ':2181')
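
For illustration only, here is a minimal sketch of how the zookeeper value could be built from every database node instead of just the first one. It assumes that each entry of cassandra_server_list is a (host, port) pair, as the [0][0] indexing in the snippet above suggests, and that the services accept a comma-separated host:port list. The helper name zookeeper_endpoints and the sample port values are hypothetical and are not taken from contrail-provisioning; the merged change may do this differently.

```python
def zookeeper_endpoints(server_list, port=2181):
    """Join every database node into a 'host:port,host:port' string.

    Hypothetical helper: only the host part (entry[0]) of each entry is
    used, and the zookeeper client port is assumed to be the same on
    every node.
    """
    return ','.join('%s:%d' % (entry[0], port) for entry in server_list)


# Sample data mirroring the three database nodes of the testbed in this
# report. The second element of each pair is an assumed placeholder.
cassandra_server_list = [
    ('10.204.217.53', '9160'),
    ('10.204.217.60', '9160'),
    ('10.204.216.17', '9160'),
]

print(zookeeper_endpoints(cassandra_server_list))
# prints 10.204.217.53:2181,10.204.217.60:2181,10.204.216.17:2181
```

With a value like this in the 'zookeeper' option, the service is no longer tied to the zookeeper instance on a single database node.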

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/16325
Submitter: ted ghose (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16325
Committed: http://github.org/Juniper/contrail-provisioning/commit/d31d409654cd5b23ad6edd2ac75c470bac8420e0
Submitter: Zuul
Branch: master

commit d31d409654cd5b23ad6edd2ac75c470bac8420e0
Author: Ted Ghose <email address hidden>
Date: Fri Jan 15 12:06:46 2016 -0800

provitioning multiple zookeeper 4 snmp & topology

Build 2697: On multi node setup, when zookeeper on one of the db
nodes goes down, contrail-topology/snmp collector not able to
connect to remote zookeeper
Closes-Bug: 1533597

Change-Id: I8045ee5aacbf4579d027baf2c0603a1f97b8b72c
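
The behaviour the report asks for, falling back to a remote zookeeper when the local one is down, can be sketched roughly as follows. This is not the code from the commit and not how contrail-topology or contrail-snmp-collector actually connect. It is a standard-library illustration that walks a comma-separated host:port list and returns the first endpoint that accepts a TCP connection. The function name first_reachable is hypothetical.

```python
import socket


def first_reachable(endpoints, timeout=1.0):
    """Return the first 'host:port' entry that accepts a TCP connection.

    `endpoints` is a comma-separated string such as
    '10.204.217.53:2181,10.204.217.60:2181'. Returns None when no entry
    is reachable. A successful TCP connect only shows that something is
    listening on the port, not that zookeeper itself is healthy.
    """
    for endpoint in endpoints.split(','):
        endpoint = endpoint.strip()
        host, _, port = endpoint.rpartition(':')
        try:
            # Open and immediately close the connection as a probe.
            with socket.create_connection((host, int(port)), timeout=timeout):
                return endpoint
        except OSError:
            # Refused, timed out, or unresolvable: try the next node.
            continue
    return None
```

With the local zookeeper stopped as on nodeg20 above, a call such as first_reachable('10.204.217.53:2181,10.204.217.60:2181,10.204.216.17:2181') would skip the node that is down and return one of the remaining entries, provided at least one of them is listening.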
