delete controller and re-provisioning stuck at "uninstall_collector"

Bug #1537290 reported by Sarath
This bug affects 1 person
Affects             Status           Importance   Assigned to   Milestone
Juniper Openstack   (status tracked in Trunk)
  Trunk             Fix Committed    Critical     Thilak Raj

Bug Description

Showed the setup in its problem state to Thilak for debugging; it looks like a possible Zookeeper issue. Details are below.
Thilak is using the setup for triaging.

root@a5d11e14:~#
root@a5d11e14:~# contrail-status
== Contrail Control ==
supervisor-control: active
contrail-control active
contrail-control-nodemgr active
contrail-dns active
contrail-named active

== Contrail Analytics ==
supervisor-analytics: active
contrail-alarm-gen initializing (Zookeeper:Zookeeper connection down)
contrail-analytics-api initializing (UvePartitions:UVE-Aggregation connection down)
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector initializing (Zookeeper:Zookeeper connection down)
contrail-topology initializing (Zookeeper:Zookeeper connection down)

== Contrail Config ==
supervisor-config: active
contrail-api:0 initializing (Zookeeper:Zookeeper connection down)
contrail-config-nodemgr active
contrail-device-manager backup
contrail-discovery:0 active
contrail-schema backup
contrail-svc-monitor backup
ifmap active

== Contrail Web UI ==
supervisor-webui: active
contrail-webui active
contrail-webui-middleware active

== Contrail Database ==
contrail-database: active
supervisor-database: active
contrail-database-nodemgr active
kafka active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

root@a5d11e14:~#
root@a5d11e14:~#
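
For triage, the status output above can be reduced to just the services that are not healthy. The following is a minimal sketch, not part of Contrail: the function name `non_active_services` and the `OK_STATES` set are hypothetical, and treating `backup` as healthy is an assumption about standby services in an HA setup.

```python
import re

# Assumption: "backup" is a normal state for standby services in an HA setup.
OK_STATES = {"active", "backup"}

def non_active_services(status_text):
    """Return {section: [(service, state, detail), ...]} for unhealthy services."""
    problems = {}
    section = None
    for raw in status_text.splitlines():
        line = raw.strip()
        # Section headers look like "== Contrail Config =="
        header = re.match(r"^==\s*(.+?)\s*==$", line)
        if header:
            section = header.group(1)
            continue
        # Service lines look like "<name>[:] <state> [(detail)]"
        m = re.match(r"^(\S+?):?\s+(\S+)(?:\s+\((.*)\))?$", line)
        if not m or section is None:
            continue
        name, state, detail = m.groups()
        if state not in OK_STATES:
            problems.setdefault(section, []).append((name, state, detail or ""))
    return problems
```

Fed the paste above, this would list only the services reported as `initializing`, grouped by section, which makes the shared Zookeeper dependency easier to spot.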

root@a5d11e14:~# tail -f /var/log/zookeeper/zookeeper.log
2016-01-22 16:38:32,393 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.94:33841 (no session established for client)
2016-01-22 16:38:32,438 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:59965
2016-01-22 16:38:32,438 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,438 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:59965 (no session established for client)
2016-01-22 16:38:32,458 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:59968
2016-01-22 16:38:32,459 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,459 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:59968 (no session established for client)
2016-01-22 16:38:32,519 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:59980
2016-01-22 16:38:32,519 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,520 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:59980 (no session established for client)
2016-01-22 16:38:32,712 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:60002
2016-01-22 16:38:32,712 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,712 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:60002 (no session established for client)
2016-01-22 16:38:32,715 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.90:52779
2016-01-22 16:38:32,715 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,716 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.90:52779 (no session established for client)
2016-01-22 16:38:32,805 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.90:52795
2016-01-22 16:38:32,805 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,805 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.90:52795 (no session established for client)
2016-01-22 16:38:32,874 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:60049
2016-01-22 16:38:32,874 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,874 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:60049 (no session established for client)
2016-01-22 16:38:32,927 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.94:33900
2016-01-22 16:38:32,927 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,927 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.94:33900 (no session established for client)
2016-01-22 16:38:32,948 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:60058
2016-01-22 16:38:32,948 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,948 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /10.87.143.92:60058 (no session established for client)
2016-01-22 16:38:32,985 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.87.143.92:60064
2016-01-22 16:38:32,985 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-01-22 16:38:32,985 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection f
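
To see which clients keep being turned away, the log lines above can be tallied per client address. This is a rough sketch that only assumes the line format shown in this excerpt; the function name `rejected_by_client` is hypothetical.

```python
import re
from collections import Counter

# Matches the "Closed socket connection ... (no session established ...)" lines
# in the excerpt above and captures the client IP (the port is ignored).
CLOSED_NO_SESSION = re.compile(
    r"Closed socket connection for client /([\d.]+):\d+ "
    r"\(no session established for client\)"
)

def rejected_by_client(log_lines):
    """Count connections closed without a session, keyed by client IP."""
    counts = Counter()
    for line in log_lines:
        m = CLOSED_NO_SESSION.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Truncated lines (such as the last one in the paste) simply do not match and are skipped.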

root@Blaster-nsarath-004:~# dpkg -l | grep contrail
ii contrail-server-manager 3.0-2697 all Contrail Server Manager - Server Package
ii contrail-server-manager-client 3.0-2697 all Contrail Server Manager - Client Package
ii contrail-server-manager-cliff-client 3.0-2697 all Contrail Server Manager Cliff Client Package
ii contrail-server-manager-installer 3.0-2697~juno all Contrail Server Manager Installer Packages - Wrapper Package for Server Manager and Related Debian Packages
ii contrail-server-manager-monitoring 3.0-2697 all Contrail Server-manager Monitoring API Library package
ii contrail-web-core 3.0-2697 amd64 Contrail Systems Web UI
ii contrail-web-server-manager 3.0-2697 amd64 Contrail Systems Web UI Server Manager Feature
ii nodejs 0.10.35-1contrail1 amd64 Node.js event-based server-side javascript engine
ii python-backports.ssl-match-hostname 3.4.0.2-1contrail1 all The ssl.match_hostname() function from Python 3.4
ii python-certifi 1.0.1-1contrail1 all Python SSL Certificates
ii python-consistent-hash 1.0-0contrail1 amd64 <insert up to 60 chars description>
ii python-contrail 3.0-2697 amd64 OpenContrail python-libs
ii python-geventhttpclient 1.1.0-1contrail1 amd64 http client library for gevent
ii python-kazoo 1.3.1-1contrail2 all higher level API to Apache Zookeeper (Python 2)
ii python-pycassa 1.11.0-1contrail2 all Client library for Apache Cassandra
ii python-xmltodict 0.9.0-1contrail1 all Makes working with XML feel like you are working with JSON
root@Blaster-nsarath-004:~#

Sarath (nsarath)
description: updated
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/16846
Submitter: Thilak Raj (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16846
Committed: http://github.org/Juniper/contrail-puppet/commit/7439ea55a2d2f7063ea75831af8d27451f1647a8
Submitter: Zuul
Branch: master

commit 7439ea55a2d2f7063ea75831af8d27451f1647a8
Author: tsurendra <email address hidden>
Date: Wed Feb 3 14:02:52 2016 -0800

Closes-Bug: #1538357
Closes-Bug: #1538298
Closes-Bug: #1537290

Config was stuck when adding a node at the beginning, as SSL keys were
not re-distributed from the first new node.

Controller gets stuck at openstack_started:
when a node was deleted and re-added,
the old keepalived instance was still running.

Stuck at "uninstall_collector":
Zookeeper was not getting restarted as a result of the
commit below:
https://github.com/Juniper/contrail-puppet/commit/e31d7a9ce230e2991d1a22cf4ecbd8b7f9ca2f88
Nitish provided a fix and tested it.

Change-Id: I5d77c6dae0fa527deb2331cc184b4f297ad39ff7
