Vcenter-only: service svc-monitor failed after reboot of controller

Bug #1607485 reported by Sarath
Affects                  Status  Importance  Assigned to   Milestone
Juniper Openstack        New     Medium      Rudra Rugge
Juniper Openstack R3.1   New     Medium      Rudra Rugge

Bug Description

With the latest build #6, the svc-monitor service fails after a reboot of the controller.

This is a High-Availability setup with 3 nodes configured as "Config nodes".
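
For reference, the same check can be repeated on each of the three config nodes (node addresses taken from the Zookeeper server list in the log below; the loop itself is only an illustrative sketch):

# Illustrative only: check svc-monitor state on every config node after the reboot.
# The three IPs are the config/Zookeeper nodes seen in the svc-monitor log below.
for node in 172.16.80.2 172.16.80.13 172.16.80.4; do
    echo "== $node =="
    ssh root@$node "contrail-status | grep -i svc-monitor"
done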

>>> svc-monitor log

07/27/2016 04:08:01 PM [contrail-svc-monitor]: SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = oblocknode04 process_status = [ << module_id = contrail-svc-monitor instance_id = 0 state = Non-Functional connection_infos = [ << type = Zookeeper name = Zookeeper server_addrs = [ 172.16.80.2:2181, 172.16.80.13:2181, 172.16.80.4:2181, ] status = Up description = >>, << type = Collector name = server_addrs = [ , ] status = Down description = none to Idle on EvStart >>, << type = Database name = Cassandra server_addrs = [ 172.16.80.2:9160, 172.16.80.13:9160, 172.16.80.4:9160, ] status = Up description = >>, << type = Discovery name = Collector server_addrs = [ 172.16.80.4:5998, ] status = Initializing description = Subscribe >>, << type = Database name = RabbitMQ server_addrs = [ 172.16.80.2:5672, 172.16.80.13:5672, 172.16.80.4:5672, ] status = Up description = >>, ] description = Collector, Discovery:Collector[Subscribe] connection down >>, ] >>
07/27/2016 04:08:01 PM [contrail-svc-monitor]: SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = oblocknode04 process_status = [ << module_id = contrail-svc-monitor instance_id = 0 state = Non-Functional connection_infos = [ << type = Zookeeper name = Zookeeper server_addrs = [ 172.16.80.2:2181, 172.16.80.13:2181, 172.16.80.4:2181, ] status = Up description = >>, << type = Database name = RabbitMQ server_addrs = [ 172.16.80.2:5672, 172.16.80.13:5672, 172.16.80.4:5672, ] status = Up description = >>, << type = Collector name = server_addrs = [ , ] status = Down description = none to Idle on EvStart >>, << type = Discovery name = Collector server_addrs = [ 172.16.80.4:5998, ] status = Initializing description = Subscribe >>, << type = Database name = Cassandra server_addrs = [ 172.16.80.2:9160, 172.16.80.13:9160, 172.16.80.4:9160, ] status = Up description = >>, << type = ApiServer name = ApiServer server_addrs = [ 172.16.80.4:8082, ] status = Initializing >>, ] description = Collector, Discovery:Collector[Subscribe], ApiServer:ApiServer[None] connection down >>, ] >>
07/27/2016 04:08:04 PM [contrail-svc-monitor]: SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = oblocknode04 process_status = [ << module_id = contrail-svc-monitor instance_id = 0 state = Non-Functional connection_infos = [ << type = Zookeeper name = Zookeeper server_addrs = [ 172.16.80.2:2181, 172.16.80.13:2181, 172.16.80.4:2181, ] status = Up description = >>, << type = Database name = RabbitMQ server_addrs = [ 172.16.80.2:5672, 172.16.80.13:5672, 172.16.80.4:5672, ] status = Up description = >>, << type = Collector name = server_addrs = [ , ] status = Down description = none to Idle on EvStart >>, << type = Discovery name = Collector server_addrs = [ 172.16.80.4:5998, ] status = Down description = Subscribe - Status Code 503 >>, << type = Database name = Cassandra server_addrs = [ 172.16.80.2:9160, 172.16.80.13:9160, 172.16.80.4:9160, ] status = Up description = >>, << type = ApiServer name = ApiServer server_addrs = [ 172.16.80.4:8082, ] status = Initializing >>, ] description = Collector, Discovery:Collector[Subscribe - Status Code 503], ApiServer:ApiServer[None] connection down >>, ] >>
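
Note the pattern in the trace above: Zookeeper, Cassandra and RabbitMQ report Up, but the Collector connection is Down and the Discovery subscription to the Collector fails with status code 503, which is what keeps svc-monitor Non-Functional. One quick check is to list what the discovery server actually has registered (the /services.json path is assumed from the Contrail 3.x discovery REST interface; verify against this build):

# Sketch only: list publishers registered with the discovery server on 172.16.80.4.
# Endpoint path is an assumption for Contrail 3.x; adjust if it differs in this build.
curl -s http://172.16.80.4:5998/services.json | python -m json.tool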

root@oblocknode04:/var/log/contrail#
root@oblocknode04:/var/log/contrail# contrail-status
== Contrail Control ==
supervisor-control: active
contrail-control active
contrail-control-nodemgr active
contrail-dns active
contrail-named active

== Contrail Analytics ==
supervisor-analytics: active
contrail-alarm-gen active
contrail-analytics-api active
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

== Contrail Config ==
supervisor-config: active
contrail-api:0 active
contrail-config-nodemgr active
contrail-device-manager active
contrail-discovery:0 active
contrail-schema active
contrail-svc-monitor failed
ifmap active

== Contrail Database ==
contrail-database: active

== Contrail Supervisor Database ==
supervisor-database: active
contrail-database-nodemgr active
kafka active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

root@oblocknode04:/var/log/contrail#

root@oblocknode04:/var/log/contrail#
root@oblocknode04:/var/log/contrail# contrail-version
Package                                Version                        Build-ID | Repo | Package Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-analytics                     3.1.0.0-6                      6
contrail-config                        3.1.0.0-6                      6
contrail-config-vmware                 3.1.0.0-6                      6
contrail-control                       3.1.0.0-6                      6
contrail-database-common               3.1.0.0-6                      6
contrail-dns                           3.1.0.0-6                      6
contrail-docs                          3.1.0.0-6                      6
contrail-f5                            3.1.0.0-6                      6
contrail-fabric-utils                  3.1.0.0-6                      6
contrail-install-packages              3.1.0.0-6~vcenter              6
contrail-install-vcenter-plugin        3.1.0.0-07262016               6
contrail-lib                           3.1.0.0-6                      6
contrail-nodemgr                       3.1.0.0-6                      6
contrail-openstack-analytics           3.1.0.0-6                      6
contrail-openstack-control             3.1.0.0-6                      6
contrail-openstack-database            3.1.0.0-6                      6
contrail-setup                         3.1.0.0-6                      6
contrail-utils                         3.1.0.0-6                      6
contrail-vmware-config                 3.1.0.0-6                      6
ifmap-python-client                    0.1-2                          6
ifmap-server                           0.3.2-1contrail2               6
python-contrail                        3.1.0.0-6                      6
root@oblocknode04:/var/log/contrail#

Revision history for this message
Sarath (nsarath) wrote :

nsarath@ubuntu-build04:/auto/cores/891$ pwd
/auto/cores/891
nsarath@ubuntu-build04:/auto/cores/891$ ls -l
total 1008856
-rwxrwxrwx 1 nsarath test 403988480 Jul 28 11:45 Ctrl-A-log.tar
-rwxrwxrwx 1 nsarath test 287825920 Jul 28 11:45 Ctrl-B-log.tar
-rwxrwxrwx 1 nsarath test 241838080 Jul 28 11:45 Ctrl-C-log.tar
-rwxrwxrwx 1 nsarath test 19077120 Jul 28 11:45 Vrtr-0-log.tar
-rwxrwxrwx 1 nsarath test 19046400 Jul 28 11:45 Vrtr-1-log.tar
-rwxrwxrwx 1 nsarath test 19087360 Jul 28 11:45 Vrtr-3-log.tar
-rwxrwxrwx 1 nsarath test 19046400 Jul 28 11:45 Vrtr-5-log.tar
-rwxrwxrwx 1 nsarath test 19056640 Jul 28 11:45 Vrtr-7-log.tar
nsarath@ubuntu-build04:/auto/cores/891$

Sarath (nsarath)
Changed in juniperopenstack:
importance: Critical → Medium
Revision history for this message
Sarath (nsarath) wrote :

Workaround: restarting the service recovers it back to the "backup" state.
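
For reference, the restart used (a minimal sketch; service name as reported by contrail-status above):

# Restart the failed service on the rebooted controller, then re-check its state.
service contrail-svc-monitor restart
contrail-status | grep svc-monitor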

Changed in juniperopenstack:
milestone: r3.1.0.0-fcs → none
Sarath (nsarath)
tags: added: releasenote