Analytics containers keep restarting with ocata-5.0-80 build

Bug #1774474 reported by Venkatesh Velpula on 2018-05-31
This bug affects 1 person
Affects              Status   Importance  Assigned to        Milestone
Juniper Openstack    (status tracked in Trunk)
  R5.0               Invalid  Critical    Venkatesh Velpula
  Trunk              Invalid  Critical    Venkatesh Velpula

Bug Description

The following analytics containers keep restarting:
  contrail-analytics-alarm-gen:ocata-5.0-80
  contrail-analytics-api:ocata-5.0-80
  contrail-analytics-snmp-collector:ocata-5.0-80
  contrail-analytics-topology:ocata-5.0-80

With the same latest ansible-deployer code, we don't see this issue on build 79.

build    : ocata-5.0-80
deployer : ansible-deployer
hostos   : centos7.5
registry : 10.204.217.152:5000

setup:
controller/master : nodec19, nodec20, nodec21 (HA with multi-interface)
compute/minion    : nodei16, nodei18

docker log snippet
===================

+ mv /etc/contrail/vnc_api_lib.ini.tmp /etc/contrail/vnc_api_lib.ini
+ exec /usr/bin/python /usr/bin/contrail-analytics-api -c /etc/contrail/contrail-analytics-api.conf -c /etc/contrail/contrail-keystone-auth.conf
-c /etc/contrail/contrail-analytics-api.conf -c /etc/contrail/contrail-keystone-auth.conf
05/31/2018 08:00:32 PM [contrail-analytics-api]: SANDESH: CONNECT TO COLLECTOR: True
05/31/2018 08:00:32 PM [contrail-analytics-api]: SANDESH: Logging: LEVEL: [SYS_INFO] -> [SYS_DEBUG]
05/31/2018 08:00:32 PM [contrail-analytics-api]: SANDESH: Logging: FILE: [None] -> [/var/log/contrail/contrail-analytics-api.log]
Traceback (most recent call last):
  File "/usr/bin/contrail-analytics-api", line 9, in <module>
    load_entry_point('opserver==0.1dev', 'console_scripts', 'contrail-analytics-api')()
  File "/usr/lib/python2.7/site-packages/opserver/opserver.py", line 2638, in main
    opserver = OpServer(args_str)
  File "/usr/lib/python2.7/site-packages/opserver/opserver.py", line 702, in __init__
    staticmethod(ConnectionState.get_process_state_cb),
AttributeError: type object 'ConnectionState' has no attribute 'get_process_state_cb'

++++ echo 10.204.217.98
+++ local server_address=10.204.217.98
+++ extended_server_list+='10.204.217.98:9092 '
+++ '[' -n '10.204.217.52:9092 10.204.217.71:9092 10.204.217.98:9092 ' ']'
+++ echo '10.204.217.52:9092 10.204.217.71:9092 10.204.217.98:9092'
++ KAFKA_SERVERS='10.204.217.52:9092 10.204.217.71:9092 10.204.217.98:9092'
++ ANALYTICS_API_VIP=
++ CONFIG_API_VIP=
++ LINKLOCAL_SERVICE_PORT=80
++ LINKLOCAL_SERVICE_NAME=metadata
++ LINKLOCAL_SERVICE_IP=169.254.169.254
++ IPFABRIC_SERVICE_PORT=8775
++ IPFABRIC_SERVICE_HOST=
++ XMPP_SSL_ENABLE=False
++ XMPP_SERVER_CERTFILE=/etc/contrail/ssl/certs/server.pem
++ XMPP_SERVER_KEYFILE=/etc/contrail/ssl/private/server-privkey.pem
++ XMPP_SERVER_CA_CERTFILE=/etc/contrail/ssl/certs/ca-cert.pem
++ INTROSPECT_SSL_ENABLE=False
++ INTROSPECT_SSL_INSECURE=True
++ INTROSPECT_CERTFILE=/etc/contrail/ssl/certs/server.pem
++ INTROSPECT_KEYFILE=/etc/contrail/ssl/private/server-privkey.pem
++ INTROSPECT_CA_CERTFILE=/etc/contrail/ssl/certs/ca-cert.pem
++ SANDESH_SSL_ENABLE=False
++ SANDESH_CERTFILE=/etc/contrail/ssl/certs/server.pem
++ SANDESH_KEYFILE=/etc/contrail/ssl/private/server-privkey.pem
++ SANDESH_CA_CERTFILE=/etc/contrail/ssl/certs/ca-cert.pem
++ is_enabled False
++ local val=false
++ [[ false == \t\r\u\e ]]
++ [[ false == \y\e\s ]]
++ [[ false == \e\n\a\b\l\e\d ]]
++ read -r -d '' sandesh_client_config
++ true
++ is_enabled False
++ local val=false
++ [[ false == \t\r\u\e ]]
++ [[ false == \y\e\s ]]
++ [[ false == \e\n\a\b\l\e\d ]]
++ xmpp_certs_config=
++ METADATA_SSL_ENABLE=false
++ METADATA_SSL_CERTFILE=
++ METADATA_SSL_KEYFILE=
++ METADATA_SSL_CA_CERTFILE=
++ METADATA_SSL_CERT_TYPE=
++ RABBITMQ_VHOST=/
++ RABBITMQ_USER=guest
++ RABBITMQ_PASSWORD=guest
++ RABBITMQ_USE_SSL=False
++ RABBITMQ_SSL_VER=
++ RABBITMQ_CLIENT_SSL_CERTFILE=/etc/contrail/ssl/certs/server.pem
++ RABBITMQ_CLIENT_SSL_KEYFILE=/etc/contrail/ssl/private/server-privkey.pem
++ RABBITMQ_CLIENT_SSL_CACERTFILE=/etc/contrail/ssl/certs/ca-cert.pem
++ read -r -d '' rabbitmq_config
++ true
++ read -r -d '' rabbit_config
++ true
++ is_enabled False
++ local val=false
++ [[ false == \t\r\u\e ]]
++ [[ false == \y\e\s ]]
++ [[ false == \e\n\a\b\l\e\d ]]
++ AGENT_MODE=kernel
++ DPDK_UIO_DRIVER=uio_pci_generic
++ CPU_CORE_MASK=0x01
++ HUGE_PAGES=1024
++ HUGE_PAGES_DIR=/dev/hugepages
++ DPDK_MEM_PER_SOCKET=1024
++ VHOST_CONFIG_DIR=/etc/sysconfig/network-scripts
++ TSN_EVPN_MODE=False
++ VROUTER_CRYPT_INTERFACE=crypt0
++ VROUTER_DECRYPT_INTERFACE=decrypt0
++ VROUTER_DECRYPT_KEY=15
++ DIST_SNAT_PROTO_PORT_LIST=
++ FLOW_EXPORT_RATE=0
++ FABRIC_SNAT_HASH_TABLE_SIZE=4096
++ PRIORITY_ID=
++ PRIORITY_BANDWIDTH=
++ PRIORITY_SCHEDULING=
++ QOS_QUEUE_ID=
++ QOS_LOGICAL_QUEUES=
++ QOS_DEF_HW_QUEUE=False
++ PRIORITY_TAGGING=True
+ pre_start_init
+ wait_certs_if_ssl_enabled
+ is_ssl_enabled
+ is_enabled False
+ local val=false
+ [[ false == \t\r\u\e ]]
+ [[ false == \y\e\s ]]
+ [[ false == \e\n\a\b\l\e\d ]]
+ is_enabled False
+ local val=false
+ [[ false == \t\r\u\e ]]
+ [[ false == \y\e\s ]]
+ [[ false == \e\n\a\b\l\e\d ]]
+ is_enabled False
+ local val=false
+ [[ false == \t\r\u\e ]]
+ [[ false == \y\e\s ]]
+ [[ false == \e\n\a\b\l\e\d ]]
+ is_enabled False
+ local val=false
+ [[ false == \t\r\u\e ]]
+ [[ false == \y\e\s ]]
+ [[ false == \e\n\a\b\l\e\d ]]
+ return
++ get_listen_ip_for_node ANALYTICS
+++ find_my_ip_and_order_for_node ANALYTICS
+++ cut -d ' ' -f 1
+++ local server_typ=ANALYTICS_NODES
+++ local server_list=
+++ IFS=,
+++ read -ra server_list
++++ get_local_ips
++++ tr '\n' ,
++++ cat /proc/net/fib_trie
++++ awk '/32 host/ { print f } {f=$2}'
+++ local local_ips=,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,,
+++ local ord=1
+++ for server in '"${server_list[@]}"'
+++ [[ ,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,, =~ ,10\.204\.217\.52, ]]
+++ (( ord+=1 ))
+++ for server in '"${server_list[@]}"'
+++ [[ ,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,10.204.217.71,77.77.1.30,127.0.0.1,172.17.0.1,, =~ ,10\.204\.217\.71, ]]
+++ echo 10.204.217.71 2
+++ return
++ local ip=10.204.217.71
++ [[ -z 10.204.217.71 ]]
++ echo 10.204.217.71
+ host_ip=10.204.217.71
+ cat
+ add_ini_params_from_env ANALYTICS_API /etc/contrail/contrail-analytics-api.conf
+ local service_name=ANALYTICS_API
+ local cfg_path=/etc/contrail/contrail-analytics-api.conf
+ local delim=__
++ grep '^ANALYTICS_API__.*__.*=.*$'
++ sort
++ cut -d = -f 1
++ set -o posix
++ set
++ sed 's/^ANALYTICS_API__//g'
+ local vars=
+ local section=
+ set_third_party_auth_config
+ [[ noauth != \k\e\y\s\t\o\n\e ]]
+ return
+ set_vnc_api_lib_ini
+ local tmp_file=/etc/contrail/vnc_api_lib.ini.tmp
+ cat
+ [[ noauth == \k\e\y\s\t\o\n\e ]]
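
The trace above shows how the entrypoint picks its listen address: it walks the configured node list in order, compares each entry against the local IPs read from /proc/net/fib_trie, and echoes the first match together with its 1-based position ("10.204.217.71 2"). A minimal Python sketch of that selection logic follows. The function name and the third address in the sample list are assumptions for illustration (the trace does not print the full ANALYTICS_NODES value); this is not the deployer's actual code.

```python
def find_my_ip_and_order(server_list, local_ips):
    """Return (ip, 1-based position) of the first listed server that is a
    local IP, or None if no listed server is local.

    Hypothetical helper mirroring the shell trace, not the real entrypoint.
    """
    local = set(local_ips)
    for order, server in enumerate(server_list, start=1):
        if server in local:
            return server, order
    return None


# First two entries appear in the trace; the third is assumed to match
# the Kafka server list shown earlier in the same log.
servers = ["10.204.217.52", "10.204.217.71", "10.204.217.98"]
# Local addresses as reported by the trace (duplicates removed).
local_ips = ["10.204.217.71", "77.77.1.30", "127.0.0.1", "172.17.0.1"]

print(find_my_ip_and_order(servers, local_ips))
```

Address selection succeeds in the trace, so this step is unlikely to be related to the restart loop.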

[root@nodeg31 ~]# contrail-status
Pod         Service         Original Name                          State       Status
analytics   alarm-gen       contrail-analytics-alarm-gen           restarting  Restarting (1) 2 minutes ago
analytics   api             contrail-analytics-api                 restarting  Restarting (1) 2 minutes ago
analytics   collector       contrail-analytics-collector           running     Up 2 hours
analytics   nodemgr         contrail-nodemgr                       running     Up 2 hours
analytics   query-engine    contrail-analytics-query-engine        running     Up 2 hours
analytics   snmp-collector  contrail-analytics-snmp-collector      restarting  Restarting (1) 44 minutes ago
analytics   topology        contrail-analytics-topology            restarting  Restarting (1) 44 minutes ago
config      api             contrail-controller-config-api         running     Up 2 hours
config      cassandra       contrail-external-cassandra            running     Up 2 hours
config      device-manager  contrail-controller-config-devicemgr   running     Up 2 hours
config      nodemgr         contrail-nodemgr                       running     Up About a minute
config      rabbitmq        contrail-external-rabbitmq             running     Up 2 hours
config      schema          contrail-controller-config-schema      running     Up 2 hours
config      svc-monitor     contrail-controller-config-svcmonitor  running     Up 2 hours
config      zookeeper       contrail-external-zookeeper            running     Up 2 hours
control     control         contrail-controller-control-control    running     Up 2 hours
control     dns             contrail-controller-control-dns        running     Up 2 hours
control     named           contrail-controller-control-named      running     Up 2 hours
control     nodemgr         contrail-nodemgr                       running     Up 2 hours
database    cassandra       contrail-external-cassandra            running     Up 2 hours
database    kafka           contrail-external-kafka                running     Up 2 hours
database    nodemgr         contrail-nodemgr                       running     Up 2 hours
database    zookeeper       contrail-external-zookeeper            running     Up 2 hours
kubernetes  kube-manager    contrail-kubernetes-kube-manager       running     Up 2 hours
webui       job             contrail-controller-webui-job          running     Up 2 hours
webui       web             contrail-controller-webui-web          running     Up 2 hours

== Contrail control ==
control: active
nodemgr: active
named: active
dns: active

== Contrail kubernetes ==
kube-manager: active

== Contrail database ==
kafka: active
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
cassandra: active

== Contrail analytics ==
snmp-collector: inactive
query-engine: active
api: inactive<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
alarm-gen: inactive
nodemgr: active
collector: active
topology: inactive

== Contrail webui ==
web: active
job: active

== Contrail config ==
api: active
zookeeper: active
svc-monitor: backup
nodemgr: initializing
device-manager: backup
cassandra: active
rabbitmq: active
schema: backup

[root@nodeg31 ~]#
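
To pull just the failing containers out of output like the above, the container table can be filtered on its State column. The sketch below is a small Python helper that assumes the five-column layout shown here (Pod, Service, Original Name, State, free-form Status); the function name is made up for illustration and it is not part of contrail-status.

```python
def restarting_services(status_text):
    """Return (pod, service, container) tuples for rows whose State is 'restarting'.

    Assumes whitespace-separated columns: Pod, Service, Original Name,
    State, followed by a free-form Status. Hypothetical helper.
    """
    found = []
    for line in status_text.splitlines():
        fields = line.split()
        # Lines with fewer than four fields (section headers, blanks) are skipped.
        if len(fields) >= 4 and fields[3] == "restarting":
            found.append((fields[0], fields[1], fields[2]))
    return found


# Rows copied from the contrail-status output in this report.
sample = """\
Pod Service Original Name State Status
analytics alarm-gen contrail-analytics-alarm-gen restarting Restarting (1) 2 minutes ago
analytics collector contrail-analytics-collector running Up 2 hours
analytics topology contrail-analytics-topology restarting Restarting (1) 44 minutes ago
"""

for pod, service, container in restarting_services(sample):
    print(pod, service, container)
```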

Venkatesh Velpula (vvelpula) wrote :

adding instances.yaml file

description: updated
Jeba Paulaiyan (jebap) on 2018-05-31
tags: added: beta-blocker fabric
description: updated
Sundaresan Rajangam (srajanga) wrote :

Not all commits from https://bugs.launchpad.net/juniperopenstack/+bug/1763550 made it into build #80, hence the issue. Please verify with build #81.
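
The AttributeError in the traceback is consistent with this kind of partial landing: the caller already references `ConnectionState.get_process_state_cb`, while the class installed in the image does not yet define it. The Python sketch below shows one way to check for such a mismatch before starting a service. Both classes are hypothetical stand-ins written for illustration, not the real ConnectionState from the Contrail libraries, and the helper name is an assumption.

```python
def missing_attributes(cls, required):
    """Return the names in `required` that `cls` does not provide."""
    return [name for name in required if not hasattr(cls, name)]


class OldConnectionState:
    """Hypothetical stand-in for the class as shipped in a partial build."""


class NewConnectionState(OldConnectionState):
    """Hypothetical stand-in for the class once every commit is present."""

    @staticmethod
    def get_process_state_cb():
        # Placeholder body; the real callback's behaviour is not shown in this report.
        return None


required = ["get_process_state_cb"]
print(missing_attributes(OldConnectionState, required))
print(missing_attributes(NewConnectionState, required))
```

A non-empty result for the installed class would point to mismatched package versions inside the image rather than a configuration problem.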

Venkatesh Velpula (vvelpula) wrote :

This works with the latest images.
