collector cores and contrail-alarm-gen/contrail-analytics-api in initializing (Redis-UVE.. connection down) on analytics

Bug #1691002 reported by wenqing liang
This bug affects 4 people
Affects            Status                   Importance  Assigned to  Milestone
Juniper Openstack  Status tracked in Trunk
  R4.0             Fix Released             High        Arvind
  Trunk            Fix Released             High        Arvind

Bug Description

Seen on an R4.0 build 5 Mitaka contrail-ha cluster.

root@server5:~# docker exec -it analytics bash
root@server5(analytics):/# contrail-status
== Contrail Analytics ==
contrail-alarm-gen initializing (Redis-UVE:10.10.0.5:6381[None] connection down)
contrail-analytics-api initializing (Redis-UVE:10.10.0.5:6381[None] connection down)
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

========Run time service failures=============
/var/crashes/core.contrail-collec.4142.server5.1494885701
/var/crashes/core.contrail-collec.2787.server5.1494884396

The cores are copied to /cs-shared/bugs/.

No vizd binary was found in /github-build/R4.0/5/ubuntu-14-04/mitaka/sandbox/build/debug/analytics, so gdb could not be run to get a backtrace from the cores.
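
For triage, the contrail-status output above can be reduced to the services that are not active plus the core files it lists. The following is only a rough sketch: the line layout "<service> <state> (<detail>)" and the core file naming "core.<executable>.<pid>.<host>.<epoch seconds>" are inferred from this one report, and the names `triage`, `SERVICE_RE` and `CORE_RE` are hypothetical, not part of any Contrail tool.

```python
import re
from datetime import datetime, timezone

# Output pasted from the bug description.
STATUS_OUTPUT = """\
== Contrail Analytics ==
contrail-alarm-gen initializing (Redis-UVE:10.10.0.5:6381[None] connection down)
contrail-analytics-api initializing (Redis-UVE:10.10.0.5:6381[None] connection down)
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

========Run time service failures=============
/var/crashes/core.contrail-collec.4142.server5.1494885701
/var/crashes/core.contrail-collec.2787.server5.1494884396
"""

# Assumed layout: "<service> <state> [(<detail>)]".
SERVICE_RE = re.compile(r"^(contrail-[\w-]+)\s+(\w+)(?:\s+\((.*)\))?\s*$")
# Assumed naming: core.<executable>.<pid>.<host>.<epoch seconds>.
# A host name containing dots would not match this simple pattern.
CORE_RE = re.compile(r"^(\S*/core\.([^.]+)\.(\d+)\.([^.]+)\.(\d+))$")


def triage(text):
    """Return (non-active services, core files) found in contrail-status text."""
    unhealthy, cores = [], []
    for raw in text.splitlines():
        line = raw.strip()
        svc = SERVICE_RE.match(line)
        if svc:
            name, state, detail = svc.groups()
            if state != "active":
                unhealthy.append((name, state, detail))
            continue
        core = CORE_RE.match(line)
        if core:
            path, exe, pid, host, epoch = core.groups()
            cores.append({
                "path": path,
                "exe": exe,
                "pid": int(pid),
                "host": host,
                # Interpret the trailing number as a Unix timestamp (assumption).
                "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
            })
    return unhealthy, cores


if __name__ == "__main__":
    bad, cores = triage(STATUS_OUTPUT)
    for name, state, detail in bad:
        print(f"{name}: {state} ({detail})")
    for c in cores:
        print(f"{c['exe']} pid={c['pid']} host={c['host']} at {c['time'].isoformat()}")
```

If the trailing number really is a Unix timestamp, the two collector cores were written about 22 minutes apart, which would suggest the collector crashed again after being restarted.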

wenqing liang (wliang): information type changed, Proprietary → Public
Jeba Paulaiyan (jebap): tags added: blocker, sanity

Arvind (arvindv) wrote :

(gdb) bt
#0 0x00002ae0cc414c37 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00002ae0cc418028 in __GI_abort () at abort.c:89
#2 0x00000000004bc5ea in rd_kafka_crash ()
#3 0x00000000004ce172 in rd_kafka_buf_grow ()
#4 0x00000000004c5d42 in rd_kafka_broker_thread_main ()
#5 0x00000000004e5919 in _thrd_wrapper_function ()
#6 0x00002ae0cadbb184 in start_thread (arg=0x2ae0e0803700)
    at pthread_create.c:312
#7 0x00002ae0cc4d837d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Arvind (arvindv) wrote :

librdkafka versions used on the build machines:

Ubuntu 16.04: 0.9.1-1
Ubuntu 14.04: 0.9.0-0contrail0

These packages are old, and the backtrace we see is confined to library calls. From a
cursory search, upgrading to a newer client library (librdkafka 0.9.4 or 0.9.5) is recommended:
https://github.com/edenhill/librdkafka/issues/511
https://github.com/edenhill/librdkafka/issues/781

For now we will upgrade the build machines to newer librdkafka builds. If the crash is hit again,
we would like to have the collector logs and the Kafka logs.
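
A build machine could be checked against the recommended minimum with something like the sketch below. It is illustrative only: the helper names are hypothetical, the package versions are the two quoted above, and the comparison assumes a Debian-style "<upstream>-<revision>" string whose upstream part is purely numeric and dotted.

```python
# Lowest librdkafka release mentioned as a recommended upgrade target.
MIN_RECOMMENDED = (0, 9, 4)

# Package versions reported for the build machines.
INSTALLED = {
    "16.04": "0.9.1-1",
    "14.04": "0.9.0-0contrail0",
}


def upstream_version(pkg_version):
    """Return the upstream part of '<upstream>-<revision>' as a tuple of ints."""
    # The revision is whatever follows the last hyphen; drop it if present.
    upstream = pkg_version.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


def needs_upgrade(pkg_version, minimum=MIN_RECOMMENDED):
    """True when the upstream version is older than the recommended minimum."""
    return upstream_version(pkg_version) < minimum


if __name__ == "__main__":
    for release, version in INSTALLED.items():
        verdict = "upgrade" if needs_upgrade(version) else "ok"
        print(f"Ubuntu {release}: librdkafka {version} -> {verdict}")
```

Both reported versions fall below 0.9.4 under this comparison.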

Vinod Nair (vinodnair) wrote :

A similar crash was seen on build 6 (Mitaka).
Cores and logs are in /cs-shared/bugs/1691002/6.
