SM:mainline:build3039:collector core in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

Bug #1664923 reported by sundarkh
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)
Series: Trunk
Status: Incomplete
Importance: Critical
Assigned to: sundarkh
Milestone: (none listed)

Bug Description

SM:mainline:build3039:collector core in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

1) Install SM mainline build 3039 (kilo) and reimage the HA target with Ubuntu 14.04.5. Both steps succeed.
2) Provision the cluster with mainline build 3039 (kilo). Provisioning completes, but the collector dumps core, which causes the collector and contrail-nodemanager to fail.

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f1293edfc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f1293edfc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f1293ee3028 in __GI_abort () at abort.c:89
#2 0x00007f1293ed8bf6 in __assert_fail_base (fmt=0x7f12940293b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x82b41e "ret",
    file=file@entry=0x82b708 "controller/src/analytics/OpServerProxy.cc", line=line@entry=832,
    function=function@entry=0x82ba00 "virtual bool OpServerProxy::DeleteUVEs(const string&, const string&, const string&, const string&)") at assert.c:92
#3 0x00007f1293ed8ca2 in __GI___assert_fail (assertion=0x82b41e "ret", file=0x82b708 "controller/src/analytics/OpServerProxy.cc", line=832,
    function=0x82ba00 "virtual bool OpServerProxy::DeleteUVEs(const string&, const string&, const string&, const string&)") at assert.c:101
#4 0x0000000000617698 in ?? ()
#5 0x00000000005aae61 in ?? ()
#6 0x000000000059992d in ?? ()
#7 0x000000000075e260 in ?? ()
#8 0x000000000075cf82 in ?? ()
#9 0x000000000075d735 in ?? ()
#10 0x000000000075b20b in ?? ()
#11 0x0000000000753a95 in ?? ()
#12 0x000000000075a635 in ?? ()
#13 0x000000000046ff2f in ?? ()
#14 0x00007f1295467b3a in ?? () from /usr/lib/libtbb.so.2
#15 0x00007f1295463816 in ?? () from /usr/lib/libtbb.so.2
#16 0x00007f1295462f4b in ?? () from /usr/lib/libtbb.so.2
#17 0x00007f129545f0ff in ?? () from /usr/lib/libtbb.so.2
#18 0x00007f129545f2f9 in ?? () from /usr/lib/libtbb.so.2
#19 0x00007f1295683184 in start_thread (arg=0x7f1264bf2700) at pthread_create.c:312
#20 0x00007f1293fa337d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) bt full
#0 0x00007f1293edfc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = 0
        pid = 3677
        selftid = 3751
#1 0x00007f1293ee3028 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x7ffffe413e99, sa_sigaction = 0x7ffffe413e99}, sa_mask = {__val = {139717769320708, 8566536, 832, 4294967295, 139717767963987,
              4294967296, 139716976377312, 0, 0, 0, 0, 0, 0, 21474836480, 139717820645376, 139717769335736}}, sa_flags = 8565790, sa_restorer = 0x82ba00}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f1293ed8bf6 in __assert_fail_base (fmt=0x7f12940293b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x82b41e "ret",
    file=file@entry=0x82b708 "controller/src/analytics/OpServerProxy.cc", line=line@entry=832,
    function=function@entry=0x82ba00 "virtual bool OpServerProxy::DeleteUVEs(const string&, const string&, const string&, const string&)") at assert.c:92
        str = 0x7f1218008500 "0\206"
        total = 4096
#3 0x00007f1293ed8ca2 in __GI___assert_fail (assertion=0x82b41e "ret", file=0x82b708 "controller/src/analytics/OpServerProxy.cc", line=832,
    function=0x82ba00 "virtual bool OpServerProxy::DeleteUVEs(const string&, const string&, const string&, const string&)") at assert.c:101
No locals.
#4 0x0000000000617698 in ?? ()
No symbol table info available.
#5 0x00000000005aae61 in ?? ()
No symbol table info available.
#6 0x000000000059992d in ?? ()
No symbol table info available.
#7 0x000000000075e260 in ?? ()
No symbol table info available.
#8 0x000000000075cf82 in ?? ()
No symbol table info available.
#9 0x000000000075d735 in ?? ()
No symbol table info available.
#10 0x000000000075b20b in ?? ()
No symbol table info available.
#11 0x0000000000753a95 in ?? ()
No symbol table info available.
#12 0x000000000075a635 in ?? ()
No symbol table info available.
#13 0x000000000046ff2f in ?? ()
No symbol table info available.
#14 0x00007f1295467b3a in ?? () from /usr/lib/libtbb.so.2
No symbol table info available.
#15 0x00007f1295463816 in ?? () from /usr/lib/libtbb.so.2
No symbol table info available.
#16 0x00007f1295462f4b in ?? () from /usr/lib/libtbb.so.2
No symbol table info available.
#17 0x00007f129545f0ff in ?? () from /usr/lib/libtbb.so.2
No symbol table info available.
#18 0x00007f129545f2f9 in ?? () from /usr/lib/libtbb.so.2
No symbol table info available.
#19 0x00007f1295683184 in start_thread (arg=0x7f1264bf2700) at pthread_create.c:312
        __res = <optimized out>
        pd = 0x7f1264bf2700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139716976387840, 3660175377794048592, 0, 0, 139716976388544, 139716976387840, -3679643878949150128, -3679822643066723760},
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#20 0x00007f1293fa337d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.
(gdb)
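
Frames #4 through #18 in the backtrace above are printed as `??` because no symbols were loaded for the collector binary or for libtbb. As a rough sketch (not part of the original report), the following Python snippet pulls those unresolved frames out of pasted gdb output so the addresses can be looked up later. The function name `unresolved_frames` and the `<main binary>` label are hypothetical, and the sample lines are copied from the trace above.

```python
import re

# A gdb frame line looks like "#4 0x0000000000617698 in ?? ()"
# optionally followed by " from /path/to/library.so".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+) in (\S+)")
FROM_RE = re.compile(r" from (\S+)$")


def unresolved_frames(backtrace):
    """Return (frame number, address, module) for each frame gdb printed as '??'."""
    frames = []
    for raw in backtrace.splitlines():
        line = raw.strip()
        match = FRAME_RE.match(line)
        if not match or match.group(3) != "??":
            continue  # not a frame line, or the symbol was resolved
        lib = FROM_RE.search(line)
        # Frames without a " from ..." suffix belong to the main executable.
        module = lib.group(1) if lib else "<main binary>"
        frames.append((int(match.group(1)), match.group(2), module))
    return frames


sample = """\
#1 0x00007f1293ee3028 in __GI_abort () at abort.c:89
#4 0x0000000000617698 in ?? ()
#5 0x00000000005aae61 in ?? ()
#14 0x00007f1295467b3a in ?? () from /usr/lib/libtbb.so.2
"""

for num, addr, module in unresolved_frames(sample):
    print(num, addr, module)
```

Assuming an unstripped collector binary from the same build is available, the main-binary addresses could then be passed to `addr2line -e <binary> <address>` to recover function names and source lines.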

3) Topology

server-manager-client display server --select id,cluster_id,roles,ip_address
+---------+----------------+----------------+----------------------------------------------------------------------------+
| id | cluster_id | ip_address | roles |
+---------+----------------+----------------+----------------------------------------------------------------------------+
| nodec28 | cluster5sanity | 10.204.217.13 | [u'compute'] |
| nodeg37 | cluster5sanity | 10.204.217.77 | [u'compute'] |
| nodec10 | cluster5sanity | 10.204.217.176 | [u'compute'] |
| nodei17 | cluster5sanity | 10.204.217.129 | [u'config', u'control', u'collector', u'database', u'webui', u'openstack'] |
| nodei19 | cluster5sanity | 10.204.217.131 | [u'config', u'control', u'collector', u'database', u'webui', u'openstack'] |
| nodei20 | cluster5sanity | 10.204.217.132 | [u'config', u'control', u'collector', u'database', u'webui', u'openstack'] |
+---------+----------------+----------------+----------------------------------------------------------------------------+
root@nodej3:~#
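
To see at a glance which nodes carry the collector role, the table above can be parsed mechanically. This is a minimal sketch that assumes the output keeps the four-column layout shown here; `parse_servers` is a hypothetical helper, not part of server-manager-client, and the sample rows are copied from the table.

```python
import re


def parse_servers(table_text):
    """Parse the ASCII table into a list of dicts, one per server row."""
    servers = []
    for raw in table_text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip "+----+" borders and shell prompts
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4 or cells[0] == "id":
            continue  # skip the header row and anything malformed
        server_id, cluster_id, ip_address, roles = cells
        servers.append({
            "id": server_id,
            "cluster_id": cluster_id,
            "ip_address": ip_address,
            # Roles are printed as a Python 2 list repr, e.g. [u'compute'].
            "roles": re.findall(r"u'([^']+)'", roles),
        })
    return servers


sample = """\
+---------+----------------+----------------+--------------------+
| id | cluster_id | ip_address | roles |
+---------+----------------+----------------+--------------------+
| nodec28 | cluster5sanity | 10.204.217.13 | [u'compute'] |
| nodei17 | cluster5sanity | 10.204.217.129 | [u'config', u'control', u'collector', u'database', u'webui', u'openstack'] |
+---------+----------------+----------------+--------------------+
"""

collectors = [s["id"] for s in parse_servers(sample) if "collector" in s["roles"]]
print(collectors)
```

Run against the full table, this would list nodei17, nodei19 and nodei20 as the collector nodes to check for core files.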

Notes
-----

1) The collector core occurred on nodei17.

2) contrail-analytics-nodemgr is stuck in the initializing state (Collector connection down).

3) Restarting the contrail-analytics-nodemgr process brings it back to active.

Jeba Paulaiyan (jebap)
tags: added: analytics
Raj Reddy (rajreddy) wrote :

Sundar,
Is the setup still available in the same state?
We need /var/log/redis-server.log to look into this. It would be great if you could leave the setup in this state for debugging too.

thanks,

Sundaresan Rajangam (srajanga) wrote :

Fix available in build >= #3048
