Build 2686 : Collector crashed at impl::GetCassTableClusteringKeyCount

Bug #1528146 reported by Ankit Jain
This bug affects 2 people
Affects            Status                   Importance  Assigned to  Milestone
Juniper Openstack  Status tracked in Trunk
  R3.0             Fix Committed            High        Megh Bhatt
  Trunk            Fix Committed            High        Megh Bhatt

Bug Description

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-collector'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f7d37044cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f7d37044cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f7d370480d8 in __GI_abort () at abort.c:89
#2 0x00007f7d3703db86 in __assert_fail_base (fmt=0x7f7d3718e830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x719438 "impl::GetCassTableClusteringKeyCount(session_.get(), keyspace_, table, &ck_count)",
    file=file@entry=0x719380 "controller/src/database/cassandra/cql/cql_if.cc", line=line@entry=879,
    function=function@entry=0x71a5c0 "bool cass::cql::CqlIf::CqlIfImpl::IsTableStatic(const string&)") at assert.c:92
#3 0x00007f7d3703dc32 in __GI___assert_fail (assertion=0x719438 "impl::GetCassTableClusteringKeyCount(session_.get(), keyspace_, table, &ck_count)",
    file=0x719380 "controller/src/database/cassandra/cql/cql_if.cc", line=879, function=0x71a5c0 "bool cass::cql::CqlIf::CqlIfImpl::IsTableStatic(const string&)") at assert.c:101
#4 0x0000000000416600 in ?? ()
#5 0x0000000000560e08 in ?? ()
#6 0x000000000055ea09 in ?? ()
#7 0x000000000052e528 in ?? ()
#8 0x0000000000530018 in ?? ()
#9 0x000000000053125f in ?? ()
#10 0x00000000005337e9 in ?? ()
#11 0x00000000004c39ad in ?? ()
#12 0x00000000004e6946 in ?? ()
#13 0x00000000004cf2af in ?? ()
#14 0x0000000000648020 in ?? ()
#15 0x00000000006451ec in ?? ()
#16 0x0000000000644bdb in ?? ()
#17 0x000000000063cea5 in ?? ()
#18 0x0000000000643d87 in ?? ()
#19 0x00000000006abdc0 in ?? ()
#20 0x00007f7d385cfb3a in ?? () from /usr/lib/libtbb.so.2
#21 0x00007f7d385cb816 in ?? () from /usr/lib/libtbb.so.2
#22 0x00007f7d385caf4b in ?? () from /usr/lib/libtbb.so.2
#23 0x00007f7d385c70ff in ?? () from /usr/lib/libtbb.so.2
#24 0x00007f7d385c72f9 in ?? () from /usr/lib/libtbb.so.2
#25 0x00007f7d387eb182 in start_thread (arg=0x7f7d2eff8700) at pthread_create.c:312
#26 0x00007f7d3710847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Core copied to /cs-shared/bugs/<bug-id> on any blr shell server (e.g. nodeb6)

Tags: analytics
Ankit Jain (ankitja)
tags: added: analytics
Ankit Jain (ankitja)
description: updated
Megh Bhatt (meghb) wrote :

kilo 14.04

Megh Bhatt (meghb) wrote :

Two cores here:
1. vizd

The issue is that Db_SetInitDone() is not implemented in CqlIf, so a generator can connect and issue Db_AddColumn() before DB initialization is done; the call then fails to find the table meta and the collector dumps core.
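
A minimal sketch of that failure mode, not the actual CqlIf code. FakeSchema, GetClusteringKeyCount and IsTableStaticChecked are hypothetical stand-ins, and the sketch assumes a table counts as static when it has zero clustering keys. The lookup returns false when the table meta is missing; wrapping that result in assert(), as the assertion text in the backtrace shows, is what aborts the process. The variant here returns the failure to the caller instead:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for cached schema metadata:
// table name -> number of clustering key columns.
typedef std::map<std::string, std::size_t> FakeSchema;

// Same shape as the call named in the assertion text: returns false when
// the table metadata cannot be found.
bool GetClusteringKeyCount(const FakeSchema &schema, const std::string &table,
                           std::size_t *ck_count) {
    assert(ck_count != NULL);  // precondition on the output pointer only
    FakeSchema::const_iterator it = schema.find(table);
    if (it == schema.end()) {
        return false;  // no table meta, e.g. DB init still in progress
    }
    *ck_count = it->second;
    return true;
}

// Reports a missing table to the caller instead of asserting on it.
// Assumption: "static" means the table has no clustering keys.
bool IsTableStaticChecked(const FakeSchema &schema, const std::string &table,
                          bool *is_static) {
    std::size_t ck_count = 0;
    if (!GetClusteringKeyCount(schema, table, &ck_count)) {
        return false;
    }
    *is_static = (ck_count == 0);
    return true;
}
```

With this shape, a write that arrives before the table exists can be dropped or retried by the caller rather than taking down the collector.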

root@a7s9:~# gdb /var/tmp/vizd /var/tmp/core.contrail-collec.2089.nodeg20.1450683984
GNU gdb (Ubuntu 7.7-0ubuntu3) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /var/tmp/vizd...done.

warning: core file may not match specified executable file.
[New LWP 2890]
[New LWP 3940]
[New LWP 2228]
[New LWP 3941]
[New LWP 2891]
[New LWP 2089]
[New LWP 2884]
[New LWP 2885]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-collector'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f7d37044cc9 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007f7d37044cc9 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f7d370480d8 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f7d3703db86 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f7d3703dc32 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000416600 in cass::cql::CqlIf::CqlIfImpl::IsTableStatic (table=..., this=<optimized out>) at controller/src/database/cassandra/cql/cql_if.cc:878
#5 0x0000000000560e08 in IsTableStatic (table=..., this=<optimized out>) at /usr/include/c++/4.8/bits/basic_string.h:539
#6 InsertIntoTableSync (consistency=CASS_CONSISTENCY_ONE, v_columns=..., this=<optimized out>) at controller/src/database/cassandra/cql/cql_if.cc:893
#7 cass::cql::CqlIf::Db_AddColumnSync (this=<optimized out>, cl=...) at controller/src/database/cassandra/cql/cql_if.cc:1152
#8 0x000000000055ea09 in cass::cql::CqlIf::Db_AddColumn (this=<optimized out>, cl=...) at controller/src/database/cassandra/cql/cql_if.cc:1148
#9 0x000000000052e528 in DbHandler::StatTableWrite (this=this@entry=0x104fa00, t2=t2@entry=172935007, statName=..., statAttr=..., ptag=..., stag=..., t1=t1@entry=0, unm=..., jsonline=..., ttl=ttl@entry=172800) at controller/src/analytics/db_handler.cc:863
#10 0x0000000000530018 in DbHandler::StatTableInsertTtl (this=this@entry=0x104fa00, ts=ts@entry=1450683984478939, statName=..., statAttr=..., attribs_tag=..., attribs=..., ttl=ttl@entry=172800) at controller/src/analytics/db_handler.cc:1019
#11 0x000000000053125f in DbHandler::FieldNamesTableInsert (this=this@entry=0x104fa00, timestamp=1450683984478939, table_prefix=..., field_name=..., field_val=..., ttl=ttl@entry=172800) at c...

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/15968
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/15968
Committed: http://github.org/Juniper/contrail-controller/commit/78ca3fbbfc523b51ef6a375e02c583c856b29547
Submitter: Zuul
Branch: master

commit 78ca3fbbfc523b51ef6a375e02c583c856b29547
Author: Megh Bhatt <email address hidden>
Date: Tue Dec 22 01:50:20 2015 -0800

Fix contrail-collector and contrail-query-engine crash

1. Only allow Db_AddColumn() to proceed if DB is done initializing
2. Implement Db_UseColumnfamily() to check if the table meta
exists so that we do not hit assert later on when trying to
extract partition/clustering keys

Change-Id: I8d00c72ba45fa7db032956f47c74dad68a1bea57
Closes-Bug: #1528146
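
The two changes listed in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: SketchDbIf, AddTableMeta and the member names are hypothetical, and the real methods take different arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical sketch class; not the real cass::cql::CqlIf.
class SketchDbIf {
public:
    SketchDbIf() : init_done_(false), writes_(0) {}

    // Called by the owner once schema setup has finished.
    void Db_SetInitDone(bool done) { init_done_ = done; }

    // Stand-in for schema creation: records that table metadata exists.
    void AddTableMeta(const std::string &table) { tables_.insert(table); }

    // Change 2: report whether the table meta exists, so callers find out
    // here rather than through an assert during key extraction.
    bool Db_UseColumnfamily(const std::string &table) const {
        return tables_.count(table) != 0;
    }

    // Change 1: only proceed if the DB is done initializing.
    bool Db_AddColumn(const std::string &table) {
        if (!init_done_) {
            return false;  // generator connected too early; reject the write
        }
        if (!Db_UseColumnfamily(table)) {
            return false;  // unknown table; reject instead of aborting
        }
        // Past the guards, the metadata is guaranteed to be present.
        assert(tables_.count(table) != 0);
        ++writes_;
        return true;
    }

    std::size_t writes() const { return writes_; }

private:
    bool init_done_;
    std::set<std::string> tables_;
    std::size_t writes_;
};
```

The guard order is a design choice in this sketch: checking the init flag first keeps early writes from touching metadata that may still be changing.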

Ankit Jain (ankitja)
information type: Proprietary → Public
Ankit Jain (ankitja) wrote :

Hi Megh,

This was seen again in our sanity run on build 2717. Could you please check it?

#0 0x00007f657aeb2cc9 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f657aeb60d8 in __GI_abort () at abort.c:89
#2 0x00007f657aeabb86 in __assert_fail_base (
    fmt=0x7f657affc830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x753678 "impl::GetCassTableClusteringKeyCount(session_.get(), keyspace_, table, &ck_count)",
    file=file@entry=0x7535c0 "controller/src/database/cassandra/cql/cql_if.cc", line=line@entry=933,
    function=function@entry=0x7549c0 "bool cass::cql::CqlIf::CqlIfImpl::IsTableStatic(const string&)") at assert.c:92
#3 0x00007f657aeabc32 in __GI___assert_fail (
    assertion=0x753678 "impl::GetCassTableClusteringKeyCount(session_.get(), keyspace_, table, &ck_count)",
    file=0x7535c0 "controller/src/database/cassandra/cql/cql_if.cc", line=933,
    function=0x7549c0 "bool cass::cql::CqlIf::CqlIfImpl::IsTableStatic(const string&)") at assert.c:101
#4 0x0000000000416e7e in ?? ()

Build : 3.0.0.0-2717
CoreLocation : /cs-shared/test_runs/nodea35/2016_02_27_22_04_39
cores : {'10.204.216.31': ['core.contrail-collec.11653.nodea35.1456592952', 'core.contrail-collec.11672.nodea35.1456592953', 'core.contrail-collec.8672.nodea35.1456592952']}
LogsLocation : http://10.204.216.50/Docs/logs/3.0.0.0-2717_2016_02_27_22_04_39/logs/
Report : http://10.204.216.50/Docs/logs/3.0.0.0-2717_2016_02_27_22_04_39/junit-noframes.html
Topology :
Config Nodes : [u'nodea35', u'nodea34']
Control Nodes : [u'nodea35', u'nodea34', u'nodec53']
Compute Nodes : [u'nodec54', u'nodec55', u'nodec56']
Openstack Node : nodea34
WebUI Node : nodec53
Analytics Nodes : [u'nodea35', u'nodec53']
Physical Devices : [u"'blr-mx1'"]

description: updated