Build 2689: Collector crashed at DbHandler::CreateTables()

Bug #1529563 reported by Ankit Jain
This bug affects 1 person
Affects                                      Status         Importance  Assigned to  Milestone
Juniper Openstack (status tracked in Trunk)
  Trunk                                      Fix Committed  High        Megh Bhatt

Bug Description

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/contrail-collector...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 5900]
[New LWP 5676]
[New LWP 5674]
[New LWP 9573]
[New LWP 5901]
[New LWP 5653]
[New LWP 5898]
[New LWP 5677]
[New LWP 5688]
[New LWP 9574]
[New LWP 5675]
[New LWP 5687]
[New LWP 5689]
[New LWP 5899]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-collector'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f3567b80cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named 'libstdcxx'
(gdb) bt
#0 0x00007f3567b80cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f3567b840d8 in __GI_abort () at abort.c:89
#2 0x00007f3567b79b86 in __assert_fail_base (fmt=0x7f3567cca830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x70b894 "(0)",
    file=file@entry=0x70bc08 "controller/src/analytics/db_handler.cc", line=line@entry=268, function=function@entry=0x711500 "bool DbHandler::CreateTables()") at assert.c:92
#3 0x00007f3567b79c32 in __GI___assert_fail (assertion=0x70b894 "(0)", file=0x70bc08 "controller/src/analytics/db_handler.cc", line=268, function=0x711500 "bool DbHandler::CreateTables()")
    at assert.c:101
#4 0x000000000052b843 in ?? ()
#5 0x000000000052b946 in ?? ()
#6 0x0000000000534958 in ?? ()
#7 0x0000000000534e49 in ?? ()
#8 0x00000000006b2c39 in ?? ()
#9 0x00000000006ac100 in ?? ()
#10 0x00007f3569108b3a in ?? () from /usr/lib/libtbb.so.2
#11 0x00007f3569104816 in ?? () from /usr/lib/libtbb.so.2
#12 0x00007f3569103f4b in ?? () from /usr/lib/libtbb.so.2
#13 0x00007f35691000ff in ?? () from /usr/lib/libtbb.so.2
#14 0x00007f35691002f9 in ?? () from /usr/lib/libtbb.so.2
#15 0x00007f3569324182 in start_thread (arg=0x7f355d72f700) at pthread_create.c:312
#16 0x00007f3567c4447d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Core copied @ bhushana@mayamruga:~/Documents/technical/bugs/(bug-id)
username : bhushana
pass: bhu@123

Tags: analytics
Ankit Jain (ankitja)
tags: added: analytics
Ankit Jain (ankitja)
information type: Proprietary → Public
no longer affects: juniperopenstack/r3.0
Megh Bhatt (meghb) wrote :

Hi Ankit,
Is this happening often? If so, can you please provide /var/log/contrail/contrail-collector.log from the time the core is hit? It seems that the database column add is failing, but we need to figure out the error code from the logs. I will also put in a fix so that the stats count the failures and/or retries, but we still need to know the error.

Thanks

Megh

Ankit Jain (ankitja) wrote :

Hi Megh,

I copied the logs to the bhushana@mayamruga:~/Documents/technical/bugs//1529563 folder.
You can also check them on nodeg32.

Megh Bhatt (meghb) wrote :

Logs seem to have rolled over:

root@nodeg32:~# ls -lart /var/log/contrail/contrail-collector.log*
-rw-r--r-- 1 contrail contrail 54344 Jan 25 06:25 /var/log/contrail/contrail-collector.log.4.gz
-rw-r--r-- 1 contrail contrail 70827 Jan 26 06:24 /var/log/contrail/contrail-collector.log.3.gz
-rw-r--r-- 1 contrail contrail 72925 Jan 27 06:24 /var/log/contrail/contrail-collector.log.2.gz
-rw-r--r-- 1 contrail contrail 1052275 Jan 28 00:50 /var/log/contrail/contrail-collector.log.10
-rw-r--r-- 1 contrail contrail 1049030 Jan 28 00:50 /var/log/contrail/contrail-collector.log.9
-rw-r--r-- 1 contrail contrail 1048780 Jan 28 00:50 /var/log/contrail/contrail-collector.log.8
-rw-r--r-- 1 contrail contrail 1048815 Jan 28 00:50 /var/log/contrail/contrail-collector.log.7
-rw-r--r-- 1 contrail contrail 1049167 Jan 28 00:50 /var/log/contrail/contrail-collector.log.6
-rw-r--r-- 1 contrail contrail 1104998 Jan 28 00:50 /var/log/contrail/contrail-collector.log.5
-rw-r--r-- 1 contrail contrail 1049177 Jan 28 00:50 /var/log/contrail/contrail-collector.log.4
-rw-r--r-- 1 contrail contrail 1048757 Jan 28 00:50 /var/log/contrail/contrail-collector.log.3
-rw-r--r-- 1 contrail contrail 1051291 Jan 28 00:51 /var/log/contrail/contrail-collector.log.2
-rw-r--r-- 1 contrail contrail 1049072 Jan 28 00:51 /var/log/contrail/contrail-collector.log.1
-rw-r--r-- 1 contrail contrail 700709 Jan 28 00:51 /var/log/contrail/contrail-collector.log
root@nodeg32:~# ls -lart /var/crashes/*coll*
-rw------- 1 contrail contrail 94343168 Jan 23 01:35 /var/crashes/core.contrail-collec.2202.nodeg32.1453493143
-rw------- 1 contrail contrail 109600768 Jan 23 01:35 /var/crashes/core.contrail-collec.4099.nodeg32.1453493154
-rw------- 1 contrail contrail 80060416 Jan 23 01:35 /var/crashes/core.contrail-collec.4311.nodeg32.1453493156
-rw------- 1 contrail contrail 80060416 Jan 23 01:35 /var/crashes/core.contrail-collec.4323.nodeg32.1453493157
-rw------- 1 contrail contrail 80060416 Jan 23 01:35 /var/crashes/core.contrail-collec.4414.nodeg32.1453493159
-rw------- 1 contrail contrail 80060416 Jan 23 01:36 /var/crashes/core.contrail-collec.4532.nodeg32.1453493162
root@nodeg32:~# gdb /usr/bin/contrail-collector /var/crashes/core.contrail-collec.2202.nodeg32.1453493143

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/17040
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17040
Committed: http://github.org/Juniper/contrail-controller/commit/68a84df4ea8684cca591f2623062b4ba5be7415f
Submitter: Zuul
Branch: master

commit 68a84df4ea8684cca591f2623062b4ba5be7415f
Author: Megh Bhatt <email address hidden>
Date: Tue Feb 9 11:10:41 2016 -0800

Retry Db_AddColumnSync failure in CreateTables

It is observed that the Cassandra connection/session sometimes returns a
"no host available" error on startup, after a few tables have been added,
when trying to update the SystemObjectTable in CreateTables. We need to
retry in this scenario instead of asserting.

Change-Id: I09f16367c40101faced382042d8a301c6764a921
Closes-Bug: #1529563
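
For illustration only, the retry-instead-of-assert idea described in the commit message can be sketched as below. This is not the code from the actual change: RetryStats, RetryWithLimit and FlakyWrite are hypothetical names invented for this sketch, and the attempt limit and delay are arbitrary assumptions. The counters reflect the earlier comment about counting failures and/or retries.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical counters for illustration; not taken from contrail-controller.
struct RetryStats {
    std::size_t attempts = 0;  // total calls made to the operation
    std::size_t failures = 0;  // calls that returned false
};

// Calls `op` until it returns true or `max_attempts` calls have been made,
// sleeping `delay` between failed attempts. Returns true on success and
// false if every attempt failed, so the caller can decide what to do
// instead of aborting the process on the first failure.
inline bool RetryWithLimit(const std::function<bool()> &op,
                           std::size_t max_attempts,
                           std::chrono::milliseconds delay,
                           RetryStats &stats) {
    for (std::size_t i = 0; i < max_attempts; ++i) {
        ++stats.attempts;
        if (op()) {
            return true;
        }
        ++stats.failures;
        if (i + 1 < max_attempts) {
            std::this_thread::sleep_for(delay);
        }
    }
    return false;
}

// Hypothetical stand-in for a database write that fails a fixed number
// of times (e.g. while no host is available) and then succeeds.
struct FlakyWrite {
    int failures_left;
    bool operator()() {
        if (failures_left > 0) {
            --failures_left;
            return false;
        }
        return true;
    }
};
```

With these assumptions, calling RetryWithLimit(FlakyWrite{2}, 5, std::chrono::milliseconds(1), stats) succeeds on the third attempt and leaves stats.failures at 2, whereas an operation that keeps failing returns false after max_attempts calls rather than terminating the process.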
