collector core on upgrade from r3.1.2.0-62 to r3.2.3.0-38 mitaka

Bug #1687475 reported by wenqing liang
This bug affects 1 person
Affects             Status          Importance  Assigned to  Milestone
Juniper Openstack   Status tracked in Trunk
  R3.2              Fix Committed   High        Arvind
  R3.2.3.x          Fix Committed   High        Arvind
  Trunk             Fix Committed   High        Arvind

Bug Description

The collector dumped core during an upgrade of a CentOS 7.2 contrail-ha cluster from r3.1.2.0-62 to r3.2.3.0-38 (mitaka).

Please see the core and upgrade logs in /cs-shared/bugs/.

[root@b4s44 1687475]# pwd
/cs-shared/bugs/1687475
[root@b4s44 1687475]# ls
core.contrail-collec.13129.b7s32.englab.juniper.net.1493668694 upgrade_contrail_2017_05_01_12_42_37_421188.log
[root@b4s44 1687475]#

wenqing liang (wliang)
information type: Proprietary → Public
Jeba Paulaiyan (jebap)
tags: added: analytics
wenqing liang (wliang)
description: updated
Arvind (arvindv) wrote :

Please update the permissions on the core file.

Jeba Paulaiyan (jebap) wrote :

Permissions on the core file have been updated.

Raj Reddy (rajreddy) wrote :

Please run gdb on the core file on the same system where the core was seen.

Copy the binary from the /github sandbox for the image used and run:

gdb <copied-binary> <core-file>

wenqing liang (wliang) wrote :

Seen again on a splitdb contrail-ha CentOS 7.2 cluster after a contrail_upgrade from r3.1.2.0-62 to r3.2.3.0-38 (mitaka):

[root@b7s37 store]# contrail-status
== Contrail Analytics ==
Warning: supervisor-analytics.service changed on disk. Run 'systemctl daemon-reload' to reload units.
supervisor-analytics: active
contrail-alarm-gen:0 active
contrail-analytics-api initializing (UvePartitions:UVE-Aggregation[Partitions:0] connection down)
contrail-analytics-nodemgr active
contrail-collector initializing (KafkaPub:172.16.80.7:9092,172.16.80.7:9092,172.16.80.7:9092 connection down)
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active

== Contrail Database ==
contrail-database: active

== Contrail Supervisor Database ==
supervisor-database: active
contrail-database-nodemgr active
kafka active

========Run time service failures=============
/var/crashes/core.contrail-collec.6421.b7s37.englab.juniper.net.1493696169
/var/crashes/core.contrail-collec.32462.b7s37.englab.juniper.net.1493700876
[root@b7s37 store]#

The cores and upgrade log have been uploaded to /cs-shared/bugs/1687475:

[root@b4s44 1687475]# ls -ltr
total 595852
-rwxrwxrwx 1 root root 190263296 May 1 17:55 core.contrail-collec.13129.b7s32.englab.juniper.net.1493668694
-rwxrwxrwx 1 root root 1550837 May 1 17:56 upgrade_contrail_2017_05_01_12_42_37_421188.log
-rw------- 1 root root 230498304 May 2 02:25 core.contrail-collec.6421.b7s37.englab.juniper.net.1493696169
-rw------- 1 root root 184115200 May 2 02:26 core.contrail-collec.32462.b7s37.englab.juniper.net.1493700876
-rw-r--r-- 1 root root 1313847 May 2 02:33 upgrade_contrail_2017_05_01_21_42_51_003929.log.splitdb
[root@b4s44 1687475]#

[root@b7s37 store]# gdb ./vizd /var/crashes/core.contrail-collec.6421.b7s37.englab.juniper.net.1493696169

Core was generated by `/usr/bin/contrail-collector --conf_file /etc/contrail/contrail-keystone-auth.co'.
Program terminated with signal 6, Aborted.
#0 0x00002b23d87e75f7 in ?? ()
(gdb) bt
#0 0x00002b23d87e75f7 in ?? ()
#1 0x00002b23d87e8ce8 in ?? ()
#2 0x0000000000000020 in ?? ()
#3 0x0000000000000000 in ?? ()
(gdb)

[root@b7s37 store]# gdb ./vizd /var/crashes/core.contrail-collec.32462.b7s37.englab.juniper.net.1493700876

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-collector --conf_file /etc/contrail/contrail-keystone-auth.co'.
Program terminated with signal 6, Aborted.
#0 0x00002b25831e45f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install boost-filesystem-1.53.0-25.el7.x86_64 boost-program-options-1.53.0-25.el7.x86_64 boost-python-1.53.0-25.el7.x86_64 boost-regex-1.53.0-25.el7.x86_64 boost-system-1.53.0-25.el7.x86_64 cassandra-cpp-driver-2.2.0-1.el7.centos.x86_64 cyrus-sasl-lib-2.1.26-19.2.el7.x86_64 glibc-2.17-106.el7_2.6.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.13.2-10.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libcurl-7.29.0-25.el7.centos.x86_64 libdb-5.3.21-19.el7.x86_64 libgcc-4.8.5-4.el7.x86_64 libicu-50.1.2-15.el7.x86_64 libidn-1.28-4.e...

wenqing liang (wliang) wrote :

[root@b7s37 store]# gdb ./vizd /var/crashes/core.contrail-collec.32462.b7s37.englab.juniper.net.1493700876

(gdb) bt
#0 0x00002b25831e45f7 in raise () from /lib64/libc.so.6
#1 0x00002b25831e5ce8 in abort () from /lib64/libc.so.6
#2 0x00002b25831dd566 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00002b25831dd612 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000006aa8d2 in boost::spirit::char_encoding::ascii::isspace (ch=-30)
    at /usr/include/boost/spirit/home/support/char_encoding/ascii.hpp:256
#5 0x00000000006bf4ae in boost::spirit::char_class::classify<boost::spirit::char_encoding::ascii>::is<unsigned char> (
    ch=226 '\342') at /usr/include/boost/spirit/home/support/char_class.hpp:310
#6 0x00000000006be0dc in boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::space, boost::spirit::char_encoding::ascii> >::test<unsigned char, boost::spirit::unused_type const> (this=0x2b25d87fdeaf, ch=226 '\342')
    at /usr/include/boost/spirit/home/qi/char/char_class.hpp:69
#7 0x00000000006bcbca in boost::spirit::qi::char_parser<boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::space, boost::spirit::char_encoding::ascii> >, char, char>::parse<unsigned char const*, boost::spirit::unused_type const, boost::spirit::unused_type, boost::spirit::unused_type const> (this=0x2b25d87fdeaf,
    first=@0x2b25d87fd4c0: 0x1743e05 "\342\224\234\342\224\200\062\071\063\060\064 /bin/bash /etc/rc.d/init.d/cassandra status\n<30>May 1 21:50:41 b7s37 cassandra: \342\224\224\342\224\200\062\071\063\060\070 systemctl status cassandra.service\n<30>May 1 21:50:41 b7s37 cassandra: May 01 18:23:38 b7s37.e"..., last=@0x2b25d87fde90: 0x17443c4 "", context=...,
    skipper=..., attr=...) at /usr/include/boost/spirit/home/qi/char/char_parser.hpp:68
#8 0x00000000006bad28 in boost::spirit::qi::skip_over<unsigned char const*, boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::space, boost::spirit::char_encoding::ascii> > > (
    first=@0x2b25d87fd4c0: 0x1743e05 "\342\224\234\342\224\200\062\071\063\060\064 /bin/bash /etc/rc.d/init.d/cassandra status\n<30>May 1 21:50:41 b7s37 cassandra: \342\224\224\342\224\200\062\071\063\060\070 systemctl status cassandra.service\n<30>May 1 21:50:41 b7s37 cassandra: May 01 18:23:38 b7s37.e"..., last=@0x2b25d87fde90: 0x17443c4 "", skipper=...)
    at /usr/include/boost/spirit/home/qi/skip_over.hpp:27
#9 0x00000000006c393f in boost::spirit::qi::lexeme_directive<boost::spirit::qi::kleene<boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::char_, boost::spirit::char_encoding::standard> > > >::parse<unsigned char const*, boost::spirit::context<boost::fusion::cons<std::string&, boost::fusion::nil>, boost::fusion::vector0<void> >, boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::space, boost::spirit::char_encoding::ascii> >, std::string> (this=0x2b25d87fe318,
    first=@0x2b25d87fd4c0: 0x1743e05 "\342\224\234\342\224\200\062\071\063\060\064 /bin/bash /etc/rc.d/init.d/cassandra status\n<30>May 1 21:50:41 b7s37 cassandra: \342\224\224\342\224\200\062\071\063\060\070 systemctl status cassa...
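
A note on reading the backtrace above: frame #4 is an assertion inside boost::spirit::char_encoding::ascii::isspace with ch=-30, which is the same byte shown as 226 ('\342') in frame #5, interpreted as a signed value. That byte is outside the 7-bit ASCII range. It is the first byte of the sequences \342\224\234 and \342\224\200 visible in the buffer, which decode as UTF-8 box-drawing characters and appear to come from `systemctl status` output relayed through syslog. The sketch below shows one way such input could be detected before it reaches an ASCII-only parser. It is only an illustration: IsSevenBitAscii and FirstNonAscii are hypothetical helper names, not contrail-collector functions, and this is not the fix that was committed for this bug.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of contrail-collector).
// Returns the offset of the first byte with the high bit set,
// or std::string::npos when the whole buffer is 7-bit ASCII.
std::size_t FirstNonAscii(const std::string &buf) {
    for (std::size_t i = 0; i < buf.size(); ++i) {
        // Cast to unsigned char so that a byte such as '\342' is
        // examined as 226 rather than as a negative value.
        if (static_cast<unsigned char>(buf[i]) & 0x80) {
            return i;
        }
    }
    return std::string::npos;
}

// Hypothetical helper: true when every byte is 7-bit ASCII.
bool IsSevenBitAscii(const std::string &buf) {
    return FirstNonAscii(buf) == std::string::npos;
}
```

A caller could use a check like this to reject or sanitize a message before parsing, or could avoid the ASCII-only character classes altogether. Which option suits the collector is a design decision this sketch does not settle.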


OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/31025
Submitter: Arvind (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/31057
Submitter: Arvind (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/31057
Committed: http://github.com/Juniper/contrail-controller/commit/685d13dfbc0729c6953fc3c7c1003f322a771bcb
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 685d13dfbc0729c6953fc3c7c1003f322a771bcb
Author: arvindvis <email address hidden>
Date: Fri May 5 10:09:22 2017 -0700

The syslog collector has issues with handling syslog messages
arriving in TCP so taking off the support. The reason being, unlike
UDP, TCP sockets read multiple syslog messages in one read until
the buffer gets full. This can toss our parsing code.
Closes-Bug:#1687475

Change-Id: I87c72e2154e5eecd19657a1de92c6c28b77fcd35
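
To illustrate the problem the commit message describes: a single TCP read can return several syslog messages back to back, and the last one may be cut off mid-message, whereas a UDP read returns one datagram. The committed change removes TCP support rather than adding framing, so the sketch below is not the fix. It only shows, under the assumption that messages are newline-terminated, how a stream could be split into individual messages before parsing. SyslogStreamFramer is a hypothetical class name, not part of contrail-collector.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of contrail-collector).
// Assumes each syslog message on the stream ends with '\n'.
class SyslogStreamFramer {
public:
    // Appends the bytes from one TCP read and returns every complete
    // message now available, without the trailing newline. A partial
    // message at the end is kept until a later read completes it.
    std::vector<std::string> Feed(const std::string &chunk) {
        pending_ += chunk;
        std::vector<std::string> messages;
        std::string::size_type start = 0;
        std::string::size_type nl;
        while ((nl = pending_.find('\n', start)) != std::string::npos) {
            if (nl > start) {  // skip empty lines
                messages.push_back(pending_.substr(start, nl - start));
            }
            start = nl + 1;
        }
        pending_.erase(0, start);  // keep only the unfinished tail
        return messages;
    }

    // Bytes received so far that do not yet form a complete message.
    const std::string &pending() const { return pending_; }

private:
    std::string pending_;
};
```

Each string returned by Feed could then be handed to a per-message parser, so the parser never sees two messages in one buffer.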

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/31170
Submitter: Arvind (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/31170
Committed: http://github.com/Juniper/contrail-controller/commit/f4ab3c1493f2ccb6c42be45c97583849815924f5
Submitter: Zuul (<email address hidden>)
Branch: master

commit f4ab3c1493f2ccb6c42be45c97583849815924f5
Author: arvindvis <email address hidden>
Date: Fri May 5 10:09:22 2017 -0700

The syslog collector has issues with handling syslog messages
arriving in TCP so taking off the support. The reason being, unlike
UDP, TCP sockets read multiple syslog messages in one read until
the buffer gets full. This can toss our parsing code.
Closes-Bug:#1687475

Change-Id: I87c72e2154e5eecd19657a1de92c6c28b77fcd35
(cherry picked from commit 685d13dfbc0729c6953fc3c7c1003f322a771bcb)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2.3.x

Review in progress for https://review.opencontrail.org/32572
Submitter: Vinay Vithal Mahuli (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/32572
Committed: http://github.com/Juniper/contrail-controller/commit/4b460f31e1985b2618c4d998325891ca55828820
Submitter: Zuul (<email address hidden>)
Branch: R3.2.3.x

commit 4b460f31e1985b2618c4d998325891ca55828820
Author: arvindvis <email address hidden>
Date: Fri May 5 10:09:22 2017 -0700

The syslog collector has issues with handling syslog messages
arriving in TCP so taking off the support. The reason being, unlike
UDP, TCP sockets read multiple syslog messages in one read until
the buffer gets full. This can toss our parsing code.
Closes-Bug:#1687475

Change-Id: I87c72e2154e5eecd19657a1de92c6c28b77fcd35

Megh Bhatt (meghb)
tags: added: releasenote