BGPaaS: control-node crash at BgpServer::UnregisterPeer(BgpPeer*)

Bug #1534247 reported by amit surana
This bug affects 1 person
Affects              Status                    Importance  Assigned to    Milestone
Juniper Openstack    Status tracked in Trunk
  R3.0               Fix Committed             Medium      Nischal Sheth
  Trunk              Fix Committed             Medium      Nischal Sheth

Bug Description

contrail 3.0-2697.

core file: 10.84.5.112:/cs-shared/bugs/1534247/

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-control'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007faecb259cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007faecb259cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007faecb25d0d8 in __GI_abort () at abort.c:89
#2 0x00007faecb252b86 in __assert_fail_base (fmt=0x7faecb3a3830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0xc6a75d "count == 1", file=file@entry=0xc6a7e0 "controller/src/bgp/bgp_server.cc", line=line@entry=367,
    function=function@entry=0xc6aae0 <BgpServer::UnregisterPeer(BgpPeer*)::__PRETTY_FUNCTION__> "void BgpServer::UnregisterPeer(BgpPeer*)")
    at assert.c:92
#3 0x00007faecb252c32 in __GI___assert_fail (assertion=0xc6a75d "count == 1", file=0xc6a7e0 "controller/src/bgp/bgp_server.cc", line=367,
    function=0xc6aae0 <BgpServer::UnregisterPeer(BgpPeer*)::__PRETTY_FUNCTION__> "void BgpServer::UnregisterPeer(BgpPeer*)") at assert.c:101
#4 0x00000000007f2c2b in BgpServer::UnregisterPeer (this=0x29b7c80, peer=peer@entry=0x7fae78027910) at controller/src/bgp/bgp_server.cc:367
#5 0x000000000078c4d2 in PostCloseRelease (this=0x7fae78027910) at controller/src/bgp/bgp_peer.cc:711
#6 BgpPeer::DeleteActor::Destroy (this=0x7fae78016ef0) at controller/src/bgp/bgp_peer.cc:245
#7 0x0000000000bdacc7 in LifetimeManager::DeleteExecutor (this=<optimized out>, actor_ref=...) at controller/src/base/lifetime.cc:228
#8 0x0000000000bdb7d0 in operator() (a1=..., p=<optimized out>, this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:165
#9 operator()<bool, boost::_mfi::mf1<bool, LifetimeManager, LifetimeManager::LifetimeActorRef>, boost::_bi::list1<LifetimeManager::LifetimeActorRef&> > (a=<synthetic pointer>, f=..., this=<optimized out>) at /usr/include/boost/bind/bind.hpp:303
#10 operator()<LifetimeManager::LifetimeActorRef> (a1=<synthetic pointer>, this=<optimized out>)
    at /usr/include/boost/bind/bind_template.hpp:32
#11 boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<bool, boost::_mfi::mf1<bool, LifetimeManager, LifetimeManager::LifetimeActorRef>, boost::_bi::list2<boost::_bi::value<LifetimeManager*>, boost::arg<1> > >, bool, LifetimeManager::LifetimeActorRef>::invoke (
    function_obj_ptr=..., a0=...) at /usr/include/boost/function/function_template.hpp:132
#12 0x0000000000bdd65f in operator() (a0=..., this=0x7faec2accb40) at /usr/include/boost/function/function_template.hpp:767
#13 RunQueue (this=0x7faea80f6fd0) at controller/src/base/queue_task.h:81
#14 QueueTaskRunner<LifetimeManager::LifetimeActorRef, WorkQueue<LifetimeManager::LifetimeActorRef> >::Run (this=0x7faea80f6fd0)
    at controller/src/base/queue_task.h:64
#15 0x0000000000beeb30 in TaskImpl::execute (this=0x7faec4a79940) at controller/src/base/task.cc:238
#16 0x00007faecc030b3a in ?? () from /usr/lib/libtbb.so.2
#17 0x00007faecc02c816 in ?? () from /usr/lib/libtbb.so.2
#18 0x00007faecc02bf4b in ?? () from /usr/lib/libtbb.so.2
#19 0x00007faecc0280ff in ?? () from /usr/lib/libtbb.so.2
#20 0x00007faecc0282f9 in ?? () from /usr/lib/libtbb.so.2
#21 0x00007faecc24c182 in start_thread (arg=0x7faec2acd700) at pthread_create.c:312
#22 0x00007faecb31d47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

amit surana (asurana-t)
description: updated
amit surana (asurana-t) wrote :

Seen again on 3.0 build 2717. Core file added to the same directory.

(gdb) bt
#0 0x00007f658fbcbcc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f658fbcf0d8 in __GI_abort () at abort.c:89
#2 0x00007f658fbc4b86 in __assert_fail_base (fmt=0x7f658fd15830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0xcc5e3d "count == 1", file=file@entry=0xcc5ec0 "controller/src/bgp/bgp_server.cc", line=line@entry=370,
    function=function@entry=0xcc61c0 <BgpServer::UnregisterPeer(BgpPeer*)::__PRETTY_FUNCTION__> "void BgpServer::UnregisterPeer(BgpPeer*)") at assert.c:92
#3 0x00007f658fbc4c32 in __GI___assert_fail (assertion=0xcc5e3d "count == 1", file=0xcc5ec0 "controller/src/bgp/bgp_server.cc", line=370,
    function=0xcc61c0 <BgpServer::UnregisterPeer(BgpPeer*)::__PRETTY_FUNCTION__> "void BgpServer::UnregisterPeer(BgpPeer*)") at assert.c:101
#4 0x000000000085365b in BgpServer::UnregisterPeer (this=0x2f5a250, peer=peer@entry=0x7f656c5d54e0) at controller/src/bgp/bgp_server.cc:370
#5 0x00000000007ec392 in PostCloseRelease (this=0x7f656c5d54e0) at controller/src/bgp/bgp_peer.cc:718
#6 BgpPeer::DeleteActor::Destroy (this=0x7f656c169c80) at controller/src/bgp/bgp_peer.cc:247
#7 0x0000000000c2fba7 in LifetimeManager::DeleteExecutor (this=<optimized out>, actor_ref=...) at controller/src/base/lifetime.cc:229
#8 0x0000000000c2ff20 in operator() (a1=..., p=<optimized out>, this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:165
#9 operator()<bool, boost::_mfi::mf1<bool, LifetimeManager, LifetimeManager::LifetimeActorRef>, boost::_bi::list1<LifetimeManager::LifetimeActorRef&> > (
    a=<synthetic pointer>, f=..., this=<optimized out>) at /usr/include/boost/bind/bind.hpp:303
#10 operator()<LifetimeManager::LifetimeActorRef> (a1=<synthetic pointer>, this=<optimized out>) at /usr/include/boost/bind/bind_template.hpp:32
#11 boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<bool, boost::_mfi::mf1<bool, LifetimeManager, LifetimeManager::LifetimeActorRef>, boost::_bi::list2<boost::_bi::value<LifetimeManager*>, boost::arg<1> > >, bool, LifetimeManager::LifetimeActorRef>::invoke (function_obj_ptr=..., a0=...)
    at /usr/include/boost/function/function_template.hpp:132
#12 0x0000000000c3228f in operator() (a0=..., this=0x7f6578ff2b30) at /usr/include/boost/function/function_template.hpp:767
#13 RunQueue (this=0x7f6530023db0) at controller/src/base/queue_task.h:87
#14 QueueTaskRunner<LifetimeManager::LifetimeActorRef, WorkQueue<LifetimeManager::LifetimeActorRef> >::Run (this=0x7f6530023db0)
    at controller/src/base/queue_task.h:66
#15 0x0000000000c43adc in TaskImpl::execute (this=0x7f6589381c40) at controller/src/base/task.cc:253
#16 0x00007f65909a2b3a in ?? () from /usr/lib/libtbb.so.2
#17 0x00007f659099e816 in ?? () from /usr/lib/libtbb.so.2
#18 0x00007f659099df4b in ?? () from /usr/lib/libtbb.so.2
#19 0x00007f659099a0ff in ?? () from /usr/lib/libtbb.so.2
#20 0x00007f659099a2f9 in ?? () from /usr/lib/libtbb.so.2
#21 0x00007f6590bbe182 in start_thread (arg=0x7f6578ff3700) at pthread_create.c:312
#22 0x00007f658fc8f47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clon...


Nischal Sheth (nsheth) wrote :

Same root cause as bug 1538318.
Will add more defensive checks in control node code.
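
For readers unfamiliar with this kind of assertion, here is a minimal sketch of how a "count == 1" check on unregistration can fire. It assumes a registry keyed by peer name; PeerRegistry, Peer, and Erase are hypothetical names for illustration and are not the actual BgpServer code, and the real root cause is the one tracked in bug 1538318.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a peer; not the real BgpPeer class.
struct Peer {
    std::string name;
};

// Hypothetical registry keyed by peer name (an assumption for this sketch).
class PeerRegistry {
public:
    // Returns false if a peer with the same name is already registered.
    bool Register(Peer *peer) {
        return peers_.insert(std::make_pair(peer->name, peer)).second;
    }

    // Returns the number of entries erased: 1 if this exact peer was
    // registered, 0 if it was never registered or another peer owns the name.
    std::size_t Erase(const Peer *peer) {
        std::map<std::string, Peer *>::iterator it = peers_.find(peer->name);
        if (it == peers_.end() || it->second != peer) return 0;
        peers_.erase(it);
        return 1;
    }

    // Strict variant: aborts when nothing was erased, which is the shape of
    // failure seen in the backtrace (assertion "count == 1").
    void Unregister(const Peer *peer) {
        std::size_t count = Erase(peer);
        assert(count == 1);
        (void)count;  // silence unused-variable warnings under NDEBUG
    }

private:
    std::map<std::string, Peer *> peers_;
};
```

In this sketch, a second peer that failed to register under an already used name would trip the assertion when it is later unregistered during deletion.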

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/17885
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/17887
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17885
Committed: http://github.org/Juniper/contrail-controller/commit/e0586ec0d6d6328b3d7d31bec565ef2d8ca5f86c
Submitter: Zuul
Branch: master

commit e0586ec0d6d6328b3d7d31bec565ef2d8ca5f86c
Author: Nischal Sheth <email address hidden>
Date: Thu Feb 25 11:05:50 2016 -0800

Add defensive checks when processing bgp-peering links

Highlights:

- Add more instrumentation in BgpServer::[Register|Unregister]Peer
- No peering between bgpaas-server and non bgpaas-client
- No peering between non bgpaas-server and bgpaas-client
- No peering between bgpaas-server and bgpaas-client in different instance
- Add unit tests to verify new defensive checks

Change-Id: I9e680bc92d804b9a4977d24ebb5b269fa51fd7c8
Closes-Bug: 1534247
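
The three peering rules listed in the commit highlights can be sketched as a single predicate. This is an illustration only: RouterType, Endpoint, and PeeringAllowed are hypothetical names, and the sketch assumes each side of a bgp-peering link is described by a router type and a routing instance name. It is not the code from the commit.

```cpp
#include <string>

// Hypothetical classification of a BGP router (assumption for this sketch).
enum class RouterType { Other, BgpaasServer, BgpaasClient };

// Hypothetical description of one end of a bgp-peering link.
struct Endpoint {
    RouterType type;
    std::string instance;  // routing instance the router belongs to
};

// Returns true if a peering between the two endpoints passes the checks:
//  - a bgpaas-server may only peer with a bgpaas-client
//  - a bgpaas-client may only peer with a bgpaas-server
//  - a bgpaas-server and bgpaas-client must be in the same instance
bool PeeringAllowed(const Endpoint &local, const Endpoint &remote) {
    const bool local_bgpaas = local.type != RouterType::Other;
    const bool remote_bgpaas = remote.type != RouterType::Other;

    // Neither side is BGPaaS: the rules above do not apply.
    if (!local_bgpaas && !remote_bgpaas) return true;

    // At least one side is BGPaaS: require exactly one server and one client.
    const bool server_client =
        (local.type == RouterType::BgpaasServer &&
         remote.type == RouterType::BgpaasClient) ||
        (local.type == RouterType::BgpaasClient &&
         remote.type == RouterType::BgpaasServer);
    if (!server_client) return false;

    // Server and client must live in the same routing instance.
    return local.instance == remote.instance;
}
```

Rejecting a link up front in this way means no peer is created for a misconfigured peering, so nothing has to be unregistered for it later.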

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17887
Committed: http://github.org/Juniper/contrail-controller/commit/b1ed06abe45ac5c97d957b7547f0f03e34e68187
Submitter: Zuul
Branch: R3.0

commit b1ed06abe45ac5c97d957b7547f0f03e34e68187
Author: Nischal Sheth <email address hidden>
Date: Thu Feb 25 11:05:50 2016 -0800

Add defensive checks when processing bgp-peering links

Highlights:

- Add more instrumentation in BgpServer::[Register|Unregister]Peer
- No peering between bgpaas-server and non bgpaas-client
- No peering between non bgpaas-server and bgpaas-client
- No peering between bgpaas-server and bgpaas-client in different instance
- Add unit tests to verify new defensive checks

Change-Id: I9e680bc92d804b9a4977d24ebb5b269fa51fd7c8
Closes-Bug: 1534247
