contrail-control crash in BgpServer::ConfigUpdater::ProcessNeighborConfig

Bug #1575517 reported by vageesan
Affects             Status                    Importance  Assigned to    Milestone
Juniper Openstack   Status tracked in Trunk
  R3.0              Fix Committed             High        Nischal Sheth
  Trunk             Fix Committed             High        Nischal Sheth

Bug Description

The contrail-control binary crashed with the following backtrace.

The core file is in 10.84.5.112:/cs-shared/bugs/<bug-id>/

Build: 3.0.2.0-32~juno

(gdb) bt
#0 StateMachine::Shutdown (this=0x6d65747379736c69, subcode=subcode@entry=3)
    at controller/src/bgp/state_machine.cc:1125
#1 0x00000000008199e1 in BgpPeer::Clear (this=<optimized out>, subcode=subcode@entry=3)
    at controller/src/bgp/bgp_peer.cc:806
#2 0x0000000000897be8 in BgpServer::InsertPeer (this=0x27446a0, remote=...,
    peer=peer@entry=0x7fe49c2d9b40) at controller/src/bgp/bgp_server.cc:404
#3 0x000000000089caa8 in BgpServer::ConfigUpdater::ProcessNeighborConfig (this=0x274cfd0,
    neighbor_config=0x7fe49c27eb10, event=BgpConfigManager::CFG_ADD)
    at controller/src/bgp/bgp_server.cc:186
#4 0x00000000007fad09 in operator() (a1=BgpConfigManager::CFG_ADD, a0=0x7fe49c27eb10,
    this=0x274cb20) at /usr/include/boost/function/function_template.hpp:767
#5 BgpConfigManager::Notify<BgpNeighborConfig> (this=this@entry=0x274cad0,
    config=config@entry=0x7fe49c27eb10, event=event@entry=BgpConfigManager::CFG_ADD)
    at controller/src/bgp/bgp_config.cc:434
#6 0x000000000090b34c in BgpIfmapInstanceConfig::AddNeighbor (this=0x7fe4a400e5d0,
    manager=manager@entry=0x274cad0, neighbor=0x7fe49c27eb10)
    at controller/src/bgp/bgp_config_ifmap.cc:1107
#7 0x000000000090cf90 in BgpIfmapPeeringConfig::Update (this=this@entry=0x7fe434178cd0,
    manager=manager@entry=0x274cad0, peering=<optimized out>)
    at controller/src/bgp/bgp_config_ifmap.cc:461
#8 0x0000000000911710 in BgpIfmapConfigManager::ProcessBgpPeering (this=0x274cad0, delta=...)
    at controller/src/bgp/bgp_config_ifmap.cc:2096
#9 0x00000000009052d7 in operator() (a0=..., this=0x274cf78)
    at /usr/include/boost/function/function_template.hpp:767
#10 BgpIfmapConfigManager::ProcessChanges (this=this@entry=0x274cad0, change_list=...)
    at controller/src/bgp/bgp_config_ifmap.cc:2110
#11 0x0000000000905394 in BgpIfmapConfigManager::ConfigHandler (this=0x274cad0)
    at controller/src/bgp/bgp_config_ifmap.cc:2124
#12 0x000000000068d2bf in operator() (this=<optimized out>)
    at /usr/include/boost/function/function_template.hpp:767
#13 TaskTrigger::WorkerTask::Run (this=0x7fe49c192e40) at controller/src/base/task_trigger.cc:19
#14 0x000000000068807c in TaskImpl::execute (this=0x7fe4bbd3e740) at controller/src/base/task.cc:256
#15 0x00007fe4b9eb2b3a in ?? () from /usr/lib/libtbb.so.2
#16 0x00007fe4b9eae816 in ?? () from /usr/lib/libtbb.so.2
#17 0x00007fe4b9eadf4b in ?? () from /usr/lib/libtbb.so.2
#18 0x00007fe4b9eaa0ff in ?? () from /usr/lib/libtbb.so.2
#19 0x00007fe4b9eaa2f9 in ?? () from /usr/lib/libtbb.so.2
#20 0x00007fe4ba0ce182 in start_thread (arg=0x7fe4b1953700) at pthread_create.c:312
#21 0x00007fe4b919f47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)
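
A note on frame #0: the `this` value 0x6d65747379736c69 does not look like a heap address. On x86_64 (little-endian) its eight bytes are all printable ASCII, which suggests the StateMachine pointer was read from memory that had been freed and reused for string data. The sketch below decodes such a value; the helper name DecodePointerAsAscii is hypothetical and exists only for this illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: interpret a 64-bit value as eight little-endian bytes
// and render them as ASCII, substituting '.' for non-printable bytes.
std::string DecodePointerAsAscii(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        // Lowest-order byte first, matching x86_64 memory layout.
        unsigned char byte =
            static_cast<unsigned char>((value >> (8 * i)) & 0xff);
        out += (byte >= 0x20 && byte < 0x7f) ? static_cast<char>(byte) : '.';
    }
    return out;
}
```

Calling DecodePointerAsAscii(0x6d65747379736c69ULL) yields the text "ilsystem", a string fragment rather than an object address.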

amit surana (asurana-t) wrote :

The crash was seen on a scaled BGPaaS setup. 200 BGPaaS clients were configured; the crash occurred while some new clients were being added and deleted in parallel.

Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Nischal Sheth (nsheth)
amit surana (asurana-t) wrote :

Restarting the config services consistently reproduces this crash (220 BGPaaS clients configured).

Nischal Sheth (nsheth)
information type: Proprietary → Private
information type: Private → Public
Jeba Paulaiyan (jebap)
tags: added: blocker
Nischal Sheth (nsheth)
tags: added: bgpaas
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20078
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20079
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20079
Committed: http://github.org/Juniper/contrail-controller/commit/620b8ca43906c04cf27884b4737ff4f99dea5660
Submitter: Zuul
Branch: master

commit 620b8ca43906c04cf27884b4737ff4f99dea5660
Author: Nischal Sheth <email address hidden>
Date: Tue May 10 14:24:24 2016 -0700

Handle recreate of deleted BGPaaS peer prior to destroy

The basic code path to handle this scenario is the same as for regular
peers. However, there's a bug in updating EndpointToBgpPeerList in
the BgpServer. A BGPaaS peer correctly gets removed from this
list when it's deleted. However, it gets added back when processing
the new config if the peer has not been destroyed by then. This can
later result in access to freed memory if the source port is reused.

Fix consists of the following changes:

- Return NULL from PeerManager::PeerLocate if there's a deleted peer
- Do not manipulate EndpointToBgpPeerList if PeerLocate returns NULL
- Insert the peer into EndpointToBgpPeerList when it's resurrected

Change-Id: I8f46878dbabcc69079ce85f20588fd766143d70e
Closes-Bug: 1575517
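
The three changes listed in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual contrail-controller code: the Peer and Server types, the pending_ map, and the method names other than PeerLocate are hypothetical stand-ins, and the endpoint list is modelled as a plain map keyed by source port.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for a BGPaaS peer.
struct Peer {
    Peer(const std::string &n, int p) : name(n), source_port(p), deleted(false) {}
    std::string name;
    int source_port;
    bool deleted;  // marked deleted, destroy still pending
};

// Hypothetical, simplified stand-in for BgpServer plus PeerManager.
class Server {
public:
    // Find or create a peer. Returns NULL while a deleted peer with the
    // same name is still waiting to be destroyed (change 1).
    Peer *PeerLocate(const std::string &name, int port) {
        auto it = peers_.find(name);
        if (it != peers_.end()) {
            if (it->second->deleted) return nullptr;
            return it->second.get();
        }
        Peer *peer = new Peer(name, port);
        peers_[name].reset(peer);
        return peer;
    }

    void ProcessNeighborAdd(const std::string &name, int port) {
        Peer *peer = PeerLocate(name, port);
        if (peer == nullptr) {
            // Change 2: remember the config, leave the endpoint map alone.
            pending_[name] = port;
            return;
        }
        endpoint_to_peer_[peer->source_port] = peer;
    }

    void ProcessNeighborDelete(const std::string &name) {
        auto it = peers_.find(name);
        if (it == peers_.end()) return;
        it->second->deleted = true;
        endpoint_to_peer_.erase(it->second->source_port);
        pending_.erase(name);
    }

    // Called when the deferred destroy of a deleted peer completes.
    void DestroyPeer(const std::string &name) {
        auto it = peers_.find(name);
        if (it == peers_.end() || !it->second->deleted) return;
        peers_.erase(it);  // frees the old Peer object
        auto cfg = pending_.find(name);
        if (cfg == pending_.end()) return;
        int port = cfg->second;
        pending_.erase(cfg);
        // Change 3: the resurrected peer enters the endpoint map only now.
        ProcessNeighborAdd(name, port);
    }

    Peer *FindByPort(int port) const {
        auto it = endpoint_to_peer_.find(port);
        return it == endpoint_to_peer_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, std::unique_ptr<Peer> > peers_;
    std::map<std::string, int> pending_;      // config re-added before destroy
    std::map<int, Peer *> endpoint_to_peer_;  // never holds a deleted peer
};
```

In this sketch, adding a peer, deleting it, and re-adding it before DestroyPeer runs leaves the endpoint map empty for that port, so a later lookup cannot return a pointer to an object that is about to be freed. Once DestroyPeer completes, the lookup returns the newly created peer.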

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20078
Committed: http://github.org/Juniper/contrail-controller/commit/547e7728f0065e4c7db6f2c3d43844d49490ee78
Submitter: Zuul
Branch: R3.0

commit 547e7728f0065e4c7db6f2c3d43844d49490ee78
Author: Nischal Sheth <email address hidden>
Date: Tue May 10 14:24:24 2016 -0700

Handle recreate of deleted BGPaaS peer prior to destroy

The basic code path to handle this scenario is the same as for regular
peers. However, there's a bug in updating EndpointToBgpPeerList in
the BgpServer. A BGPaaS peer correctly gets removed from this
list when it's deleted. However, it gets added back when processing
the new config if the peer has not been destroyed by then. This can
later result in access to freed memory if the source port is reused.

Fix consists of the following changes:

- Return NULL from PeerManager::PeerLocate if there's a deleted peer
- Do not manipulate EndpointToBgpPeerList if PeerLocate returns NULL
- Insert the peer into EndpointToBgpPeerList when it's resurrected

Change-Id: I8f46878dbabcc69079ce85f20588fd766143d70e
Closes-Bug: 1575517
