Do not call BgpPeer::SetAdminState() directly from main thread

Bug #1570649 reported by Ananth Suryanarayana
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Juniper Openstack
Status tracked in Trunk
Trunk
Fix Committed
Medium
Ananth Suryanarayana

Bug Description

BgpPeer::SetAdminState() is suppose to be called only from the exclusive main thread. It calls StateMachine() routines which accesses its data structures that could be concurrently accessed by other tasks such as statemachine task. e.g. last_event_

Here is a crash, when last_event_ was being read, but got changed from another thread, resulting in test crash.

Thread 1 (Thread 0x2aaecbd59200 (LWP 32436)):
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:36
#1 0x00002aaecd8c2fa0 in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00002aaecd8c367c in std::string::assign(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x0000000001381531 in PeerEventInfo::set_last_event (this=0x7fffdc3dd270, val=...) at build/debug/bgp/peer_info_types.h:614
#4 0x000000000137f73e in StateMachine::reset_last_info (this=0x2aaef0051ef0) at controller/src/bgp/state_machine.cc:1787
#5 0x000000000137bc10 in StateMachine::SetAdminState (this=0x2aaef0051ef0, down=false) at controller/src/bgp/state_machine.cc:1143
#6 0x0000000001208c38 in BgpPeer::SetAdminState (this=0x2aaef0051910, down=false) at controller/src/bgp/bgp_peer.cc:940
#7 0x0000000001068730 in GracefulRestartTest::BgpPeerUp (this=0x415e420, peer=0x2aaef0051910) at controller/src/bgp/test/graceful_restart_test.cc:979
#8 0x000000000106f8bf in GracefulRestartTest::GracefulRestartTestRun (this=0x415e420) at controller/src/bgp/test/graceful_restart_test.cc:1349
#9 0x00000000010763a2 in GracefulRestartTest_GracefulRestart_Flap_Some_2_3_Test::TestBody (this=0x415e420) at controller/src/bgp/test/graceful_restart_test.cc:1826
#10 0x000000000195da7e in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x415e420, method=&virtual testing::Test::TestBody(), location=0x1bc6033 "the test body") at /usr/src/gmock/gtest/src/gtest.cc:2078
#11 0x00000000019593a6 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x415e420, method=&virtual testing::Test::TestBody(), location=0x1bc6033 "the test body") at /usr/src/gmock/gtest/src/gtest.cc:2114
#12 0x0000000001941a37 in testing::Test::Run (this=0x415e420) at /usr/src/gmock/gtest/src/gtest.cc:2151
#13 0x0000000001942212 in testing::TestInfo::Run (this=0x3d0e510) at /usr/src/gmock/gtest/src/gtest.cc:2326
#14 0x00000000019428a2 in testing::TestCase::Run (this=0x3d0d0d0) at /usr/src/gmock/gtest/src/gtest.cc:2444
#15 0x0000000001948eec in testing::internal::UnitTestImpl::RunAllTests (this=0x3caa040) at /usr/src/gmock/gtest/src/gtest.cc:4261
#16 0x000000000195ea32 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x3caa040, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x1948c6e <testing::internal::UnitTestImpl::RunAllTests()>, location=0x1bc68f0 "auxiliary test code (environments or event listeners)") at /usr/src/gmock/gtest/src/gtest.cc:2078
#17 0x000000000195a01c in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x3caa040, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x1948c6e <testing::internal::UnitTestImpl::RunAllTests()>, location=0x1bc68f0 "auxiliary test code (environments or event listeners)") at /usr/src/gmock/gtest/src/gtest.cc:2114
#18 0x0000000001947bab in testing::UnitTest::Run (this=0x222e360 <testing::UnitTest::GetInstance()::instance>) at /usr/src/gmock/gtest/src/gtest.cc:3875
#19 0x000000000107d236 in RUN_ALL_TESTS () at build/include/gtest/gtest.h:2288
#20 0x000000000107aac4 in main (argc=3, argv=0x7fffdc3dea38) at controller/src/bgp/test/graceful_restart_test.cc:2419
(gdb)

Revision history for this message
Ananth Suryanarayana (anantha-l) wrote :

other tests such as bgp_server_test have the same issue as well

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/19335
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/19337
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19337
Committed: http://github.org/Juniper/contrail-controller/commit/186c8c64f47ad8656d6810038afa1c60e9d1cf6d
Submitter: Zuul
Branch: master

commit 186c8c64f47ad8656d6810038afa1c60e9d1cf6d
Author: Ananth Suryanarayana <email address hidden>
Date: Thu Apr 14 17:58:29 2016 -0700

Do not call BgpPeer::SetAdminState() from main thread

To avoid data corruption, add a work queue and use a task off bgp::Config
exclusive task. Code is generic and will support other such functionality
in future, as needed

Change-Id: I16a17b349bc5ac8adbc8777c2f7eb5a134785b4a
Closes-Bug: #1570649

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.