vrouter crashes on scaling VDNS, which in turn scales VN, IPAM and VMs

Bug #1560242 reported by manishkn
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series    Status         Importance  Assigned to  Milestone
R2.20     Fix Committed  Critical    Nipa
R2.21.x   Fix Committed  Critical    Nipa
R2.22.x   Fix Committed  Critical    Nipa
R3.0      Fix Committed  Critical    Nipa
Trunk     Fix Committed  Critical    Nipa

Bug Description

On scaling VDNS, which in turn required scaling VN, IPAM and VMs, I see a vrouter crash.

Setup details: 10.87.141.33
/var/crashes/core.contrail-vroute.2014.cmbu-gravity-10.1458474864

Contrail version : 3.0.0.0-2725

(gdb) bt
#0 0x0000000000ec575c in XmppConnection::SetTo(std::string const&) ()
#1 0x0000000000ee0bee in XmppStateMachine::OnMessage(XmppSession*, XmppStanza::XmppMessage const*) ()
#2 0x0000000000eca7c8 in XmppConnection::ReceiveMsg(XmppSession*, std::string const&) ()
#3 0x0000000000edae69 in XmppSession::OnRead(boost::asio::const_buffer) ()
#4 0x00000000010a7ca0 in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, TcpSession, boost::asio::const_buffer>, boost::_bi::list2<boost::_bi::value<SslSession*>, boost::arg<1> > >, void, boost::asio::const_buffer>::invoke(boost::detail::function::function_buffer&, boost::asio::const_buffer) ()
#5 0x00000000010a8a1b in SslSession::SslReader::Run() ()
#6 0x0000000001193eec in TaskImpl::execute() ()
#7 0x00007f288e87bb3a in ?? () from /usr/lib/libtbb.so.2
#8 0x00007f288e877816 in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f288e876f4b in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f288e8730ff in ?? () from /usr/lib/libtbb.so.2
#11 0x00007f288e8732f9 in ?? () from /usr/lib/libtbb.so.2
#12 0x00007f288ea97182 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007f288dd7047d in clone () from /lib/x86_64-linux-gnu/libc.so.6

core file is stored in /cs-shared/bugs/

Thanks
Manish Krishnan

manishkn (manishkn)
summary: - vrouter crashes on scaling VDNS which in-turn needed to scale , VN, IPAM
- and VM’s
+ vrouter crashes on scaling VDNS which in-turn scale , VN, IPAM and VM’s
manishkn (manishkn)
tags: added: vdns vrouter
Jeba Paulaiyan (jebap)
information type: Proprietary → Public
tags: added: blocker
Revision history for this message
Nipa (nipak) wrote :

(gdb) bt
#0 size (this=0x78) at /usr/include/c++/4.8/bits/basic_string.h:716
#1 XmppConnection::SetTo (this=0x0, to=...) at controller/src/xmpp/xmpp_connection.cc:180
#2 0x0000000000ee0bee in XmppStateMachine::OnMessage (this=0x7f287802db20, session=session@entry=0x7f286096a030, msg=msg@entry=0x7f287805b240) at controller/src/xmpp/xmpp_state_machine.cc:1377
#3 0x0000000000eca7c8 in XmppConnection::ReceiveMsg (this=0x7f287802d8b0, session=0x7f286096a030, msg=...) at controller/src/xmpp/xmpp_connection.cc:515
#4 0x0000000000edae69 in XmppSession::OnRead (this=0x7f286096a030, buffer=...) at controller/src/xmpp/xmpp_session.cc:300
#5 0x00000000010a7ca0 in call<SslSession*, boost::asio::const_buffer> (u=<optimized out>, b1=<synthetic pointer>, this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:156
#6 operator()<SslSession*> (u=<optimized out>, a1=..., this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:171
#7 operator()<boost::_mfi::mf1<void, TcpSession, boost::asio::const_buffer>, boost::_bi::list1<boost::asio::const_buffer&> > (a=<synthetic pointer>, f=..., this=<optimized out>)
    at /usr/include/boost/bind/bind.hpp:313
#8 operator()<boost::asio::const_buffer> (a1=<synthetic pointer>, this=<optimized out>) at /usr/include/boost/bind/bind_template.hpp:32
#9 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, TcpSession, boost::asio::const_buffer>, boost::_bi::list2<boost::_bi::value<SslSession*>, boost::arg<1> > >, void, boost::asio::const_buffer>::invoke (function_obj_ptr=..., a0=...) at /usr/include/boost/function/function_template.hpp:153
#10 0x00000000010a8a1b in operator() (a0=..., this=0x3badf28) at /usr/include/boost/function/function_template.hpp:767
#11 SslSession::SslReader::Run (this=0x3baded0) at controller/src/io/ssl_session.cc:25
#12 0x0000000001193eec in TaskImpl::execute (this=0x7f2887505340) at controller/src/base/task.cc:253
#13 0x00007f288e87bb3a in ?? () from /usr/lib/libtbb.so.2
#14 0x00007f288e877816 in ?? () from /usr/lib/libtbb.so.2
#15 0x00007f288e876f4b in ?? () from /usr/lib/libtbb.so.2
#16 0x00007f288e8730ff in ?? () from /usr/lib/libtbb.so.2
#17 0x00007f288e8732f9 in ?? () from /usr/lib/libtbb.so.2
#18 0x00007f288ea97182 in start_thread (arg=0x7f2886f32700) at pthread_create.c:312
#19 0x00007f288dd7047d in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#20 0x0000000000000000 in ?? ()
(gdb) f 2
#2 0x0000000000ee0bee in XmppStateMachine::OnMessage (this=0x7f287802db20, session=session@entry=0x7f286096a030, msg=msg@entry=0x7f287805b240) at controller/src/xmpp/xmpp_state_machine.cc:1377
1377 in controller/src/xmpp/xmpp_state_machine.cc
(gdb) p this
$141 = (XmppStateMachine * const) 0x7f287802db20
(gdb) p this->connection_
$142 = (XmppConnection *) 0x7f287802d8b0
(gdb) p this->session_
$143 = (XmppSession *) 0x7f28608169a0
(gdb) p this->connection_->session_
$144 = (XmppSession *) 0x0
(gdb) p this->session_->connection_
$145 = (XmppConnection *) 0x7f287802d8b0
(gdb)

Nipa (nipak) wrote :

The session on which the read event fired in the io::ReaderTask task (0x7f286096a030) is not the session that the XMPP state machine currently points to in the xmpp::StateMachine task (0x7f28608169a0).

This indicates that a close was received on the older session (0x7f286096a030) and processed by the xmpp::StateMachine task, and the new session (0x7f28608169a0) was created, while we were still in the middle of receiving data on the older session. The two tasks therefore need to be made mutually exclusive, i.e. processing of data and processing of close must not run concurrently.
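
To make the race described above concrete, here is a minimal standalone sketch. It is not the Contrail code: the Connection, Session and StateMachine types below are simplified, hypothetical stand-ins, and a plain std::mutex is assumed as the exclusion mechanism purely for illustration. It shows a read arriving on a session that a close has already replaced, and how serializing the two handlers lets the read path detect the stale session instead of dereferencing a cleared back pointer.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-ins for illustration only; not the Contrail classes.
struct Connection {
  std::string to;
};

struct Session {
  Connection *connection = nullptr;  // back pointer, cleared when closed
};

class StateMachine {
 public:
  explicit StateMachine(Connection *conn)
      : conn_(conn), session_(std::make_shared<Session>()) {
    session_->connection = conn_;
  }

  std::shared_ptr<Session> CurrentSession() {
    std::lock_guard<std::mutex> lock(mutex_);
    return session_;
  }

  // Close handling: detach the old session and install a fresh one.
  void OnClose() {
    std::lock_guard<std::mutex> lock(mutex_);
    session_->connection = nullptr;
    session_ = std::make_shared<Session>();
    session_->connection = conn_;
  }

  // Read handling: the same lock makes it exclusive with OnClose().
  // Returns false when the message arrived on a session that is no
  // longer current, so the cleared back pointer is never dereferenced.
  bool OnMessage(const std::shared_ptr<Session> &from,
                 const std::string &to) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (from != session_ || from->connection == nullptr) {
      return false;
    }
    from->connection->to = to;
    return true;
  }

 private:
  std::mutex mutex_;
  Connection *conn_;
  std::shared_ptr<Session> session_;
};

// Replays the order of events from the analysis above on a single thread.
void DemoStaleRead() {
  Connection conn;
  StateMachine sm(&conn);
  std::shared_ptr<Session> old_session = sm.CurrentSession();
  assert(sm.OnMessage(old_session, "peer-a"));   // normal read
  sm.OnClose();                                  // close replaces the session
  assert(!sm.OnMessage(old_session, "peer-b"));  // stale read is rejected
  assert(conn.to == "peer-a");
}
```

In this sketch the reader holds a shared_ptr, so the old session object stays alive while it is still being read; whether the real code manages session lifetime this way is not stated in this report.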

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20074
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20074
Committed: http://github.org/Juniper/contrail-controller/commit/3532c2fbb7f3b127dccf35451389c042a981f167
Submitter: Zuul
Branch: master

commit 3532c2fbb7f3b127dccf35451389c042a981f167
Author: Nipa Kumar <email address hidden>
Date: Tue May 10 16:01:32 2016 -0700

Add mutual exclusion on Agent between xmpp::StateMachine and io::ReaderTask.

TCP close event on a session is handled by xmpp::StateMachine task and
reads handled by io::ReaderTask, hence both tasks can be accessing the
same session at the same time and xmpp::StateMachine destroying the session
while io::ReaderTask reading it.

Change-Id: I09fbdf9b9ea6c31fb6072e7e2f8d40510f32dc03
Closes-Bug:1560242
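
The commit message above describes mutual exclusion between two named tasks. The following is a rough standalone model of that idea, not the actual Contrail scheduler or its API: ExclusionScheduler, SetExclusive, TryStart and Finish are hypothetical names invented for this sketch. A task is allowed to start only while no task it has been declared exclusive with is running.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of policy-based task exclusion; illustration only.
class ExclusionScheduler {
 public:
  // Declare that tasks a and b must never run at the same time.
  void SetExclusive(const std::string &a, const std::string &b) {
    policy_[a].insert(b);
    policy_[b].insert(a);
  }

  // Start the task unless a task it excludes is currently running.
  bool TryStart(const std::string &task) {
    std::map<std::string, std::set<std::string> >::const_iterator it =
        policy_.find(task);
    if (it != policy_.end()) {
      for (std::set<std::string>::const_iterator other = it->second.begin();
           other != it->second.end(); ++other) {
        if (running_.count(*other) > 0) {
          return false;  // caller would have to defer and retry later
        }
      }
    }
    running_.insert(task);
    return true;
  }

  // Mark one running instance of the task as finished.
  void Finish(const std::string &task) {
    std::multiset<std::string>::iterator it = running_.find(task);
    if (it != running_.end()) {
      running_.erase(it);
    }
  }

 private:
  std::map<std::string, std::set<std::string> > policy_;
  std::multiset<std::string> running_;
};

// Uses the two task names from the commit message as plain labels.
void DemoExclusion() {
  ExclusionScheduler sched;
  sched.SetExclusive("xmpp::StateMachine", "io::ReaderTask");
  assert(sched.TryStart("io::ReaderTask"));
  assert(!sched.TryStart("xmpp::StateMachine"));  // blocked while reading
  sched.Finish("io::ReaderTask");
  assert(sched.TryStart("xmpp::StateMachine"));   // close can now proceed
}
```

Compared with a lock taken inside the handlers, a policy like this keeps the exclusion rule in the scheduler, so neither handler needs to know about the other.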

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20106
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20107
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/20108
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20109
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20109
Committed: http://github.org/Juniper/contrail-controller/commit/7a1437607ee3fffcefc8ce46b129913cd3f9ca68
Submitter: Zuul
Branch: R2.22.x

commit 7a1437607ee3fffcefc8ce46b129913cd3f9ca68
Author: Nipa Kumar <email address hidden>
Date: Tue May 10 16:01:32 2016 -0700

Add mutual exclusion on Agent between xmpp::StateMachine and io::ReaderTask.

TCP close event on a session is handled by xmpp::StateMachine task and
reads handled by io::ReaderTask, hence both tasks can be accessing the
same session at the same time and xmpp::StateMachine destroying the session
while io::ReaderTask reading it.

Change-Id: I09fbdf9b9ea6c31fb6072e7e2f8d40510f32dc03
Closes-Bug:1560242

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20108
Committed: http://github.org/Juniper/contrail-controller/commit/4303b102a6b612ce1a9dd9f442c8b713e3aeb81b
Submitter: Zuul
Branch: R2.21.x

commit 4303b102a6b612ce1a9dd9f442c8b713e3aeb81b
Author: Nipa Kumar <email address hidden>
Date: Tue May 10 16:01:32 2016 -0700

Add mutual exclusion on Agent between xmpp::StateMachine and io::ReaderTask.

TCP close event on a session is handled by xmpp::StateMachine task and
reads handled by io::ReaderTask, hence both tasks can be accessing the
same session at the same time and xmpp::StateMachine destroying the session
while io::ReaderTask reading it.

Change-Id: I09fbdf9b9ea6c31fb6072e7e2f8d40510f32dc03
Closes-Bug:1560242

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20106
Committed: http://github.org/Juniper/contrail-controller/commit/a5dffe3176580125eb24545f6dcf9a2cb0199db5
Submitter: Zuul
Branch: R3.0

commit a5dffe3176580125eb24545f6dcf9a2cb0199db5
Author: Nipa Kumar <email address hidden>
Date: Tue May 10 16:01:32 2016 -0700

Add mutual exclusion on Agent between xmpp::StateMachine and io::ReaderTask.

TCP close event on a session is handled by xmpp::StateMachine task and
reads handled by io::ReaderTask, hence both tasks can be accessing the
same session at the same time and xmpp::StateMachine destroying the session
while io::ReaderTask reading it.

Change-Id: I09fbdf9b9ea6c31fb6072e7e2f8d40510f32dc03
Closes-Bug:1560242

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20107
Committed: http://github.org/Juniper/contrail-controller/commit/ce74bc88a505e2b27898509d25ca952843f77163
Submitter: Zuul
Branch: R2.20

commit ce74bc88a505e2b27898509d25ca952843f77163
Author: Nipa Kumar <email address hidden>
Date: Tue May 10 16:01:32 2016 -0700

Add mutual exclusion on Agent between xmpp::StateMachine and io::ReaderTask.

TCP close event on a session is handled by xmpp::StateMachine task and
reads handled by io::ReaderTask, hence both tasks can be accessing the
same session at the same time and xmpp::StateMachine destroying the session
while io::ReaderTask reading it.

Change-Id: I09fbdf9b9ea6c31fb6072e7e2f8d40510f32dc03
Closes-Bug:1560242
