2.23: vRouter crash @AgentDnsXmppChannel::WriteReadyCb

Bug #1561843 reported by amit surana
This bug affects 1 person
Affects              Status         Importance  Assigned to  Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.20              Fix Committed  High        Nipa
  R2.21.x            Fix Committed  High        Nipa
  R2.22.x            Fix Committed  High        Nipa
  R3.0               Fix Committed  High        Nipa
  Trunk              Fix Committed  High        Nipa

Bug Description

Build 2.23-126, CentOS 7.1, Juno

The agent crashed after a couple of hours of running the solution script. Backtrace:

#0 0x00002b98b73e95f7 in raise () from /lib64/libc.so.6
#1 0x00002b98b73eace8 in abort () from /lib64/libc.so.6
#2 0x00002b98b7429317 in __libc_message () from /lib64/libc.so.6
#3 0x00002b98b7431023 in _int_free () from /lib64/libc.so.6
#4 0x000000000186e2e7 in AgentDnsXmppChannel::WriteReadyCb (this=0x2b98f01abb60,
    msg=0x2b98e34096c0 "<?xml version=\"1.0\"?>\n<iq type=\"set\" from=\"contrail76.softlayer.com/dns\" to=\"<email address hidden>/dns-peer\" id=\"29969\">\n<dns transid=\"29969\">\n<update>\n<virtual-dns>default-domain:tenant322-"..., ec=...) at controller/src/vnsw/agent/controller/controller_dns.cc:81
#5 0x000000000187011a in boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>::operator() (this=0x2b99700d0550, p=0x2b98f01abb60,
    a1=0x2b98e34096c0 "<?xml version=\"1.0\"?>\n<iq type=\"set\" from=\"contrail76.softlayer.com/dns\" to=\"<email address hidden>/dns-peer\" id=\"29969\">\n<dns transid=\"29969\">\n<update>\n<virtual-dns>default-domain:tenant322-"..., a2=...) at /usr/include/boost/bind/mem_fn_template.hpp:280

Tags: vrouter soln
Nipa (nipak)
Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Nipa (nipak)
amit surana (asurana-t)
tags: added: soln
Nipa (nipak) wrote :

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal 6, Aborted.
#0 0x00002b3f887cd5f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install contrail-vrouter-agent-2.23-126.el7.centos.x86_64
(gdb) bt
#0 0x00002b3f887cd5f7 in raise () from /lib64/libc.so.6
#1 0x00002b3f887cece8 in abort () from /lib64/libc.so.6
#2 0x00002b3f8880d317 in __libc_message () from /lib64/libc.so.6
#3 0x00002b3f88815023 in _int_free () from /lib64/libc.so.6
#4 0x000000000186e2e7 in AgentDnsXmppChannel::WriteReadyCb(unsigned char*, boost::system::error_code const&) ()
#5 0x000000000187011a in boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>::operator()(AgentDnsXmppChannel*, unsigned char*, boost::system::error_code const&) const ()
#6 0x000000000186fdbc in void boost::_bi::list3<boost::_bi::value<AgentDnsXmppChannel*>, boost::_bi::value<unsigned char*>, boost::arg<1> >::operator()<boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>, boost::_bi::list1<boost::system::error_code const&> >(boost::_bi::type<void>, boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>&, boost::_bi::list1<boost::system::error_code const&>&, int) ()
#7 0x000000000186fa76 in void boost::_bi::bind_t<void, boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>, boost::_bi::list3<boost::_bi::value<AgentDnsXmppChannel*>, boost::_bi::value<unsigned char*>, boost::arg<1> > >::operator()<boost::system::error_code>(boost::system::error_code const&) ()
#8 0x000000000186f7a2 in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf2<void, AgentDnsXmppChannel, unsigned char*, boost::system::error_code const&>, boost::_bi::list3<boost::_bi::value<AgentDnsXmppChannel*>, boost::_bi::value<unsigned char*>, boost::arg<1> > >, void, boost::system::error_code const&>::invoke(boost::detail::function::function_buffer&, boost::system::error_code const&) ()
#9 0x0000000001ab9851 in boost::function1<void, boost::system::error_code const&>::operator()(boost::system::error_code const&) const ()
#10 0x0000000001ab8429 in XmppChannelMux::WriteReady(boost::system::error_code const&) ()
#11 0x0000000001a5d6e5 in XmppConnection::WriteReady() ()
#12 0x0000000001a73d01 in XmppSession::ProcessWriteReady() ()
#13 0x0000000001a6d876 in XmppConnectionManager::DequeueSession(boost::intrusive_ptr<TcpSession>) ()
#14 0x0000000001a7003e in boost::_mfi::mf1<bool, XmppConnectionManager, boost::intrusive_ptr<TcpSession> >::operator()(XmppConnectionManager*, boost::intrusive_ptr<TcpSession>) const ()
#15 0x0000000001a6fab6 in bool boost::_bi::list2<boost::_bi::value<XmppConnectionManager*>, boost::arg<1> >::operator()<bool, boost::_mfi::mf1<bool, XmppConnectionManager, boost::intrusive_ptr<TcpSession> >, boost::_bi::list1<boost::intrusive_ptr<TcpSession>&> >(boost::_bi::type<bool>, boost::_mfi::mf1<bool, XmppConnectionManager, boost::intrusive_ptr<TcpSessi...

Nipa (nipak) wrote :

This happens when the active connection from the DNS channel to the XMPP server has to be torn down, as indicated by Discovery.
UnRegisterWriteReady is not sufficient, because another write may be queued before the ManagedDelete of the object is triggered; ManagedDelete is what closes the TCP connection and sends a close to the XMPP state machine.

AgentDnsXmppChannel, however, is deleted as soon as Discovery indicates the change.

We need to defer the deletion of AgentDnsXmppChannel until after a TCP close has been sent to the XmppStateMachine, which ensures that no more writes can be triggered.
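
A minimal sketch of the deferred-deletion idea described in this comment, assuming a toy model rather than the real agent code. FakeXmppLayer, FakeDnsChannel, ChannelOwner and DeferredDeleteDemo are hypothetical names invented for illustration; none of them exist in contrail-controller, and the fix that was eventually merged may take a different approach.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the XMPP layer: it holds one write-ready
// callback and may invoke it at any time until a close is processed.
class FakeXmppLayer {
public:
    using WriteReadyCb = std::function<void()>;
    void RegisterWriteReady(WriteReadyCb cb) { cb_ = std::move(cb); }
    void FireWriteReady() { if (!closed_ && cb_) cb_(); }
    void ProcessClose() { closed_ = true; cb_ = nullptr; }
private:
    WriteReadyCb cb_;
    bool closed_ = false;
};

// Hypothetical channel: the registered callback captures `this`, so the
// object must outlive every callback the XMPP layer can still fire.
class FakeDnsChannel {
public:
    explicit FakeDnsChannel(FakeXmppLayer* xmpp) {
        xmpp->RegisterWriteReady([this] { ++write_ready_count_; });
    }
    int write_ready_count() const { return write_ready_count_; }
private:
    int write_ready_count_ = 0;
};

// Owner that only marks the channel for deletion on a discovery change
// and destroys it once the close has been processed.
class ChannelOwner {
public:
    explicit ChannelOwner(FakeXmppLayer* xmpp)
        : xmpp_(xmpp), channel_(new FakeDnsChannel(xmpp)) {}
    void OnDiscoveryChange() { pending_delete_ = true; }  // do not delete yet
    void OnXmppClose() {
        xmpp_->ProcessClose();                 // no callbacks after this point
        if (pending_delete_) channel_.reset(); // now safe to destroy
    }
    bool has_channel() const { return channel_ != nullptr; }
    FakeDnsChannel* channel() { return channel_.get(); }
private:
    FakeXmppLayer* xmpp_;
    std::unique_ptr<FakeDnsChannel> channel_;
    bool pending_delete_ = false;
};

// Returns the number of write-ready callbacks the channel handled while
// it was pending deletion, or -1 if the channel was never destroyed.
int DeferredDeleteDemo() {
    FakeXmppLayer xmpp;
    ChannelOwner owner(&xmpp);
    owner.OnDiscoveryChange();
    xmpp.FireWriteReady();  // late callback: the channel is still alive
    int seen = owner.channel()->write_ready_count();
    owner.OnXmppClose();
    xmpp.FireWriteReady();  // no-op: the layer dropped the callback on close
    return owner.has_channel() ? -1 : seen;
}
```

In this model, destroying the channel inside OnDiscoveryChange would leave the XMPP layer holding a callback that points at freed memory, which is the kind of ordering problem the comment describes.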

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/19001
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/19003
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19001
Committed: http://github.org/Juniper/contrail-controller/commit/b969692e4931195fb32e68e791cbaa1099e68c35
Submitter: Zuul
Branch: master

commit b969692e4931195fb32e68e791cbaa1099e68c35
Author: Nipa Kumar <email address hidden>
Date: Fri Apr 1 13:48:12 2016 -0700

Handling of socket block on a TCP write by AgentDnsXmppChannel.

Application should not free data sent over TCP socket as the TCP library takes care
of memory management of the data.

On a socket write block xmpp provides the facility to register a callback so application
can handle the case. In case of agent, the inherent TCP library buffering should be
sufficient and no action needed from the application as there is only one writer(DNS)
on the socket.

Change-Id: I73993d58ed84cdce97217038c3da17ac3f5142d2
Closes-Bug: 1561843
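
The ownership rule stated in this commit message can be sketched as follows. This is an illustrative model only: FakeTcpTransport and SendDnsUpdates are hypothetical names, and the assumption that the transport copies the caller's bytes into its own queue is made for the sake of the example, not taken from the contrail TCP library.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical transport that takes care of memory management for sent
// data by copying the caller's bytes into a queue it owns.
class FakeTcpTransport {
public:
    // Returns false when the simulated socket would block. The data is
    // queued either way, and the transport owns the queued copy.
    bool Send(const std::uint8_t* data, std::size_t len) {
        pending_.emplace_back(data, data + len);
        return pending_.size() <= kSocketCapacity;
    }
    // Simulates the socket draining; returns the number of bytes written.
    std::size_t Flush() {
        std::size_t written = 0;
        for (const auto& buf : pending_) written += buf.size();
        pending_.clear();  // the transport releases its own copies
        return written;
    }
private:
    static constexpr std::size_t kSocketCapacity = 1;
    std::deque<std::vector<std::uint8_t>> pending_;
};

// Single writer: sends two messages and takes no action when the second
// send reports a block. It never frees or re-sends the buffers, because
// the transport already holds its own copies.
std::size_t SendDnsUpdates() {
    FakeTcpTransport tcp;
    const std::string first = "<iq id=\"1\"/>";
    const std::string second = "<iq id=\"2\"/>";
    bool ok1 = tcp.Send(
        reinterpret_cast<const std::uint8_t*>(first.data()), first.size());
    bool ok2 = tcp.Send(
        reinterpret_cast<const std::uint8_t*>(second.data()), second.size());
    (void)ok1;
    (void)ok2;  // blocked here, but the transport's buffering absorbs it
    return tcp.Flush();
}
```

With a single writer on the socket, the blocked send needs no callback in this model: both messages are delivered when the transport drains, and each buffer is released exactly once by its owner.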

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/19003
Committed: http://github.org/Juniper/contrail-controller/commit/204b64f84bcec8d0997780b57627e465985bce96
Submitter: Zuul
Branch: R2.20

commit 204b64f84bcec8d0997780b57627e465985bce96
Author: Nipa Kumar <email address hidden>
Date: Fri Apr 1 13:48:12 2016 -0700

Change-Id: I73993d58ed84cdce97217038c3da17ac3f5142d2
Closes-Bug: 1561843

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/19045
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/19046
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/19047
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19047
Committed: http://github.org/Juniper/contrail-controller/commit/5a4edf0d73832ac1f32cbcee73804a95c0d6c5bf
Submitter: Zuul
Branch: R3.0

commit 5a4edf0d73832ac1f32cbcee73804a95c0d6c5bf
Author: Nipa Kumar <email address hidden>
Date: Fri Apr 1 13:48:12 2016 -0700

Change-Id: I73993d58ed84cdce97217038c3da17ac3f5142d2
Closes-Bug: 1561843

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/19046
Committed: http://github.org/Juniper/contrail-controller/commit/b34548e563ea6ff305615d638e72eb1a764cfd05
Submitter: Zuul
Branch: R2.22.x

commit b34548e563ea6ff305615d638e72eb1a764cfd05
Author: Nipa Kumar <email address hidden>
Date: Fri Apr 1 13:48:12 2016 -0700

Change-Id: I73993d58ed84cdce97217038c3da17ac3f5142d2
Closes-Bug: 1561843

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/19045
Committed: http://github.org/Juniper/contrail-controller/commit/d0bcada0a101b9982f48064cb6e5040fed102e87
Submitter: Zuul
Branch: R2.21.x

commit d0bcada0a101b9982f48064cb6e5040fed102e87
Author: Nipa Kumar <email address hidden>
Date: Fri Apr 1 13:48:12 2016 -0700

Change-Id: I73993d58ed84cdce97217038c3da17ac3f5142d2
Closes-Bug: 1561843
