[3.0.2.0-34] Agent crash @ AgentRouteTable::RetryDelete

Bug #1579180 reported by chhandak
This bug affects 1 person
Affects              Status          Importance   Assigned to            Milestone
Juniper Openstack    Status tracked in Trunk
  R3.0               Fix Committed   Critical     Prabhjot Singh Sethi
  Trunk              Fix Committed   Critical     Prabhjot Singh Sethi

Bug Description

Observed the crash while deleting logical interface config in a scale setup.
The build binaries were replaced with patch 9. Both the agent and tor agent binaries were replaced.

Backtrace
----------
(gdb) bt
#0 0x00007f13f072ccc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f13f07300d8 in __GI_abort () at abort.c:89
#2 0x00007f13f0725b86 in __assert_fail_base (fmt=0x7f13f0876830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x11da280 "node_algorithms::inited(to_insert)", file=file@entry=0x11da218 "/usr/include/boost/intrusive/list.hpp", line=line@entry=270,
    function=function@entry=0x12bb000 <boost::intrusive::list_impl<boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<DBEntryBase, boost::intrusive::list_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &DBEntryBase::chg_list_>, unsigned long, true> >::push_back(DBEntryBase&)::__PRETTY_FUNCTION__> "void boost::intrusive::list_impl<Config>::push_back(boost::intrusive::list_impl<Config>::reference) [with Config = boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<DBEntryBase, b"...) at assert.c:92
#3 0x00007f13f0725c32 in __GI___assert_fail (assertion=0x11da280 "node_algorithms::inited(to_insert)", file=0x11da218 "/usr/include/boost/intrusive/list.hpp", line=270,
    function=0x12bb000 <boost::intrusive::list_impl<boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<DBEntryBase, boost::intrusive::list_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &DBEntryBase::chg_list_>, unsigned long, true> >::push_back(DBEntryBase&)::__PRETTY_FUNCTION__> "void boost::intrusive::list_impl<Config>::push_back(boost::intrusive::list_impl<Config>::reference) [with Config = boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<DBEntryBase, b"...) at assert.c:101
#4 0x000000000109af58 in push_back (value=..., this=<optimized out>) at /usr/include/boost/intrusive/list.hpp:270
#5 DBTablePartBase::Notify (this=0x7f13cd422440, entry=0x5d50) at controller/src/db/db_table_partition.cc:25
#6 0x00000000009ccdcd in AgentRouteTable::RetryDelete (this=0x7f13b0ab1da0) at controller/src/vnsw/agent/oper/agent_route.cc:509
#7 0x0000000001098b7a in DBTableBase::Unregister (this=0x7f13b0ab1da0, listener=1) at controller/src/db/db_table.cc:186
#8 0x0000000000db714b in KSyncDBObject::UnregisterDb (this=this@entry=0x7f13e5862420, table=<optimized out>) at controller/src/ksync/ksync_object.cc:319
#9 0x0000000000cecbfc in RouteKSyncObject::~RouteKSyncObject (this=0x7f13e5862420, __in_chrg=<optimized out>)
    at controller/src/vnsw/agent/vrouter/ksync/route_ksync.cc:723
#10 0x0000000000cecc89 in RouteKSyncObject::~RouteKSyncObject (this=0x7f13e5862420, __in_chrg=<optimized out>)
    at controller/src/vnsw/agent/vrouter/ksync/route_ksync.cc:725
#11 0x0000000000dbb9bf in KSyncObjectManager::Process (this=0x7f13cd487e90, event=0x7f13c57f2e10) at controller/src/ksync/ksync_object.cc:1520
#12 0x0000000000dc08af in operator() (a0=0x7f13c57f2e10, this=0x7f13e89adb30) at /usr/include/boost/function/function_template.hpp:767
#13 RunQueue (this=0x7f13c52fae80) at controller/src/base/queue_task.h:87
#14 QueueTaskRunner<KSyncObjectEvent*, WorkQueue<KSyncObjectEvent*> >::Run (this=0x7f13c52fae80) at controller/src/base/queue_task.h:66
#15 0x00000000011a847f in TaskImpl::execute (this=0x7f13e9f4ae40) at controller/src/base/task.cc:261
#16 0x00007f13f12fbb3a in ?? () from /usr/lib/libtbb.so.2
#17 0x00007f13f12f7816 in ?? () from /usr/lib/libtbb.so.2
#18 0x00007f13f12f6f4b in ?? () from /usr/lib/libtbb.so.2
#19 0x00007f13f12f30ff in ?? () from /usr/lib/libtbb.so.2
#20 0x00007f13f12f32f9 in ?? () from /usr/lib/libtbb.so.2
#21 0x00007f13f1517182 in start_thread (arg=0x7f13e89ae700) at pthread_create.c:312
#22 0x00007f13f07f047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

chhandak (chhandak)
Changed in juniperopenstack:
importance: Undecided → Critical
assignee: nobody → Hari Prasad Killi (haripk)
Jeba Paulaiyan (jebap)
information type: Proprietary → Public
chhandak (chhandak) wrote :

Copied the core to /auto/cores/1579180 on ubuntu-build04:

-rwxrwxrwx 1 chhandak epbg 651563008 May 6 11:39 core.contrail-vroute.23837.5b7s4.1462558619

Prabhjot Singh Sethi (prabhjot) wrote :

The issue is caused by DBEntry Notify being triggered in tasks other than the db::DBTable task.

This results in parallel access to the change list, so pushing the entry onto the change list fails.
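The assertion in the backtrace, `node_algorithms::inited(to_insert)`, fires when an entry whose list hook is not in its default (unlinked) state is pushed onto a Boost.Intrusive list. The sketch below mimics that invariant without Boost, as a rough illustration only. `Entry` and `ChangeList` are hypothetical names, not the agent's classes, and the list returns false where the real code asserts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a DB entry that embeds its own list hook.
struct Entry {
    Entry *prev = nullptr;
    Entry *next = nullptr;
    // An unlinked hook has null pointers, similar to a default-initialized
    // safe-mode hook.
    bool linked() const { return next != nullptr; }
};

// Hypothetical change list: a circular doubly linked list with a sentinel.
class ChangeList {
public:
    ChangeList() { head_.prev = head_.next = &head_; }
    ChangeList(const ChangeList &) = delete;
    ChangeList &operator=(const ChangeList &) = delete;

    // Appends the entry. Returns false if the entry is already linked,
    // which is the condition the real list reports by asserting.
    bool PushBack(Entry *e) {
        if (e->linked()) return false;
        e->prev = head_.prev;
        e->next = &head_;
        head_.prev->next = e;
        head_.prev = e;
        ++size_;
        return true;
    }

    // Removes the first entry and resets its hook so it can be queued again.
    Entry *PopFront() {
        if (head_.next == &head_) return nullptr;
        Entry *e = head_.next;
        head_.next = e->next;
        e->next->prev = &head_;
        e->prev = e->next = nullptr;
        --size_;
        return e;
    }

    std::size_t size() const { return size_; }

private:
    Entry head_;
    std::size_t size_ = 0;
};

// Demonstrates that a second enqueue of the same entry is rejected.
bool DoubleEnqueueIsRejected() {
    Entry e;
    ChangeList list;
    return list.PushBack(&e) && !list.PushBack(&e) && list.size() == 1;
}
```

The sketch is single-threaded on purpose. With two tasks calling the enqueue path at the same time and no common serialization, the "already linked" check and the pointer updates can interleave, which is consistent with the parallel access described above.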

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20049
Submitter: Prabhjot Singh Sethi (<email address hidden>)

Jeba Paulaiyan (jebap)
tags: added: blocker
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20089
Submitter: Prabhjot Singh Sethi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20049
Committed: http://github.org/Juniper/contrail-controller/commit/5141aeee5925ee0d1fa3d96ad2a422d3259a029b
Submitter: Zuul
Branch: R3.0

commit 5141aeee5925ee0d1fa3d96ad2a422d3259a029b
Author: Prabhjot Singh Sethi <email address hidden>
Date: Tue May 10 16:09:57 2016 +0530

Fix Agent crash on AgentRouteTable::RetryDelete

Issue:
------
unregister of RouteTable is causing Notify on VRF Entry
outside the context of db::DBTable task, and if this
happens in parallel from two different task running in
parallel can cause change list append issues with tries
for double enqueues

Fix:
----
Do not notify entry in any other context other than
db::DBTable task

Closes-Bug: 1579180
Related-Bug: 1574958
Change-Id: Ied06b53500f56a68a8e2443f704a647773686de2
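The rule stated in the commit message is that an entry is notified only from the db::DBTable task. One generic way to honor such a rule, not necessarily what this commit does, is to let other contexts record a request and have a single owner context apply it. The sketch below assumes that approach. `DeferredNotifier`, `Request`, and `Drain` are hypothetical names, and integer ids stand in for entries.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <set>

// Hypothetical helper: any context may request a notification, but only
// the single owner context (the analogue of the DB task) touches the
// change list.
class DeferredNotifier {
public:
    // Safe to call from any thread; it only records the request.
    void Request(int entry_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(entry_id);
    }

    // Must be called from the owner context only. Moves pending requests
    // onto the change list and returns how many entries were newly added.
    // Duplicate requests for an entry already on the list are dropped.
    std::size_t Drain() {
        std::deque<int> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        std::size_t added = 0;
        for (int id : batch) {
            if (change_list_.insert(id).second) ++added;
        }
        return added;
    }

private:
    std::mutex mutex_;            // guards pending_ only
    std::deque<int> pending_;     // requests from arbitrary contexts
    std::set<int> change_list_;   // touched by the owner context only
};
```

For example, requesting ids 7, 7, and 9 and then calling `Drain()` adds two entries, because the repeated request for 7 is absorbed instead of becoming a double enqueue.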

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20089
Committed: http://github.org/Juniper/contrail-controller/commit/245d635d09b78433b65883f42c8d36ceacc6c6ae
Submitter: Zuul
Branch: master

commit 245d635d09b78433b65883f42c8d36ceacc6c6ae
Author: Prabhjot Singh Sethi <email address hidden>
Date: Tue May 10 16:09:57 2016 +0530

Fix Agent crash on AgentRouteTable::RetryDelete

Issue:
------
unregister of RouteTable is causing Notify on VRF Entry
outside the context of db::DBTable task, and if this
happens in parallel from two different task running in
parallel can cause change list append issues with tries
for double enqueues

Fix:
----
Do not notify entry in any other context other than
db::DBTable task

Closes-Bug: 1579180
Related-Bug: 1574958
Change-Id: Ied06b53500f56a68a8e2443f704a647773686de2
(cherry picked from commit 5141aeee5925ee0d1fa3d96ad2a422d3259a029b)
