snat: agent crash upon snat delete @ ServiceInstanceTable::Delete

Bug #1473597 reported by Senthilnathan Murugappan
Affects            Status          Importance  Assigned to             Milestone
Juniper Openstack  (status tracked in Trunk)
  R2.20            Fix Committed   High        Divakar Dharanalakota
  R2.20.x          Fix Committed   High        Divakar Dharanalakota
  Trunk            Fix Committed   High        Divakar Dharanalakota

Bug Description

Observed the agent crash below on R2.20-39. I have not been able to reproduce it.

The core file has been copied to /cs-shared/bugs/<bugid>/

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000092d5ca in std::_Rb_tree<std::string, std::pair<std::string const, int>, std::_Select1st<std::pair<std::string const, int> >, std::less<std::string>, std::allocator<std::pair<std::string const, int> > >::find (this=this@entry=0x7f44cc001438, __k=...) at /usr/include/c++/4.8/bits/stl_tree.h:1792
1792 iterator __j = _M_lower_bound(_M_begin(), _M_end(), __k);
(gdb) bt
#0 0x000000000092d5ca in std::_Rb_tree<std::string, std::pair<std::string const, int>, std::_Select1st<std::pair<std::string const, int> >, std::less<std::string>, std::allocator<std::pair<std::string const, int> > >::find (this=this@entry=0x7f44cc001438, __k=...) at /usr/include/c++/4.8/bits/stl_tree.h:1792
#1 0x00000000009297c8 in find (__x=..., this=<optimized out>) at /usr/include/c++/4.8/bits/stl_map.h:822
#2 IFMapNodeGet (node=0x7f44bc0fff10, this=0x7f44cc001410) at controller/src/vnsw/agent/oper/ifmap_dependency_manager.cc:231
#3 IFMapDependencyManager::SetObject (this=0x7f44cc001410, node=0x7f44bc0fff10, entry=0x0) at controller/src/vnsw/agent/oper/ifmap_dependency_manager.cc:256
#4 0x000000000099ed65 in ServiceInstanceTable::Delete (this=<optimized out>, entry=<optimized out>, request=<optimized out>) at controller/src/vnsw/agent/oper/service_instance.cc:695
#5 0x0000000000ea0d4c in DBTable::Input (this=0x7f44cc019d00, tbl_partition=0x7f44cc019e30, client=<optimized out>, req=0x7f44cc1936b0) at controller/src/db/db_table.cc:325
#6 0x0000000000ea0576 in DBPartition::QueueRunner::Run (this=0x7f44cc174db0) at controller/src/db/db_partition.cc:187
#7 0x0000000000f9ed00 in TaskImpl::execute (this=0x7f44d394bc40) at controller/src/base/task.cc:232
#8 0x00007f44dacc0b3a in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f44dacbc816 in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f44dacbbf4b in ?? () from /usr/lib/libtbb.so.2
#11 0x00007f44dacb80ff in ?? () from /usr/lib/libtbb.so.2
#12 0x00007f44dacb82f9 in ?? () from /usr/lib/libtbb.so.2
#13 0x00007f44daedc182 in start_thread (arg=0x7f44d3778700) at pthread_create.c:312
#14 0x00007f44da1b547d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)

Tags: vrouter
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/12710
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/12710
Committed: http://github.org/Juniper/contrail-controller/commit/782d8bdf8abab027129d6c10248a475a4bb11ce0
Submitter: Zuul
Branch: R2.20

commit 782d8bdf8abab027129d6c10248a475a4bb11ce0
Author: Divakar <email address hidden>
Date: Wed Jul 29 21:42:06 2015 +0530

Handling service-instance reuse

When a neutron router is added and deleted continuously in a loop, the
service instance object in the Agent is not handled properly. When a
service instance is delete-marked and a re-add arrives, the DB table
invokes OnChange on the DBEntry rather than Add. The service instance
code does not handle this case: IFMapNode is set in the DBEntry only as
part of Add. This leaves a stale IFMapNode entry in the DBEntry and
leads to a crash when the IFMap node graph is invoked as part of OnChange.

As a fix, IFMapNode is now set in the DBEntry in OnChange as well. Since
service_instance takes an intrusive pointer to IFMapNode, the redundant
node_ member is removed from the object.

Also, service_instance_test has been moved out of the flaky tests.

Change-Id: Ifc84091e23c5bc1a367d852574db286751b041f0
closes-bug: #1473597
closes-bug: #1474273
partial-bug: #1465413
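
The reuse scenario described in this commit message can be illustrated with a minimal, self-contained sketch. It is not the agent code: Node, Entry, Table and EntryTracksLatestNode are hypothetical names invented for illustration, and std::shared_ptr stands in for the intrusive pointer mentioned above. The sketch assumes only what the message states, namely that a re-add of a delete-marked entry is delivered as a change on the old entry rather than as a fresh add.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a config graph node (not the real IFMapNode).
struct Node {
    explicit Node(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Hypothetical stand-in for a DB entry that refers back to its node.
struct Entry {
    std::shared_ptr<Node> node;   // shared ownership keeps the node alive
    bool delete_marked = false;
};

// Toy table: a request for a key whose entry still exists (for example,
// one that is only delete-marked) is delivered as OnChange, not OnAdd.
class Table {
public:
    explicit Table(bool refresh_on_change)
        : refresh_on_change_(refresh_on_change) {}

    void AddOrChange(const std::string &key, std::shared_ptr<Node> node) {
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            OnAdd(key, std::move(node));
        } else {
            OnChange(it->second, std::move(node));
        }
    }

    void MarkDeleted(const std::string &key) {
        auto it = entries_.find(key);
        if (it != entries_.end()) it->second.delete_marked = true;
    }

    const Entry *Find(const std::string &key) const {
        auto it = entries_.find(key);
        return it == entries_.end() ? nullptr : &it->second;
    }

private:
    void OnAdd(const std::string &key, std::shared_ptr<Node> node) {
        Entry entry;
        entry.node = std::move(node);
        entries_.emplace(key, std::move(entry));
    }

    void OnChange(Entry &entry, std::shared_ptr<Node> node) {
        entry.delete_marked = false;
        // Without this refresh the entry keeps pointing at the node it was
        // given in OnAdd, which is the stale-reference case.
        if (refresh_on_change_) entry.node = std::move(node);
    }

    bool refresh_on_change_;
    std::map<std::string, Entry> entries_;
};

// Runs add, delete-mark, re-add and reports whether the entry ends up
// referring to the node supplied by the re-add.
bool EntryTracksLatestNode(bool refresh_on_change) {
    Table table(refresh_on_change);
    auto first = std::make_shared<Node>("si-1");
    table.AddOrChange("si-1", first);
    table.MarkDeleted("si-1");
    auto second = std::make_shared<Node>("si-1");
    table.AddOrChange("si-1", second);
    const Entry *entry = table.Find("si-1");
    return entry != nullptr && entry->node == second;
}
```

Here EntryTracksLatestNode(false) models the behaviour before the fix (the entry still refers to the first node), and EntryTracksLatestNode(true) models setting the node in the change path as well. Because the sketch uses shared ownership, the stale reference is merely outdated rather than dangling; with a raw pointer to a freed node, the same sequence could fault on the next dereference.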

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/13171
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/13171
Committed: http://github.org/Juniper/contrail-controller/commit/1ca68b2fe083f72bceaf2f021ef1e837b3db62be
Submitter: Zuul
Branch: master

commit 1ca68b2fe083f72bceaf2f021ef1e837b3db62be
Author: Divakar <email address hidden>
Date: Wed Jul 29 21:42:06 2015 +0530

Handling service-instance reuse

When a neutron router is added and deleted continuously in a loop, the
service instance object in the Agent is not handled properly. When a
service instance is delete-marked and a re-add arrives, the DB table
invokes OnChange on the DBEntry rather than Add. The service instance
code does not handle this case: IFMapNode is set in the DBEntry only as
part of Add. This leaves a stale IFMapNode entry in the DBEntry and
leads to a crash when the IFMap node graph is invoked as part of OnChange.

As a fix, IFMapNode is now set in the DBEntry in OnChange as well. Since
service_instance takes an intrusive pointer to IFMapNode, the redundant
node_ member is removed from the object.

Also, service_instance_test has been moved out of the flaky tests.

closes-bug: #1473597
closes-bug: #1474273
partial-bug: #1465413

Conflicts:

 src/vnsw/agent/oper/service_instance.h

Change-Id: I56880a0d2b82f310c351ba292b7a54098b6ea880

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22-dev

Review in progress for https://review.opencontrail.org/13927
Submitter: Vinay Vithal Mahuli (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20.x

Review in progress for https://review.opencontrail.org/14249
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/14249
Committed: http://github.org/Juniper/contrail-controller/commit/9ff444b32301db684a149fd603a645f9f1b5e1af
Submitter: Zuul
Branch: R2.20.x

commit 9ff444b32301db684a149fd603a645f9f1b5e1af
Author: Divakar <email address hidden>
Date: Wed Jul 29 21:42:06 2015 +0530

Handling service-instance reuse

When a neutron router is added and deleted continuously in a loop, the
service instance object in the Agent is not handled properly. When a
service instance is delete-marked and a re-add arrives, the DB table
invokes OnChange on the DBEntry rather than Add. The service instance
code does not handle this case: IFMapNode is set in the DBEntry only as
part of Add. This leaves a stale IFMapNode entry in the DBEntry and
leads to a crash when the IFMap node graph is invoked as part of OnChange.

As a fix, IFMapNode is now set in the DBEntry in OnChange as well. Since
service_instance takes an intrusive pointer to IFMapNode, the redundant
node_ member is removed from the object.

Also, service_instance_test has been moved out of the flaky tests.

Change-Id: Ifc84091e23c5bc1a367d852574db286751b041f0
closes-bug: #1473597
closes-bug: #1474273
partial-bug: #1465413
