vrouter core on build119 of R2.23 on centos71

Bug #1535040 reported by Sudheendra Rao
This bug affects 3 people
Affects              Status          Importance   Assigned to   Milestone
Juniper Openstack    Invalid         High         Naveen N
R2.20                Fix Committed   High         Naveen N
R2.22.x              Fix Committed   High         Naveen N

Bug Description

vrouter core observed on R2.23 build119 on centos71 during sanity run.
The core location is:
/cs-shared/test_runs/nodeb8/2016_01_08_03_57_22

The backtrace of the core is:
(gdb) bt
#0 0x00002b8543769784 in std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) () from /lib64/libstdc++.so.6
#1 0x00000000016721e7 in std::_Rb_tree<FlowKey, std::pair<FlowKey const, FlowEntry*>, std::_Select1st<std::pair<FlowKey const, FlowEntry*> >, Inet4FlowKeyCmp, std::allocator<std::pair<FlowKey const, FlowEntry*> > >::_M_erase_aux(std::_Rb_tree_const_iterator<std::pair<FlowKey const, FlowEntry*> >) ()
#2 0x00000000016718a8 in std::_Rb_tree<FlowKey, std::pair<FlowKey const, FlowEntry*>, std::_Select1st<std::pair<FlowKey const, FlowEntry*> >, Inet4FlowKeyCmp, std::allocator<std::pair<FlowKey const, FlowEntry*> > >::erase(std::_Rb_tree_iterator<std::pair<FlowKey const, FlowEntry*> >) ()
#3 0x0000000001670d09 in std::map<FlowKey, FlowEntry*, Inet4FlowKeyCmp, std::allocator<std::pair<FlowKey const, FlowEntry*> > >::erase(std::_Rb_tree_iterator<std::pair<FlowKey const, FlowEntry*> >) ()
#4 0x000000000166ec83 in intrusive_ptr_release(FlowEntry*) ()
#5 0x0000000001670c6f in boost::intrusive_ptr<FlowEntry>::~intrusive_ptr() ()
#6 0x0000000001677054 in FlowExportReq::~FlowExportReq() ()
#7 0x000000000167cd43 in void boost::checked_delete<FlowExportReq>(FlowExportReq*) ()
#8 0x000000000167f65a in boost::detail::sp_counted_impl_p<FlowExportReq>::dispose() ()
#9 0x00000000011f736a in boost::detail::sp_counted_base::release() ()
#10 0x00000000011f742d in boost::detail::shared_count::~shared_count() ()
#11 0x00000000016726c4 in boost::shared_ptr<FlowExportReq>::~shared_ptr() ()
#12 0x0000000001674965 in boost::shared_ptr<FlowExportReq>::operator=(boost::shared_ptr<FlowExportReq> const&) ()
#13 0x0000000001674380 in tbb::strict_ppl::internal::micro_queue<boost::shared_ptr<FlowExportReq> >::assign_and_destroy_item(void*, tbb::strict_ppl::internal::concurrent_queue_rep_base::page&, unsigned long) ()
#14 0x0000000001673cb6 in tbb::strict_ppl::internal::micro_queue<boost::shared_ptr<FlowExportReq> >::pop(void*, unsigned long, tbb::strict_ppl::internal::concurrent_queue_base_v3<boost::shared_ptr<FlowExportReq> >&) ()
#15 0x000000000167347e in tbb::strict_ppl::internal::concurrent_queue_base_v3<boost::shared_ptr<FlowExportReq> >::internal_try_pop(void*) ()
#16 0x00000000016807e5 in tbb::strict_ppl::concurrent_queue<boost::shared_ptr<FlowExportReq>, tbb::cache_aligned_allocator<boost::shared_ptr<FlowExportReq> > >::try_pop(boost::shared_ptr<FlowExportReq>&) ()
#17 0x00000000016803a9 in WorkQueue<boost::shared_ptr<FlowExportReq> >::DequeueInternal(boost::shared_ptr<FlowExportReq>*) ()
#18 0x000000000167ff86 in WorkQueue<boost::shared_ptr<FlowExportReq> >::Dequeue(boost::shared_ptr<FlowExportReq>*) ()
#19 0x000000000167facf in QueueTaskRunner<boost::shared_ptr<FlowExportReq>, WorkQueue<boost::shared_ptr<FlowExportReq> > >::RunQueue() ()
#20 0x000000000167f63c in QueueTaskRunner<boost::shared_ptr<FlowExportReq>, WorkQueue<boost::shared_ptr<FlowExportReq> > >::Run() ()
#21 0x0000000001d7c3d2 in TaskImpl::execute() ()
#22 0x00002b85434e096a in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) () from /lib64/libtbb.so.2
#23 0x00002b85434dc5a6 in tbb::internal::arena::process(tbb::internal::generic_scheduler&) () from /lib64/libtbb.so.2
#24 0x00002b85434dbc6b in tbb::internal::market::process(rml::job&) () from /lib64/libtbb.so.2
#25 0x00002b85434d965f in tbb::internal::rml::private_worker::run() () from /lib64/libtbb.so.2
#26 0x00002b85434d9859 in tbb::internal::rml::private_worker::thread_routine(void*) () from /lib64/libtbb.so.2
#27 0x00002b85432aadf5 in start_thread () from /lib64/libpthread.so.0
#28 0x00002b85440091ad in clone () from /lib64/libc.so.6
(gdb)

Tags: vrouter
Prabhjot Singh Sethi (prabhjot) wrote :

The issue is caused by two threads manipulating the flow entry tree in parallel, which corrupts the tree.

The FlowExportReq destructor, running in the stats collector's thread context, can try to delete a flow entry from the table while the flow handler is adding or deleting entries in the same table.
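
A minimal sketch of the failure mode and one way to close it, assuming a plain std::mutex as the exclusion mechanism. The names FlowTable, Add, Release and RunRaceDemo are hypothetical stand-ins, not the agent's real classes, and the agent's actual fix may use a different mechanism. The point is only that the erase triggered by dropping the last reference and the insert done by the flow handler must not touch the std::map at the same time.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a flow entry with an intrusive reference count.
struct FlowEntry {
    explicit FlowEntry(int k) : key(k), refcount(0) {}
    int key;
    int refcount;  // only touched while FlowTable's mutex is held
};

// Hypothetical flow table: every access to the tree goes through one mutex.
class FlowTable {
public:
    ~FlowTable() {
        for (auto &kv : tree_) delete kv.second;
    }

    // Flow handler context: insert (or find) an entry and take a reference.
    FlowEntry *Add(int key) {
        std::lock_guard<std::mutex> lock(mutex_);
        FlowEntry *&slot = tree_[key];
        if (slot == nullptr) slot = new FlowEntry(key);
        ++slot->refcount;
        return slot;
    }

    // Any context (for example the stats collector dropping an export
    // request): release a reference; the last release erases from the tree.
    void Release(FlowEntry *entry) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--entry->refcount == 0) {
            tree_.erase(entry->key);
            delete entry;
        }
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return tree_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<int, FlowEntry *> tree_;
};

// One thread releases n existing flows while another adds n new ones.
// Without the mutex the two threads would rebalance the same red-black
// tree concurrently, which is undefined behaviour.
std::size_t RunRaceDemo(int n) {
    FlowTable table;
    std::vector<FlowEntry *> exported;
    for (int i = 0; i < n; ++i) exported.push_back(table.Add(i));

    std::thread collector([&] {
        for (FlowEntry *e : exported) table.Release(e);
    });
    std::thread handler([&] {
        for (int i = n; i < 2 * n; ++i) table.Add(i);
    });
    collector.join();
    handler.join();
    return table.Size();  // only the n flows added by the handler remain
}
```

Calling RunRaceDemo(1000) should leave exactly the 1000 entries added by the handler thread. Removing the lock_guard lines reproduces the kind of parallel erase and insert that the backtrace above ends in.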

Changed in juniperopenstack:
assignee: Prabhjot Singh Sethi (prabhjot) → nobody
assignee: nobody → Naveen N (naveenn)
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/16472
Submitter: Naveen N (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16472
Committed: http://github.org/Juniper/contrail-controller/commit/219c037222cb243ea4e278f031eb51826b657f1e
Submitter: Zuul
Branch: R2.20

commit 219c037222cb243ea4e278f031eb51826b657f1e
Author: Naveen N <email address hidden>
Date: Mon Jan 25 11:08:18 2016 +0530

* Add exclusion between flow table and flow stats collector
The reference to a flow entry can be released by the flow stats collector,
resulting in the flow being deleted from the flow tree and in parallel
modification of the flow tree from the flow table and flow stats collector
contexts. Fixing the same.
Closes-bug:#1535040

Change-Id: I8e7c18aaacbe1ed16639917dc51480af55b2da86

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/16686
Submitter: Vinay Vithal Mahuli (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16686
Committed: http://github.org/Juniper/contrail-controller/commit/156ad0b760f9b532572116d813d7afa695555bea
Submitter: Zuul
Branch: R2.22.x

commit 156ad0b760f9b532572116d813d7afa695555bea
Author: Atul Moghe <email address hidden>
Date: Mon Dec 21 14:29:14 2015 -0800

Cherry pick controller commits from R2.20 to R2.22.x
updating version.info from 2.22 to 2.23 in 2.20 branch
Closes-Bug:#1528370

Change-Id: Ic649422979a926cc5f5b8457c01610b848dc206b

Storage stats daemon fix

Partial-Bug: #1528327
Fixed latency monitor code based on the Ceph 0.94.3 version.
Fixed issues in OSD throughput/IOPs calculation.
Updated code based on the latest Sandesh apis.

Change-Id: I12caf951f84c8b213b1b5ec01371bb68b4c48cb3

Fix contrail-collector back pressure mechanism

The contrail-collector DB queue back pressure mechanism was not
working because the DB drop level is initialized to INVALID and
the watermark levels are also INVALID, so the defer/undefer
callbacks are never called.

Change-Id: Ib28141a69aeed3c4ad6f50abbaed2a285e3e7db2
Partial-Bug: #1528380

Fix Agent crash for flow index tree management

Issue:
------
During a flow index change vrouter-agent triggers a delete
on index tree using new flow handle instead of currently
held flow_handle resulting in flow entry getting associated
to two slots in the flow index tree, which further on flow
entry delete due to aging or eviction never releases the
slot for old flow handle, causing failures for further
insertions in the flow index tree

Fix:
----
Avoid taking flow handle as argument to DeleteByIndex and
use the currently associated flow_handle to remove from tree
Adding assert in DeleteByIndex to catch delete failure
Avoid doing delete from index tree in code paths other than
flow entry index update or flow entry delete.

Add logic for KSync Sock User to Mock vrouter behavior
returning index for an entry if it is already allocated
instead of allocating a new one.

Closes-Bug: 1527425
Change-Id: I10e77fb59650acfdd924a5f1d35d6b8dea03a3f0
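
The index tree fix described above can be sketched roughly as follows. DeleteByIndex and flow_handle are named in the commit message; FlowIndexTree, UpdateIndex, Has, Size and kInvalidHandle are hypothetical names used only for illustration, and the real agent code will differ. The sketch shows why dropping the handle argument helps: the delete always uses the handle the entry currently holds, so an index change cannot leave the old slot occupied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sentinel meaning "no slot assigned yet".
constexpr std::uint32_t kInvalidHandle = 0xFFFFFFFFu;

// Hypothetical stand-in for a flow entry; only the handle matters here.
struct FlowEntry {
    std::uint32_t flow_handle = kInvalidHandle;
};

class FlowIndexTree {
public:
    // Takes no handle argument: the slot to free is always the one the
    // entry currently occupies, so a caller cannot pass the new handle.
    void DeleteByIndex(FlowEntry *flow) {
        if (flow->flow_handle == kInvalidHandle) return;
        std::size_t erased = tree_.erase(flow->flow_handle);
        assert(erased == 1);  // catch a delete that did not find the slot
        (void)erased;
        flow->flow_handle = kInvalidHandle;
    }

    // Index change: free the old slot first, then occupy the new one.
    void UpdateIndex(FlowEntry *flow, std::uint32_t new_handle) {
        DeleteByIndex(flow);
        flow->flow_handle = new_handle;
        tree_[new_handle] = flow;
    }

    bool Has(std::uint32_t handle) const { return tree_.count(handle) != 0; }
    std::size_t Size() const { return tree_.size(); }

private:
    std::map<std::uint32_t, FlowEntry *> tree_;
};
```

With this shape, moving one entry from handle 10 to handle 20 leaves a single slot in the tree, whereas deleting by the new handle (20) before the move would have left slot 10 permanently occupied.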

Fix discovery dependency issue. Originally made in master branch
via https://review.opencontrail.org/#/c/15749

Change-Id: I5d874de3714074c66fa73bfd7c9119772dc681fd
Partial-Bug: #1530186

Avoid calling get_routing_instances on VN object

Calling get_routing_instances could trigger another read of the VN
if the VN has no routing instance. This is not only inefficient, but
could also cause exception if the VN has disappeared. We can avoid
this by calling getattr.

Change-Id: Ie5500585b9e6c578576276c2c04ec03f32c75112
Partial-Bug: 1528950

Fix Centos 65 agent compilation issues.
Closes-Bug: #1532159

Change-Id: Ia8b77619c80737000d5bd949534c9e0a16967359

Closes-Bug: #1524063, contrail-status is showing contrail-web-ui even if it is not configured, in case of SMLite

Change-Id: I55afc19140b1ce52b3b529a644124705de5ce6a8

Fix a corner case with routing instance delete

Sequence of events that causes the crash
1. Static route config deleted
2. Static Route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked routing instance is deleted

Resolve trigger calls ProcessStaticRouteConfi...


Hari Prasad Killi (haripk) wrote :

Issue not relevant to mainline

Changed in juniperopenstack:
status: New → Invalid