vrouter crash at FlowMgmtManager::EnqueueUveAddEvent

Bug #1641833 reported by Shashikiran H
This bug affects 6 people
Affects             Status                    Importance  Assigned to  Milestone
Juniper Openstack   Status tracked in Trunk
  R3.0              Fix Committed             High        Ashok Singh
  R3.0.3.x          Fix Committed             High        Ashok Singh
  R3.1              Fix Committed             High        Ashok Singh
  R3.2              Fix Committed             High        Ashok Singh
  Trunk             Fix Committed             High        Ashok Singh

Bug Description

Version: 3.2.0.0-3~mitaka

I have a mirroring config in place. I sent traffic, stopped it, and then saw this crash. It is not reproducible.

root@nodeg8:~# gdb /usr/bin/contrail-vrouter-agent /var/crashes/core.contrail-vroute.1819.nodeg8.1479186370
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/contrail-vrouter-agent...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 1910]
[New LWP 1819]
[New LWP 1906]
[New LWP 7345]
[New LWP 1904]
[New LWP 1905]
[New LWP 1907]
[New LWP 1908]
[New LWP 1909]
[New LWP 1903]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:100
100 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:100
#1 0x00007f4ab22cae30 in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007f4ab22cb48c in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x0000000000cdb9c0 in FlowMgmtManager::EnqueueUveAddEvent(FlowEntry const*) const ()
#4 0x0000000000cde118 in FlowMgmtManager::RequestHandler(boost::shared_ptr<FlowMgmtRequest>) ()
#5 0x0000000000ce056c in boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<bool, boost::_mfi::mf1<bool, FlowMgmtManager, boost::shared_ptr<FlowMgmtRequest> >, boost::_bi::list2<boost::_bi::value<FlowMgmtManager*>, boost::arg<1> > >, bool, boost::shared_ptr<FlowMgmtRequest> >::invoke(boost::detail::function::function_buffer&, boost::shared_ptr<FlowMgmtRequest>) ()
#6 0x0000000000ce6144 in QueueTaskRunner<boost::shared_ptr<FlowMgmtRequest>, WorkQueue<boost::shared_ptr<FlowMgmtRequest> > >::RunQueue() ()
#7 0x00000000012cd72f in TaskImpl::execute() ()
#8 0x00007f4ab2534b3a in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f4ab2530816 in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f4ab252ff4b in ?? () from /usr/lib/libtbb.so.2
#11 0x00007f4ab252c0ff in ?? () from /usr/lib/libtbb.so.2
#12 0x00007f4ab252c2f9 in ?? () from /usr/lib/libtbb.so.2
#13 0x00007f4ab2750182 in start_thread (arg=0x7f4aa91c1700) at pthread_create.c:312
#14 0x00007f4ab1a2947d in setfsuid () at ../sysdeps/unix/syscall-template.S:81
#15 0x0000000000000000 in ?? ()

Tags: vrouter
Revision history for this message
Shashikiran H (skiranh) wrote :

A new core with similar bt
(gdb) bt
#0 0x00007ff67a1eec37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ff67a1f2028 in __GI_abort () at abort.c:89
#2 0x00007ff67a22b2a4 in __libc_message (do_abort=1, fmt=fmt@entry=0x7ff67a3396b0 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007ff67a238e26 in malloc_printerr (ptr=0x7ff654265d30, str=0x7ff67a335882 "malloc(): memory corruption",
    action=<optimized out>) at malloc.c:4996
#4 _int_malloc (av=0x7ff654000020, bytes=96) at malloc.c:3447
#5 0x00007ff67a23a6c0 in __GI___libc_malloc (bytes=96) at malloc.c:2891
#6 0x00007ff67aaf7dad in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x0000000000d15839 in WorkQueue<boost::shared_ptr<FlowAceStatsRequest> >::MayBeStartRunner() ()
#8 0x0000000000d15f30 in WorkQueue<boost::shared_ptr<FlowAceStatsRequest> >::Enqueue(boost::shared_ptr<FlowAceStatsRequest>) ()
#9 0x0000000000d1389a in StatsManager::EnqueueEvent(boost::shared_ptr<FlowAceStatsRequest> const&) ()
#10 0x0000000000cdba35 in FlowMgmtManager::EnqueueUveAddEvent(FlowEntry const*) const ()
#11 0x0000000000cde118 in FlowMgmtManager::RequestHandler(boost::shared_ptr<FlowMgmtRequest>) ()
#12 0x0000000000ce056c in boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<bool, boost::_mfi::mf1<bool, FlowMgmtManager, boost::shared_ptr<FlowMgmtRequest> >, boost::_bi::list2<boost::_bi::value<FlowMgmtManager*>, boost::arg<1> > >, bool, boost::shared_ptr<FlowMgmtRequest> >::invoke(boost::detail::function::function_buffer&, boost::shared_ptr<FlowMgmtRequest>) ()
#13 0x0000000000ce6144 in QueueTaskRunner<boost::shared_ptr<FlowMgmtRequest>, WorkQueue<boost::shared_ptr<FlowMgmtRequest> > >::RunQueue() ()
#14 0x00000000012cd72f in TaskImpl::execute() ()
#15 0x00007ff67adbdb3a in ?? () from /usr/lib/libtbb.so.2
#16 0x00007ff67adb9816 in ?? () from /usr/lib/libtbb.so.2
#17 0x00007ff67adb8f4b in ?? () from /usr/lib/libtbb.so.2
#18 0x00007ff67adb50ff in ?? () from /usr/lib/libtbb.so.2
#19 0x00007ff67adb52f9 in ?? () from /usr/lib/libtbb.so.2
#20 0x00007ff67afd9184 in start_thread (arg=0x7ff67224c700) at pthread_create.c:312
#21 0x00007ff67a2b237d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Revision history for this message
Hari Prasad Killi (haripk) wrote :

Memory corruption is seen in the cores. We couldn't recreate the crash, and testing with valgrind hasn't yielded any clue so far. Continuing valgrind tests.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/26559
Committed: http://github.org/Juniper/contrail-controller/commit/3def48da306377d9a4e60d2d24dd2d7b99874d19
Submitter: Zuul (<email address hidden>)
Branch: master

commit 3def48da306377d9a4e60d2d24dd2d7b99874d19
Author: ashoksingh <email address hidden>
Date: Tue Nov 29 17:31:49 2016 +0530

Fix valgrind errors for agent

Fixed following errors
1. Use of delete instead of delete[]
2. Use of unitialized variables in couple of places.

Change-Id: I92025bd395cc7a07c789bd8efc63305c3bdf408b
Partial-Bug: #1641833
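
The two error classes named in this commit message can be illustrated with a minimal sketch. It assumes C++11, and every name in it (SampleStats, CopyAndMeasure, CopyAndMeasureOwned) is hypothetical; it is not the agent code touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical record type (not from the agent). Default member
// initializers ensure no field is ever read while uninitialized.
struct SampleStats {
    std::size_t bytes = 0;
    std::size_t packets = 0;
};

// Memory obtained with new[] must be released with delete[].
// Releasing it with plain delete is undefined behavior and is the kind
// of mismatch valgrind reports.
std::size_t CopyAndMeasure(const char *src) {
    assert(src != nullptr);
    std::size_t len = std::strlen(src);
    char *buf = new char[len + 1];
    std::memcpy(buf, src, len + 1);   // copies the terminating NUL too
    std::size_t copied = std::strlen(buf);
    delete[] buf;                     // "delete buf;" would be the bug
    return copied;
}

// Alternative: std::unique_ptr<char[]> selects delete[] automatically,
// so the mismatch cannot be written by accident.
std::size_t CopyAndMeasureOwned(const char *src) {
    assert(src != nullptr);
    std::size_t len = std::strlen(src);
    std::unique_ptr<char[]> buf(new char[len + 1]);
    std::memcpy(buf.get(), src, len + 1);
    return std::strlen(buf.get());
}
```

Both functions return the length of the copied string; the second form is usually preferable because ownership and the matching deallocation are expressed in the type.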

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/26612
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/26612
Committed: http://github.org/Juniper/contrail-controller/commit/edb3fb3e03112417c4d5bb9be054cdc33e35cb98
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit edb3fb3e03112417c4d5bb9be054cdc33e35cb98
Author: ashoksingh <email address hidden>
Date: Tue Nov 29 17:31:49 2016 +0530

Fix valgrind errors for agent

Fixed following errors
1. Use of delete instead of delete[]
2. Use of unitialized variables in couple of places.

Partial-Bug: #1641833
(cherry picked from commit 3def48da306377d9a4e60d2d24dd2d7b99874d19)

Change-Id: I8edac4497a673f9d923a47bfb94fb7ff66bb3d23

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/29472
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/29472
Committed: http://github.org/Juniper/contrail-controller/commit/006c5e6955305dcfda5c3a3b5687b39d171b13f1
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 006c5e6955305dcfda5c3a3b5687b39d171b13f1
Author: ashoksingh <email address hidden>
Date: Thu Mar 9 13:50:28 2017 +0530

Add missing locks in Agent UVE code

(1)Acquire lock in VnUveEntry::ClearInterVnStats to prevent parallel access to
inter_vn_stats_ between kTaskFlowStatsCollector and kTaskDBExclude
(2)Acquire lock in InterfaceUveStatsTable::FipEntry to prevent parallel access to
interface_tree_ between kTaskFlowStatsCollector and kTaskDBExclude

Change-Id: I66c0a77fb5947946e3ac449ea28452158e0aac80
Partial-Bug: #1641833
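
The locking pattern described in this commit message can be sketched as follows. This is an illustration under assumptions, not the actual change: InterVnStatsSketch and its methods are hypothetical, and std::mutex (C++11) stands in for whatever mutex type the agent really uses. The point is only that every accessor of the shared container, including the clearing path, takes the same lock.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for a UVE entry holding per-VN counters that
// two different tasks may touch.
class InterVnStatsSketch {
public:
    // Path used by the stats-collection side.
    void Update(const std::string &vn, std::uint64_t bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        stats_[vn] += bytes;
    }

    // Path used by the other task. Without this lock, clear() could
    // run while Update() is modifying the same map.
    void Clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        stats_.clear();
    }

    // Readers take the same lock and return a copy of the value.
    std::uint64_t Get(const std::string &vn) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = stats_.find(vn);
        return it == stats_.end() ? 0 : it->second;
    }

private:
    mutable std::mutex mutex_;   // guards stats_
    std::map<std::string, std::uint64_t> stats_;
};
```

In a real deployment Update() and Clear() would be called from different threads; the sketch keeps that out of the code so it stays small.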

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/29496
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/29497
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/29498
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0.3.x

Review in progress for https://review.opencontrail.org/29499
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/29497
Committed: http://github.org/Juniper/contrail-controller/commit/3bde5dda74a7c282f7c30bd431c21ea1719fe4e8
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 3bde5dda74a7c282f7c30bd431c21ea1719fe4e8
Author: ashoksingh <email address hidden>
Date: Thu Mar 9 13:50:28 2017 +0530

Add missing locks in Agent UVE code

(1)Acquire lock in VnUveEntry::ClearInterVnStats to prevent parallel access to
inter_vn_stats_ between kTaskFlowStatsCollector and kTaskDBExclude
(2)Acquire lock in InterfaceUveStatsTable::FipEntry to prevent parallel access to
interface_tree_ between kTaskFlowStatsCollector and kTaskDBExclude

Partial-Bug: #1641833
(cherry picked from commit 006c5e6955305dcfda5c3a3b5687b39d171b13f1)

Change-Id: Iabb51db41d37ac0a517241214573707ef910acb3

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/29496
Committed: http://github.org/Juniper/contrail-controller/commit/a24aa517f575482badd1b5bf56520c7e6975c7e0
Submitter: Zuul (<email address hidden>)
Branch: master

commit a24aa517f575482badd1b5bf56520c7e6975c7e0
Author: ashoksingh <email address hidden>
Date: Thu Mar 9 13:50:28 2017 +0530

Add missing locks in Agent UVE code

(1)Acquire lock in VnUveEntry::ClearInterVnStats to prevent parallel access to
inter_vn_stats_ between kTaskFlowStatsCollector and kTaskDBExclude
(2)Acquire lock in InterfaceUveStatsTable::FipEntry to prevent parallel access to
interface_tree_ between kTaskFlowStatsCollector and kTaskDBExclude

Partial-Bug: #1641833
(cherry picked from commit 006c5e6955305dcfda5c3a3b5687b39d171b13f1)

Change-Id: I5a319892421664c5be23d6a2dc228e6c94271d3a

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/29498
Committed: http://github.org/Juniper/contrail-controller/commit/c6b6d1f075d132838227ee54ae5f1a96fa9d754e
Submitter: Zuul (<email address hidden>)
Branch: R3.0

commit c6b6d1f075d132838227ee54ae5f1a96fa9d754e
Author: ashoksingh <email address hidden>
Date: Thu Mar 9 13:50:28 2017 +0530

Add missing locks in Agent UVE code

(1)Acquire lock in VnUveEntry::ClearInterVnStats to prevent parallel access to
inter_vn_stats_ between kTaskFlowStatsCollector and kTaskDBExclude
(2)Acquire lock in InterfaceUveStatsTable::FipEntry to prevent parallel access to
interface_tree_ between kTaskFlowStatsCollector and kTaskDBExclude

Partial-Bug: #1641833
(cherry picked from commit 006c5e6955305dcfda5c3a3b5687b39d171b13f1)

Change-Id: I0b6c57bbd3d66f561168ca1ebb4b4c23e9ffac4f

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/29499
Committed: http://github.org/Juniper/contrail-controller/commit/6fd4e4aca8a77354fec9b8d6ac6ed2d0fa3c6f70
Submitter: Zuul (<email address hidden>)
Branch: R3.0.3.x

commit 6fd4e4aca8a77354fec9b8d6ac6ed2d0fa3c6f70
Author: ashoksingh <email address hidden>
Date: Thu Mar 9 13:50:28 2017 +0530

Add missing locks in Agent UVE code

(1)Acquire lock in VnUveEntry::ClearInterVnStats to prevent parallel access to
inter_vn_stats_ between kTaskFlowStatsCollector and kTaskDBExclude
(2)Acquire lock in InterfaceUveStatsTable::FipEntry to prevent parallel access to
interface_tree_ between kTaskFlowStatsCollector and kTaskDBExclude

Partial-Bug: #1641833
(cherry picked from commit 006c5e6955305dcfda5c3a3b5687b39d171b13f1)

Change-Id: I750d98105daf5b86ac4d399901a6cfac7bb97c24

Revision history for this message
Shashikiran H (skiranh) wrote :

Not seeing this on 3.2.1.0-26. It was not reproducible in the first place, so I will close this bug and open a new one for any pending issues.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/29756
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/29757
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/29756
Committed: http://github.org/Juniper/contrail-controller/commit/28bd27756e12d115920c594e9338d6abb53b8216
Submitter: Zuul (<email address hidden>)
Branch: master

commit 28bd27756e12d115920c594e9338d6abb53b8216
Author: ashoksingh <email address hidden>
Date: Tue Mar 21 15:52:21 2017 +0530

Acquire flow locks in FlowMgmt task

Locks are required since we are accessing fields of flow in EnqueueUveAddEvent
and EnqueueUveDeleteEvent

Change-Id: I92e6af73f68c120c9b38e13dec861a5b3172a871
Partial-Bug: #1641833
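
A rough sketch of the idea in this commit message: copy the flow's fields while holding the flow's lock, then build the event from the copy. All types and field names below (FlowSketch, UveAddEventSketch, rule_uuid, policy_name) are hypothetical and std::mutex is an assumption; the real FlowEntry and its locking differ. The first backtrace in this report crashes inside a std::string copy, which is consistent with a string field being modified by another task while it is being read.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical flow object whose string fields may be rewritten by
// another task at any time.
struct FlowSketch {
    std::mutex mutex;          // guards the fields below
    std::string rule_uuid;
    std::string policy_name;
};

// Hypothetical event payload: owns independent copies of the fields.
struct UveAddEventSketch {
    std::string rule_uuid;
    std::string policy_name;
};

// Writer side: updates happen under the flow lock.
void UpdateFlow(FlowSketch &flow, const std::string &rule,
                const std::string &policy) {
    std::lock_guard<std::mutex> lock(flow.mutex);
    flow.rule_uuid = rule;
    flow.policy_name = policy;
}

// Reader side: the string copies are made under the same lock, so they
// cannot observe a half-updated string.
UveAddEventSketch SnapshotUnderLock(FlowSketch &flow) {
    std::lock_guard<std::mutex> lock(flow.mutex);
    UveAddEventSketch ev;
    ev.rule_uuid = flow.rule_uuid;
    ev.policy_name = flow.policy_name;
    return ev;
}
```

Once the snapshot is taken the lock is released, and the event can be enqueued without holding the flow lock any longer.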

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/29757
Committed: http://github.org/Juniper/contrail-controller/commit/f51bae2ad3f2f576884d6983214a82253300955e
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit f51bae2ad3f2f576884d6983214a82253300955e
Author: ashoksingh <email address hidden>
Date: Tue Mar 21 15:52:21 2017 +0530

Acquire flow locks in FlowMgmt task

Locks are required since we are accessing fields of flow in EnqueueUveAddEvent
and EnqueueUveDeleteEvent

Partial-Bug: #1641833
(cherry picked from commit 28bd27756e12d115920c594e9338d6abb53b8216)

Change-Id: I11c208167a0355cdd94b99170d43ce51d9c993e8

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/29826
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/29827
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
Ashok Singh (ashoksr) wrote :
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/29826
Committed: http://github.org/Juniper/contrail-controller/commit/e9ac5c8eb610028cf7a16c4caa07d4f82d4d5a61
Submitter: Zuul (<email address hidden>)
Branch: R3.0

commit e9ac5c8eb610028cf7a16c4caa07d4f82d4d5a61
Author: ashoksingh <email address hidden>
Date: Tue Mar 21 15:52:21 2017 +0530

Acquire flow locks in FlowMgmt task

Locks are required since we are accessing fields of flow in EnqueueUveAddEvent
and EnqueueUveDeleteEvent

Partial-Bug: #1641833
(cherry picked from commit 28bd27756e12d115920c594e9338d6abb53b8216)

Change-Id: I0bfabe464ba53d2a20ee360318fad41f79483e92

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/29827
Committed: http://github.org/Juniper/contrail-controller/commit/95ebf432ed21351859f862fc919a53d7365a01c4
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 95ebf432ed21351859f862fc919a53d7365a01c4
Author: ashoksingh <email address hidden>
Date: Tue Mar 21 15:52:21 2017 +0530

Acquire flow locks in FlowMgmt task

Locks are required since we are accessing fields of flow in EnqueueUveAddEvent
and EnqueueUveDeleteEvent

Partial-Bug: #1641833
(cherry picked from commit 28bd27756e12d115920c594e9338d6abb53b8216)

Change-Id: Ib914d9c107e5e7385f2b154aaf8f9b90531e0be3

Revision history for this message
Ashok Singh (ashoksr) wrote :

The fix is not required in the R3.0.3.x branch, as the required flow locks already exist. EnqueueUveAddEvent and EnqueueUveDeleteEvent are invoked from a task that acquires the locks before invoking these APIs.

Revision history for this message
Ashok Singh (ashoksr) wrote :

For the R3.0.3.x branch, flow locks are still required to guard the AddFlow API invoked from FlowMgmtManager::RequestHandler.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0.3.x

Review in progress for https://review.opencontrail.org/32259
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/32259
Committed: http://github.com/Juniper/contrail-controller/commit/99c5b070ada07eb78a4fc32b730fe0ae8670652f
Submitter: Zuul (<email address hidden>)
Branch: R3.0.3.x

commit 99c5b070ada07eb78a4fc32b730fe0ae8670652f
Author: ashoksingh <email address hidden>
Date: Tue May 30 13:33:24 2017 +0530

Acquire flow locks in FlowMgmt task

AddFlow API invoked from FlowMgmtManager::RequestHandler accesses fields of flow which is now done
under flow locks
Also, moved EnqueueUveAddEvent/EnqueueUveDeleteEvent from FlowMgmt instance 1 to instance 0 task
to make it consistent with changes in other branches.

Change-Id: Ic2ad37e27fb3d1a873dbe4eab5f08852dbe04870
Closes-Bug: #1641833
