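On a 5.0 DPDK-Mellanox cluster, the agent crashed in tbb::internal::IntelSchedulerTraits.
The backtrace is below; the abort comes from an assertion failure in NhDecode() (frame #4), running on a TBB worker thread.
(gdb) bt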
#0 0x00007f0e920731f7 in raise () from /lib64/libc.so.6
#1 0x00007f0e920748e8 in abort () from /lib64/libc.so.6
#2 0x00007f0e9206c266 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f0e9206c312 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000012d28d8 in NhDecode(Agent const*, NextHop const*, PktInfo const*, PktFlowInfo*, PktControlInfo*, PktControlInfo*, bool, EcmpLoadBalance const&) ()
#5 0x00000000012d420b in PktFlowInfo::EgressProcess(PktInfo const*, PktControlInfo*, PktControlInfo*) ()
#6 0x00000000012d5630 in PktFlowInfo::Process(PktInfo const*, PktControlInfo*, PktControlInfo*) ()
#7 0x00000000012e9005 in FlowHandler::Run() ()
#8 0x00000000012e2dad in Proto::RunProtoHandler(ProtoHandler*) ()
#9 0x00000000012c0c71 in FlowProto::FlowEventHandler(FlowEvent*, FlowTable*) ()
#10 0x00000000012e68dd in FlowEventQueueBase::Handler(FlowEvent*) ()
#11 0x00000000012c632f in QueueTaskRunner<FlowEvent*, WorkQueue<FlowEvent*> >::RunQueue() ()
#12 0x0000000000e8c82f in TaskImpl::execute() ()
#13 0x00007f0e92c428ca in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) () from /lib64/libtbb.so.2
#14 0x00007f0e92c3e5b6 in tbb::internal::arena::process(tbb::internal::generic_scheduler&) () from /lib64/libtbb.so.2
#15 0x00007f0e92c3dc8b in tbb::internal::market::process(rml::job&) () from /lib64/libtbb.so.2
#16 0x00007f0e92c3b67f in tbb::internal::rml::private_worker::run() () from /lib64/libtbb.so.2
#17 0x00007f0e92c3b879 in tbb::internal::rml::private_worker::thread_routine(void*) () from /lib64/libtbb.so.2
#18 0x00007f0e92e5de25 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f0e9213634d in clone () from /lib64/libc.so.6
(gdb) info threads
Id Target Id Frame
11 Thread 0x7f0e8a70e700 (LWP 17320) 0x00007f0e92e64a9b in recv () from /lib64/libpthread.so.0
10 Thread 0x7f0e8b311700 (LWP 17317) 0x00007f0e9211ae47 in sched_yield () from /lib64/libc.so.6
9 Thread 0x7f0e89c0d700 (LWP 20330) 0x00007f0e92c42252 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::receive_or_steal_task(long&, bool) () from /lib64/libtbb.so.2
8 Thread 0x7f0e8c315700 (LWP 17313) 0x00007f0e921307f9 in syscall () from /lib64/libc.so.6
7 Thread 0x7f0e8ab0f700 (LWP 17319) 0x00007f0e921307f9 in syscall () from /lib64/libc.so.6
6 Thread 0x7f0e8bf14700 (LWP 17314) 0x00007f0e92e6470d in read () from /lib64/libpthread.so.0
5 Thread 0x7f0e8b712700 (LWP 17315) 0x00007f0e9211ae47 in sched_yield () from /lib64/libc.so.6
4 Thread 0x7f0e8980c700 (LWP 14006) 0x00007f0e9211ae47 in sched_yield () from /lib64/libc.so.6
3 Thread 0x7f0e8af10700 (LWP 17318) 0x00007f0e9211ae47 in sched_yield () from /lib64/libc.so.6
2 Thread 0x7f0e9587b8c0 (LWP 17274) 0x00007f0e92136923 in epoll_wait () from /lib64/libc.so.6
* 1 Thread 0x7f0e8bb13700 (LWP 17316) 0x00007f0e920731f7 in raise () from /lib64/libc.so.6
(gdb)