Contrail-control crash @ StaticRouteMgr<StaticRouteInet>::ProcessStaticRouteConfig

Bug #1533435 reported by Jeba Paulaiyan
Affects              Status           Importance   Assigned to
Juniper Openstack    (status tracked in Trunk)
  R2.20              Fix Committed    High         Prakash Bailkeri
  R2.21.x            Fix Committed    High         Prakash Bailkeri
  R2.22.x            Fix Committed    High         Prakash Bailkeri
  Trunk              Fix Committed    High         Prakash Bailkeri

Bug Description

In Juno Multi node Sanity, contrail-control crashed:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-control'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 StaticRouteMgr<StaticRouteInet>::ProcessStaticRouteConfig (this=0x7ff52c020d10) at controller/src/bgp/routing-instance/static_route.cc:741
741 controller/src/bgp/routing-instance/static_route.cc: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named 'libstdcxx'
(gdb) bt
#0 StaticRouteMgr<StaticRouteInet>::ProcessStaticRouteConfig (this=0x7ff52c020d10) at controller/src/bgp/routing-instance/static_route.cc:741
#1 0x000000000061770a in StaticRouteMgr<StaticRouteInet>::ResolvePendingStaticRouteConfig (this=<optimized out>)
    at controller/src/bgp/routing-instance/static_route.cc:751
#2 0x0000000000bf2a7f in operator() (this=<optimized out>) at /usr/include/boost/function/function_template.hpp:767
#3 TaskTrigger::WorkerTask::Run (this=0x7ff54802f720) at controller/src/base/task_trigger.cc:19
#4 0x0000000000beea3c in TaskImpl::execute (this=0x7ff598b33740) at controller/src/base/task.cc:253
#5 0x00007ff5a0102b3a in ?? () from /usr/lib/libtbb.so.2
#6 0x00007ff5a00fe816 in ?? () from /usr/lib/libtbb.so.2
#7 0x00007ff5a00fdf4b in ?? () from /usr/lib/libtbb.so.2
#8 0x00007ff5a00fa0ff in ?? () from /usr/lib/libtbb.so.2
#9 0x00007ff5a00fa2f9 in ?? () from /usr/lib/libtbb.so.2
#10 0x00007ff5a031e182 in start_thread (arg=0x7ff59679e700) at pthread_create.c:312
#11 0x00007ff59f3ef47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)

Jeba Paulaiyan (jebap) wrote :

Core file in 10.84.5.112:/cs-shared/bugs/1533435.1/nodei23_core.contrail-contro.3814.nodei23.1452632313.gz

Jeba Paulaiyan (jebap)
tags: added: sanity

Prakash Bailkeri (prakashmb) wrote :

Can you please check the file permissions on the core file? I am unable to download it.

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/16290
Submitter: Prakash Bailkeri (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/16291
Submitter: Prakash Bailkeri (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16290
Committed: http://github.org/Juniper/contrail-controller/commit/861864bee9578ce688cda1d59234ea6b755ef60f
Submitter: Zuul
Branch: R2.20

commit 861864bee9578ce688cda1d59234ea6b755ef60f
Author: Prakash M Bailkeri <email address hidden>
Date: Wed Jan 13 22:22:02 2016 -0800

Fix a corner case with routing instance delete

Sequence of events that causes the crash
1. Static route config deleted
2. Static route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked, routing instance is deleted

Resolve trigger calls ProcessStaticRouteConfig to apply any pending static
route config. ProcessStaticRouteConfig accesses the NULL config pointer of
the routing instance

Fix:
1. Check whether the routing instance is deleted in ProcessStaticRouteConfig
2. Reset the resolve_trigger_ in StaticRouteMgr destructor
3. Add API to disable resolve_trigger_ and Add UT to test delayed processing
of resolve_trigger_

Change-Id: Icb1b9bad340ccefc9fbab75188034ade79a6193a
Closes-bug: #1533435
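
The race and the guard described in the commit message above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the contrail-controller code: `InstanceConfig`, `RoutingInstance`, `DeferredTrigger` and `StaticRouteManager` are hypothetical stand-ins, and `Run()` plays the role of the task scheduler invoking the trigger callback at a later time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the contrail-controller classes.
struct InstanceConfig {
    std::vector<std::string> static_routes;
};

class RoutingInstance {
public:
    explicit RoutingInstance(const InstanceConfig *config)
        : config_(config), deleted_(false) {}
    // Deleting the instance drops its config pointer, as in the crash sequence.
    void MarkDeleted() { deleted_ = true; config_ = nullptr; }
    bool deleted() const { return deleted_; }
    const InstanceConfig *config() const { return config_; }
private:
    const InstanceConfig *config_;
    bool deleted_;
};

// Deferred callback: Set() only marks work as pending; Run() stands in for
// the scheduler invoking the callback later.
class DeferredTrigger {
public:
    explicit DeferredTrigger(std::function<void()> callback)
        : callback_(std::move(callback)), pending_(false), disabled_(false) {}
    void Set() { pending_ = true; }
    void Reset() { pending_ = false; }
    void set_disabled(bool disabled) { disabled_ = disabled; }
    // Returns true if the callback was invoked.
    bool Run() {
        if (!pending_ || disabled_) return false;
        pending_ = false;
        callback_();
        return true;
    }
private:
    std::function<void()> callback_;
    bool pending_;
    bool disabled_;
};

class StaticRouteManager {
public:
    explicit StaticRouteManager(RoutingInstance *instance)
        : instance_(instance),
          resolve_trigger_([this] { ProcessConfig(); }),
          applied_routes_(0) {
        assert(instance != nullptr);
    }
    // Fix item 2: drop any pending trigger when the manager goes away.
    ~StaticRouteManager() { resolve_trigger_.Reset(); }

    void NotifyConfigChange() { resolve_trigger_.Set(); }
    bool RunPending() { return resolve_trigger_.Run(); }
    // Fix item 3: lets a test hold the trigger back to reproduce the delay.
    void set_trigger_disabled(bool disabled) {
        resolve_trigger_.set_disabled(disabled);
    }
    std::size_t applied_routes() const { return applied_routes_; }

private:
    void ProcessConfig() {
        // Fix item 1: skip processing if the instance was deleted while the
        // trigger was pending; its config pointer is NULL at this point.
        if (instance_->deleted() || instance_->config() == nullptr) return;
        applied_routes_ = instance_->config()->static_routes.size();
    }

    RoutingInstance *instance_;
    DeferredTrigger resolve_trigger_;
    std::size_t applied_routes_;
};
```

Holding the trigger back with `set_trigger_disabled(true)`, marking the instance deleted, and then re-enabling and running the trigger reproduces the delayed-processing order from the commit message; with the guard in `ProcessConfig()` the callback returns without touching the NULL config.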

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/16291
Committed: http://github.org/Juniper/contrail-controller/commit/fa454518a372bde543f97093df60fd7025c8e4c8
Submitter: Zuul
Branch: master

commit fa454518a372bde543f97093df60fd7025c8e4c8
Author: Prakash M Bailkeri <email address hidden>
Date: Wed Jan 13 22:31:23 2016 -0800

Fix a corner case with routing instance delete

Sequence of events that causes the crash
1. Static route config deleted
2. Static route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked, routing instance is deleted

Resolve trigger calls ProcessStaticRouteConfig to apply any pending static
route config. ProcessStaticRouteConfig accesses the NULL config pointer of
the routing instance

Fix:
1. Check whether the routing instance is deleted in ProcessStaticRouteConfig
2. Reset the resolve_trigger_ in StaticRouteMgr destructor
3. Add API to disable resolve_trigger_ and Add UT to test delayed processing
of resolve_trigger_

Change-Id: I3c8a513f1af949aa419594a4214200de73dccaf8
Closes-bug: #1533435

Nischal Sheth (nsheth)
tags: added: service-chain

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/16686
Submitter: Vinay Vithal Mahuli (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16686
Committed: http://github.org/Juniper/contrail-controller/commit/156ad0b760f9b532572116d813d7afa695555bea
Submitter: Zuul
Branch: R2.22.x

commit 156ad0b760f9b532572116d813d7afa695555bea
Author: Atul Moghe <email address hidden>
Date: Mon Dec 21 14:29:14 2015 -0800

Cherry pick controller commits from R2.20 to R2.22.x
updating version.info from 2.22 to 2.23 in 2.20 branch
Closes-Bug:#1528370

Change-Id: Ic649422979a926cc5f5b8457c01610b848dc206b

Storage stats daemon fix

Partial-Bug: #1528327
Fixed latency monitor code based on the Ceph 0.94.3 version.
Fixed issues in OSD throughput/IOPs calculation.
Updated code based on the latest Sandesh apis.

Change-Id: I12caf951f84c8b213b1b5ec01371bb68b4c48cb3

Fix contrail-collector back pressure mechanism

The contrail-collector DB queue back pressure mechanism was not
working: the DB drop level is initialized to INVALID and the
watermark levels are also INVALID, so the defer/undefer
callbacks are never called.

Change-Id: Ib28141a69aeed3c4ad6f50abbaed2a285e3e7db2
Partial-Bug: #1528380

Fix Agent crash for flow index tree management

Issue:
------
During a flow index change, vrouter-agent triggers a delete
on the index tree using the new flow handle instead of the
currently held flow_handle. The flow entry ends up associated
with two slots in the flow index tree. When the flow entry is
later deleted due to aging or eviction, the slot for the old
flow handle is never released, causing failures for further
insertions in the flow index tree.

Fix:
----
Avoid taking the flow handle as an argument to DeleteByIndex and
use the currently associated flow_handle to remove from the tree.
Add an assert in DeleteByIndex to catch delete failure.
Avoid doing delete from the index tree in code paths other than
flow entry index update or flow entry delete.

Add logic for KSync Sock User to Mock vrouter behavior
returning index for an entry if it is already allocated
instead of allocating a new one.

Closes-Bug: 1527425
Change-Id: I10e77fb59650acfdd924a5f1d35d6b8dea03a3f0
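
The flow index tree fix described above can be illustrated with a small sketch. It is a simplified model under assumptions, not the vrouter-agent code: `FlowEntry`, `FlowIndexTree`, `UpdateIndex`, `Contains` and `kInvalidHandle` are hypothetical names, and the tree is modeled as a plain `std::map`. The point is that `DeleteByIndex` takes no handle argument and always releases the slot of the handle the entry currently holds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical model for illustration; not the vrouter-agent classes.
static const std::uint32_t kInvalidHandle = 0xFFFFFFFFu;

struct FlowEntry {
    FlowEntry() : flow_handle(kInvalidHandle) {}
    std::uint32_t flow_handle;  // slot currently held in the index tree
};

class FlowIndexTree {
public:
    // Move the entry to new_handle, releasing the slot it holds today.
    void UpdateIndex(FlowEntry *flow, std::uint32_t new_handle) {
        if (flow->flow_handle == new_handle) return;
        DeleteByIndex(flow);
        bool inserted = slots_.insert(std::make_pair(new_handle, flow)).second;
        assert(inserted);  // the new slot must be free
        (void)inserted;
        flow->flow_handle = new_handle;
    }

    // No handle argument: the slot to release is always the one recorded in
    // the entry, so a caller cannot pass the wrong (new) handle by mistake.
    void DeleteByIndex(FlowEntry *flow) {
        if (flow->flow_handle == kInvalidHandle) return;
        std::map<std::uint32_t, FlowEntry *>::iterator it =
            slots_.find(flow->flow_handle);
        // Catch delete failures early instead of leaking a slot.
        assert(it != slots_.end() && it->second == flow);
        slots_.erase(it);
        flow->flow_handle = kInvalidHandle;
    }

    bool Contains(std::uint32_t handle) const {
        return slots_.find(handle) != slots_.end();
    }
    std::size_t size() const { return slots_.size(); }

private:
    std::map<std::uint32_t, FlowEntry *> slots_;
};
```

After an entry moves from one handle to another, only the new slot is occupied, and deleting the entry leaves the tree empty.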

Fix discovery dependency issue. Originally made in master branch
via https://review.opencontrail.org/#/c/15749

Change-Id: I5d874de3714074c66fa73bfd7c9119772dc681fd
Partial-Bug: #1530186

Avoid calling get_routing_instances on VN object

Calling get_routing_instances could trigger another read of the VN
if the VN has no routing instance. This is not only inefficient, but
could also cause exception if the VN has disappeared. We can avoid
this by calling getattr.

Change-Id: Ie5500585b9e6c578576276c2c04ec03f32c75112
Partial-Bug: 1528950

Fix Centos 65 agent compilation issues.
Closes-Bug: #1532159

Change-Id: Ia8b77619c80737000d5bd949534c9e0a16967359

Closes-Bug: #1524063, contrail-status is showing contrail-web-ui, even it is not configured, in case of SMLite

Change-Id: I55afc19140b1ce52b3b529a644124705de5ce6a8

Fix a corner case with routing instance delete

Sequence of events that causes the crash
1. Static route config deleted
2. Static route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked, routing instance is deleted

Resolve trigger calls ProcessStaticRouteConfi...


OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/17394
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17394
Committed: http://github.org/Juniper/contrail-controller/commit/e98fd9b77617346d759348528aa750758971a50a
Submitter: Zuul
Branch: R2.21.x

commit e98fd9b77617346d759348528aa750758971a50a
Author: Prakash M Bailkeri <email address hidden>
Date: Wed Jan 13 22:22:02 2016 -0800

Fix a corner case with routing instance delete

Sequence of events that causes the crash
1. Static route config deleted
2. Static route manager triggers resolve_trigger_ to re-evaluate static
route config
3. Before the resolve trigger is invoked, routing instance is deleted

Resolve trigger calls ProcessStaticRouteConfig to apply any pending static
route config. ProcessStaticRouteConfig accesses the NULL config pointer of
the routing instance

Fix:
1. Check whether the routing instance is deleted in ProcessStaticRouteConfig
2. Reset the resolve_trigger_ in StaticRouteMgr destructor
3. Add API to disable resolve_trigger_ and Add UT to test delayed processing
of resolve_trigger_

Change-Id: Icb1b9bad340ccefc9fbab75188034ade79a6193a
Closes-bug: #1533435
