contrail-controller : crash in ResolverPath::UpdateResolvedPaths

Bug #1622618 reported by vageesan
This bug affects 2 people
Affects                Status           Importance   Assigned to      Milestone
Juniper Openstack      Status tracked in Trunk
  R3.0                 Fix Committed    High         Nischal Sheth
  R3.1                 Fix Committed    High         Nischal Sheth
  R3.2                 Fix Committed    High         Nischal Sheth
  Trunk                Fix Committed    High         Nischal Sheth

Bug Description

contrail-control crashed with the following backtrace.

version: 3.1.1.0-32

(gdb) bt
#0 ResolverPath::UpdateResolvedPaths (this=this@entry=0x7f0ab41871f0)
    at controller/src/bgp/routing-instance/path_resolver.cc:991
#1 0x00000000005f8f6c in PathResolverPartition::ProcessResolverPathUpdateList (this=0x7f0ae00f6160)
    at controller/src/bgp/routing-instance/path_resolver.cc:739
#2 0x00000000006b44b7 in operator() (this=<optimized out>)
    at /usr/include/boost/function/function_template.hpp:767
#3 TaskTrigger::WorkerTask::Run (this=0x7f0b48037ea0) at controller/src/base/task_trigger.cc:22
#4 0x00000000006af19f in TaskImpl::execute (this=0x7f0b5a546a40) at controller/src/base/task.cc:262
#5 0x00007f0b61b66b3a in ?? () from /usr/lib/libtbb.so.2
#6 0x00007f0b61b62816 in ?? () from /usr/lib/libtbb.so.2
#7 0x00007f0b61b61f4b in ?? () from /usr/lib/libtbb.so.2
#8 0x00007f0b61b5e0ff in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f0b61b5e2f9 in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f0b61d82182 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007f0b60e5347d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb)

vageesan (vageesant) wrote :

The core file is in 10.84.5.31:/auto/cs-shared/bugs/1622618

Nischal Sheth (nsheth)
Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Nischal Sheth (nsheth)
Nischal Sheth (nsheth)
information type: Proprietary → Public
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/24804
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/24805
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/24814
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/24805
Committed: http://github.org/Juniper/contrail-controller/commit/2650bbce88faecc3b2385386d9a341ea47643dc8
Submitter: Zuul
Branch: R3.1

commit 2650bbce88faecc3b2385386d9a341ea47643dc8
Author: Nischal Sheth <email address hidden>
Date: Mon Oct 10 11:43:16 2016 -0700

Handle more than 1 level of resolution in PathResolver

The code didn't previously handle the scenario where a BgpRoute with
one or more resolved paths is itself used to resolve ResolverPaths.
This scenario caused a concurrency issue wherein one partition was
attempting to modify a BgpRoute (by adding/deleting resolved paths)
while another partition was trying to access the same BgpRoute (to
get the BgpPaths to use as nexthops).

Fix by introducing a read-write mutex in the ResolverRouteState to
serialize access to the corresponding BgpRoute. Take a write lock
on the ResolverRouteState for the BgpRoute being modified and a read
lock on the ResolverRouteState for the BgpRoute to the nexthop.

Change-Id: I485e197ee92d4ba2347d136fecaec29b0c068b65
Closes-Bug: 1622618
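
The locking scheme described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: Route, RouteState, UpdateResolvedPaths and RunTwoLevelDemo are hypothetical stand-ins for BgpRoute, ResolverRouteState and the resolver logic, and std::shared_mutex (C++17) stands in for whatever read-write mutex the actual fix uses. The sketch also assumes the read lock can be released before the write lock is taken, which the commit message does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BgpRoute: just a list of path names.
struct Route {
    std::vector<std::string> paths;
};

// Hypothetical stand-in for ResolverRouteState: pairs a route with the
// read-write mutex that serializes access to it.
struct RouteState {
    Route route;
    mutable std::shared_mutex rw_mutex;
};

// Recompute the resolved paths of `target` from the paths of `nexthop`.
// Read lock on the nexthop route, write lock on the route being modified.
void UpdateResolvedPaths(RouteState &target, const RouteState &nexthop) {
    std::vector<std::string> resolved;
    {
        // Readers may share the lock; a concurrent writer is excluded.
        std::shared_lock<std::shared_mutex> read_lock(nexthop.rw_mutex);
        for (const std::string &path : nexthop.route.paths)
            resolved.push_back("resolved-via-" + path);
    }
    // Exclusive access while the target's path list is replaced.
    std::unique_lock<std::shared_mutex> write_lock(target.rw_mutex);
    target.route.paths = std::move(resolved);
}

// Two levels of resolution on separate threads: b resolves over a while
// c resolves over b, so b is written by one thread and read by the other.
std::size_t RunTwoLevelDemo() {
    RouteState a, b, c;
    a.route.paths = {"p1", "p2"};
    b.route.paths = {"q1"};

    std::thread modify_b([&] {
        for (int i = 0; i < 1000; ++i) UpdateResolvedPaths(b, a);
    });
    std::thread read_b([&] {
        for (int i = 0; i < 1000; ++i) UpdateResolvedPaths(c, b);
    });
    modify_b.join();
    read_b.join();

    // After modify_b has finished, b always mirrors the two paths of a.
    assert(b.route.paths.size() == 2);
    return c.route.paths.size();
}
```

In this sketch the two locks are never held at the same time, so two partitions resolving over each other's routes cannot deadlock on lock ordering; the cost is that the nexthop's paths are copied before the target is updated.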

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/24814
Committed: http://github.org/Juniper/contrail-controller/commit/5bc8c070b7f7cbc6d9a85b247449c54c29686c81
Submitter: Zuul
Branch: R3.0

commit 5bc8c070b7f7cbc6d9a85b247449c54c29686c81
Author: Nischal Sheth <email address hidden>
Date: Mon Oct 10 11:43:16 2016 -0700


OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/25185
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/24804
Committed: http://github.org/Juniper/contrail-controller/commit/471b5c339c52c166eea9e4686810d14b75fcb426
Submitter: Zuul
Branch: master

commit 471b5c339c52c166eea9e4686810d14b75fcb426
Author: Nischal Sheth <email address hidden>
Date: Mon Oct 10 11:43:16 2016 -0700


OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/25185
Committed: http://github.org/Juniper/contrail-controller/commit/a6be0b0377e617caf2e24f59db57d7edc0ec28be
Submitter: Zuul
Branch: R3.2

commit a6be0b0377e617caf2e24f59db57d7edc0ec28be
Author: Nischal Sheth <email address hidden>
Date: Mon Oct 10 11:43:16 2016 -0700

