Instance launched in contrail is unreachable

Bug #1518501 reported by Vijay Anand
This bug affects 2 people
Affects            Milestone   Status          Importance   Assigned to
Juniper Openstack  R2.20       Fix Committed   High         Manish Singh
Juniper Openstack  R2.21.x     Fix Committed   High         Manish Singh
Juniper Openstack  Trunk       Fix Committed   High         Manish Singh
OpenContrail       Trunk       Fix Committed   High         Manish Singh
(Status tracked in Trunk for both projects.)

Bug Description

Build: 2.21 + patches (1507404/1464059 & 1507501)

Multi-node setup (1 Control + 1 Compute)

Test: 50 instances

Observation:
1. 10 instances are not reachable even though instance (vSRX) DHCP resolution succeeds (the vSRX interface has an IP).
2. OUT to IN: no traffic is seen on the tap interface.
3. No flow is created.
4. IN to OUT: traffic is seen on the tap interface and flows are created, but traffic doesn't reach the destination.

The problem is seen consistently and is blocking our CSO scale testing.

Tags: vrouter
Revision history for this message
Hari Prasad Killi (haripk) wrote :

MPLS labels weren't being freed, so new label allocations eventually exceeded the configured limit, causing the behavior above.
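
A minimal sketch of that failure mode, using hypothetical names (LabelIndexVector, Alloc, and Free are illustrative, not the agent's actual API): when the free path is skipped for one class of labels, the index vector gradually fills and new allocations fail once the configured limit is reached.

    #include <cstddef>
    #include <vector>

    class LabelIndexVector {
    public:
        explicit LabelIndexVector(std::size_t max_labels)
            : in_use_(max_labels, false) {}

        // Returns the first free index, or -1 once the range is exhausted.
        long Alloc() {
            for (std::size_t i = 0; i < in_use_.size(); ++i) {
                if (!in_use_[i]) {
                    in_use_[i] = true;
                    return static_cast<long>(i);
                }
            }
            return -1;  // no free labels left: the failure mode in this bug
        }

        // Must run on every label delete. Here, multicast EVPN labels never
        // reached this call, so their indices stayed marked in-use forever.
        void Free(std::size_t index) { in_use_[index] = false; }

    private:
        std::vector<bool> in_use_;
    };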

tags: added: vrouter
Changed in opencontrail:
importance: Undecided → High
assignee: nobody → Manish Singh (manishs)
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/15674
Submitter: Manish Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/15683
Submitter: Manish Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/15674
Committed: http://github.org/Juniper/contrail-controller/commit/8f60cc6f3c8f888ad4e11bd0f37d12376ee9578d
Submitter: Zuul
Branch: R2.20

commit 8f60cc6f3c8f888ad4e11bd0f37d12376ee9578d
Author: Manish <email address hidden>
Date: Tue Dec 8 22:59:03 2015 +0530

Mpls label from index vector not reused.

Problem:
Multicast EVPN labels are allocated from the regular unicast range. When
such a label is deleted, it is not reset in the index vector (the index
vector manages labels). As a result, freed labels were never reused by
new allocations.

Solution:
Multicast has two sets of labels. The first is the fabric replication
labels, which are reserved at init and remain for the life of the agent.
The second set, the EVPN labels mentioned above, is allocated from the
non-reserved range. Both are of type MCAST_NH, so when an EVPN label was
freed the destructor saw MCAST_NH and skipped de-allocation from the
index vector. The fix is to skip de-allocation only when a multicast
label falls within the fabric reserved range; in all other cases,
de-allocate.

Closes-bug: 1518501

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Change-Id: Idd584a84d30f768818de30bfad30a1d1d811447c
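
A sketch of the corrected release logic the commit message describes, under assumed names (MplsLabel, ReleaseLabel, and InFabricReservedRange are hypothetical stand-ins, not the agent's real types): the MCAST_NH check alone is no longer enough to skip de-allocation; the label must also sit in the fabric reserved range.

    #include <cstddef>
    #include <vector>

    enum LabelType { UNICAST_NH, MCAST_NH };

    struct MplsLabel {
        LabelType type;
        std::size_t index;
    };

    // True when the label index lies inside the fabric reserved range.
    bool InFabricReservedRange(std::size_t index,
                               std::size_t reserved_start,
                               std::size_t reserved_end) {
        return index >= reserved_start && index < reserved_end;
    }

    void ReleaseLabel(const MplsLabel &label,
                      std::vector<bool> &index_vector,
                      std::size_t reserved_start,
                      std::size_t reserved_end) {
        // Before the fix: any MCAST_NH label skipped the index-vector reset,
        // which wrongly caught EVPN labels allocated from the unicast range.
        // After the fix: only fabric replication labels, reserved at init for
        // the lifetime of the agent, are skipped.
        if (label.type == MCAST_NH &&
            InFabricReservedRange(label.index, reserved_start, reserved_end)) {
            return;
        }
        index_vector[label.index] = false;  // index becomes reusable
    }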

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/15683
Committed: http://github.org/Juniper/contrail-controller/commit/6389d9cd0106b8fa25c81c6fe373d5e27bfe8d6a
Submitter: Zuul
Branch: R2.21.x

commit 6389d9cd0106b8fa25c81c6fe373d5e27bfe8d6a
Author: Manish <email address hidden>
Date: Tue Dec 8 22:59:03 2015 +0530

Mpls label from index vector not reused.

Problem:
Multicast EVPN labels are allocated from the regular unicast range. When
such a label is deleted, it is not reset in the index vector (the index
vector manages labels). As a result, freed labels were never reused by
new allocations.

Solution:
Multicast has two sets of labels. The first is the fabric replication
labels, which are reserved at init and remain for the life of the agent.
The second set, the EVPN labels mentioned above, is allocated from the
non-reserved range. Both are of type MCAST_NH, so when an EVPN label was
freed the destructor saw MCAST_NH and skipped de-allocation from the
index vector. The fix is to skip de-allocation only when a multicast
label falls within the fabric reserved range; in all other cases,
de-allocate.

Closes-bug: 1518501

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Change-Id: Idd584a84d30f768818de30bfad30a1d1d811447c

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/16072
Submitter: Manish Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/16074
Submitter: Manish Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16074
Committed: http://github.org/Juniper/contrail-controller/commit/a56f464f0724811954e241fc33f07a0526a86311
Submitter: Zuul
Branch: master

commit a56f464f0724811954e241fc33f07a0526a86311
Author: Manish <email address hidden>
Date: Tue Dec 8 22:59:03 2015 +0530

Mpls label from index vector not reused.

Problem:
Multicast EVPN labels are allocated from the regular unicast range. When
such a label is deleted, it is not reset in the index vector (the index
vector manages labels). As a result, freed labels were never reused by
new allocations.

Solution:
Multicast has two sets of labels. The first is the fabric replication
labels, which are reserved at init and remain for the life of the agent.
The second set, the EVPN labels mentioned above, is allocated from the
non-reserved range. Both are of type MCAST_NH, so when an EVPN label was
freed the destructor saw MCAST_NH and skipped de-allocation from the
index vector. The fix is to skip de-allocation only when a multicast
label falls within the fabric reserved range; in all other cases,
de-allocate.

Closes-bug: 1518501

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Conflicts:
 src/vnsw/agent/cmn/agent.cc
 src/vnsw/agent/controller/controller_init.h

Change-Id: Idd584a84d30f768818de30bfad30a1d1d811447c
(cherry picked from commit 6389d9cd0106b8fa25c81c6fe373d5e27bfe8d6a)

Agent crash @Filltrace, AgentRouteTable::DeletePathFromPeer

Problem:
DeleteAllBgpPath deletes all paths from the CN. While deleting, some
paths internally delete dependent paths, invalidating the path iterator.
A similar issue was observed in bridge_route.cc:
https://bugs.launchpad.net/juniperopenstack/+bug/1508894

Solution:
Maintain a list of paths to be deleted; once the path list has been
traversed, go through this list and delete the paths.

Change-Id: I72eedd275ddf8d0c81c5aff10f384b5c45ce3696
Closes-bug: 1524140
(cherry picked from commit 7e51c5e5aa7534840350674fb0ff2b59ecb74bb6)
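
A sketch of the deferred-deletion pattern this fix describes (AgentPath and DeleteAllPeerPaths here are stand-ins, not the real agent types): the walk only collects candidates, and deletion happens in a second pass where cascaded deletes of dependent paths can no longer invalidate a live iterator.

    #include <list>

    struct AgentPath { /* payload omitted */ };

    void DeleteAllPeerPaths(std::list<AgentPath *> &path_list) {
        std::list<AgentPath *> to_be_deleted;

        // Pass 1: traverse without mutating the container.
        for (AgentPath *path : path_list) {
            to_be_deleted.push_back(path);
        }

        // Pass 2: delete outside the traversal, so cascaded deletions of
        // dependent paths cannot invalidate an iterator still in use.
        for (AgentPath *path : to_be_deleted) {
            path_list.remove(path);
            delete path;
        }
    }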

Agent crash@ invalid path

Problem:
DeleteAllBgpPeerPath creates a list of all paths to be deleted, which
used to include both stale and non-stale paths from the CN. It then
walks the list and deletes them. While a path is being deleted on behalf
of fabric_multicast_peer, the DeletePath routine also flushes out stale
paths, because it assumes that any non-BGP peer path delete means the
route is no longer valid and stale paths are not needed. When control
then returned to delete the remaining entries, it would find a stale
path pointer that, as explained above, had already been deleted via
fabric_multicast_peer.

Solution:
Don't push stale paths into the to_be_deleted_paths list. Once all paths
in to_be_deleted_paths have been deleted, explicitly squash any stale
paths that remain. In short, isolate the deletion of stale and non-stale
paths.

Change-Id: I2ae025077502713767ac22f64e338e7e7cb50ac6
Closes-bug: 1524140
(cherry picked from commit e8dfd5b94f3ef1c650964d04285e2e4bd02b15b7)

Conflicts:
 src/vnsw/agent/test/test_l2route.cc
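
A sketch of the stale-path isolation (Path and DeleteAllBgpPeerPaths are hypothetical stand-ins): stale paths never enter to_be_deleted_paths, since deleting a fabric_multicast_peer path can flush them as a side effect and leave a dangling pointer in the list; they are squashed in a separate, explicit pass instead.

    #include <list>

    struct Path {
        bool is_stale;
    };

    void DeleteAllBgpPeerPaths(std::list<Path *> &path_list) {
        std::list<Path *> to_be_deleted;

        // Collect only non-stale paths; stale paths may be flushed as a side
        // effect of the deletions below and must not appear in this list.
        for (Path *path : path_list) {
            if (!path->is_stale)
                to_be_deleted.push_back(path);
        }

        for (Path *path : to_be_deleted) {
            path_list.remove(path);
            delete path;
        }

        // Explicitly squash whatever stale paths remain, in a separate pass
        // that never interleaves with the non-stale deletions above.
        while (!path_list.empty()) {
            Path *stale = path_list.front();
            path_list.pop_front();
            delete stale;
        }
    }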

Agent crashed because of freed xmpp channel.

Problem:
When agent xmpp channel is deleted, event to bring down bgp peer associated with
same is is...

