Rework bgp membership manager to improve scalability

Bug #1577278 reported by Nischal Sheth
This bug affects 1 person
Affects              Status          Importance   Assigned to     Milestone
Juniper Openstack    Status tracked in Trunk
  Trunk              Fix Committed   Wishlist     Nischal Sheth

Bug Description

Existing implementation triggers a table walk for each (peer, table) join
or leave request. It also triggers a separate walk per (peer, table) when
all paths received from a peer need to be marked stale/deleted as part
of graceful restart.

This behavior is fine when a single peer comes up or goes down, but it's
sub-optimal when a bunch of peers go down or come up at roughly the same
time. This happens if multiple vrouters encounter the same problem and
crash or when the CN crashes and comes back up. In the latter case, we run
into the so-called thundering herd problem wherein all vrouters connect
to the CN at roughly the same time and then register to a large number of
common tables.

This causes a few problems:

1. The CN performs a large number of unnecessary table walks. These could potentially be combined into a much smaller number.

2. If there's a large number of peers and a large number of tables, the CN
ends up triggering a very large number of walks at roughly the same time.
This puts an unnecessary burden on the TaskScheduler since each table walk
results in the creation of multiple Tasks (one per partition).

3. Not only does 1) above cause redundant table walks, it also results in
redundant calls to BgpExport::Join/Leave and its callees. It would be ideal
to call the Join/Leave methods with a BitSet of peers to handle multiple
peers at once. Note that the Join/Leave methods already handle a BitSet.

4. Since Join/Leave processing is done for 1 peer at a time, we also end
up encoding each route update into a bgp/xmpp message for one or a few peers
at a time. It would be ideal to encode each route once and send it to all
interested peers, i.e. amortize the cost of encoding the update over many
peers.
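
As a rough illustration of problems 1 and 4, the toy model below counts
table walks and route encodes for the two approaches. It is only a sketch:
the function names, peer/table names and sizes are made up for the example,
and it assumes the idealized case where every peer in a table can share one
encoded update, which need not hold in the control node.

```python
from collections import defaultdict


def per_request_cost(requests, routes_per_table):
    """Model the existing behavior: one table walk per (peer, table) request."""
    walks = len(requests)
    # Each walk visits every route and encodes it for the single requesting peer.
    encodes = sum(routes_per_table[table] for _peer, table in requests)
    return walks, encodes


def batched_cost(requests, routes_per_table):
    """Model the batched behavior: one walk per table, covering all its peers."""
    peers_by_table = defaultdict(set)
    for peer, table in requests:
        peers_by_table[table].add(peer)
    walks = len(peers_by_table)
    # Idealized assumption: each route is encoded once and shared by all peers.
    encodes = sum(routes_per_table[table] for table in peers_by_table)
    return walks, encodes


# Hypothetical scale: 100 peers each joining the same 50 tables of 1000 routes.
peers = [f"peer{i}" for i in range(100)]
tables = [f"table{j}" for j in range(50)]
requests = [(p, t) for p in peers for t in tables]
routes_per_table = {t: 1000 for t in tables}

print(per_request_cost(requests, routes_per_table))  # (5000, 5000000)
print(batched_cost(requests, routes_per_table))      # (50, 50000)
```

With these made-up numbers the walk count drops from one per request to one
per table, and the encode count drops by the number of peers sharing a table.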

Proposal is to rework implementation of bgp membership manager to address
all the above issues. The membership manager can keep track of all pending
(peer, table) requests and trigger a table walk for one table at a time.
It can perform join/leave and receive path manipulation operations for all
requesting peers for the table in question. Since each table is sharded
across all partitions, triggering a single table walk still allows the
Task infra to utilize all available threads/cores. Triggering one table
walk at a time also allows the membership manager to accumulate multiple
peer requests for all other tables.
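
The proposal can be sketched roughly as follows. This is a simplified,
single-threaded model and not the code in bgp_membership.h: the class name,
method names, the walk callback and the table names are all hypothetical,
partitions are not modeled, and a later request from a peer simply replaces
its earlier pending request for the same table.

```python
from collections import OrderedDict


class MembershipManagerSketch:
    """Toy model: queue (peer, table) requests and walk one table at a time."""

    def __init__(self, walk_fn):
        # table -> {"join": peers, "leave": peers}, kept in arrival order.
        self._pending = OrderedDict()
        # Hypothetical callback standing in for a table walk:
        # walk_fn(table, joining_peers, leaving_peers)
        self._walk_fn = walk_fn

    def _entry(self, table):
        return self._pending.setdefault(table, {"join": set(), "leave": set()})

    def register(self, peer, table):
        entry = self._entry(table)
        entry["leave"].discard(peer)  # simplification: last request wins
        entry["join"].add(peer)

    def unregister(self, peer, table):
        entry = self._entry(table)
        entry["join"].discard(peer)  # simplification: last request wins
        entry["leave"].add(peer)

    def process_next(self):
        """Walk the oldest pending table once for all of its queued peers."""
        if not self._pending:
            return None
        table, entry = self._pending.popitem(last=False)
        self._walk_fn(table, frozenset(entry["join"]), frozenset(entry["leave"]))
        return table


walks = []
mgr = MembershipManagerSketch(
    lambda table, joins, leaves: walks.append((table, sorted(joins), sorted(leaves)))
)
for peer in ("peer1", "peer2", "peer3"):
    mgr.register(peer, "inet.0")
    mgr.register(peer, "blue.inet.0")
mgr.unregister("peer3", "blue.inet.0")

while mgr.process_next() is not None:
    pass
print(walks)
```

Six join requests and one leave request end up as two walks, one per table.
In a real implementation the walk would be asynchronous, so requests for
other tables would keep accumulating while the current walk is in progress.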

Nischal Sheth (nsheth)
description: updated
Nischal Sheth (nsheth)
summary: - Rework bgp membership manager to improve efficiency
+ Rework bgp membership manager to improve scalability
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20035
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20035
Committed: http://github.org/Juniper/contrail-controller/commit/714a72ce8810e2c4c4ea17b271c3e72a8f542bb5
Submitter: Zuul
Branch: master

commit 714a72ce8810e2c4c4ea17b271c3e72a8f542bb5
Author: Nischal Sheth <email address hidden>
Date: Thu Apr 28 13:37:39 2016 -0700

Rework bgp membership manager to improve scalability

See bug description for motivation.

This is an initial version of the new implementation with unit
tests. See comments in bgp_membership.h for design details.

Pending items:

- Cleanup of APIs related to unregistration
- Integration with peer close and graceful restart code
- Integration with rest of control node
- More unit tests

Change-Id: Idc86148a988d7dc19c3ee1c3d05e22e63868ed59
Partial-Bug: 1577278

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20232
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20232
Committed: http://github.org/Juniper/contrail-controller/commit/72c29dc3fb64304d8f6a2bc1532910184d21b491
Submitter: Zuul
Branch: master

commit 72c29dc3fb64304d8f6a2bc1532910184d21b491
Author: Nischal Sheth <email address hidden>
Date: Thu May 12 10:38:53 2016 -0700

Add death tests for membership manager

Change-Id: Ia95c5b97bc32f83d64e9969f0bc61c14d08d5470
Partial-Bug: 1577278

Nischal Sheth (nsheth) wrote :

Further changes to membership manager committed via the following:

http://github.org/Juniper/contrail-controller/commit/74a1b10ba119a2400696869f243052c8e8606a0b

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/21184
Submitter: Nischal Sheth (<email address hidden>)

Nischal Sheth (nsheth)
description: updated
OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/21197
Submitter: Ananth Suryanarayana (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21184
Committed: http://github.org/Juniper/contrail-controller/commit/ae6c359f4851a82ea0893fe367353ba4cf97d2be
Submitter: Zuul
Branch: master

commit ae6c359f4851a82ea0893fe367353ba4cf97d2be
Author: Nischal Sheth <email address hidden>
Date: Sat Jun 4 21:27:55 2016 -0700

Use TaskFire utility in bgp_membership_test.cc

Change-Id: Ideda6248769f2f379c922b79f40485c6f05bba37
Closes-Bug: 1577278
