Xmpp update generation should use multiple Tasks
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R3.2 | Fix Committed | Wishlist | Nischal Sheth |
Trunk | Fix Committed | Wishlist | Nischal Sheth |
Bug Description
Existing SchedulingGroups work in parallel. Each SchedulingGroup produces
and sends updates for the RibOuts and Peers that belong to it. A
SchedulingGroup is created/updated such that for all Peers, all RibOuts
they advertise are part of the same SchedulingGroup, and for all RibOuts,
all their member Peers are part of the same SchedulingGroup. This design
avoids contention when writing to a Peer.
The current design works well for eBGP Peers because we create multiple
RibOuts for the same BgpTable, based on the peer type, ASN and other
parameters. This results in the creation of a SchedulingGroup for each set
of peers that share a peer type, ASN, etc. These SchedulingGroups can work
in parallel.
However, this doesn't help with iBGP peers since they all end up using
the same RibOut.
The current design does not work well for xmpp because we create a single
RibOut per table for xmpp. With a large number of xmpp peers and a large
number of tables, we effectively end up with a single SchedulingGroup for
all xmpp peers.
Producing xmpp updates is more expensive than producing bgp updates due to
the use of xml. This is exacerbated by xmpp update generation being limited
to a single SchedulingGroup. The changes implemented as part of bug 1591399
optimized single-threaded xmpp update generation.
Further improvements will likely require use of Tasks running in parallel.
This can be done in one of 2 ways:
1. Create multiple xmpp RibOuts for each BgpTable and then pick one of the
RibOuts when an xmpp peer subscribes to a table. If the RibOut is picked based on a hash of the peer name/ip, we get multiple SchedulingGroups,
which can then produce and send updates in parallel. This is relatively
simple to implement. However, results were quite poor when we tried this
in the past. Note that creating multiple xmpp RibOuts for all BgpTables
consumes a fair amount of extra memory and increases export processing
overhead since each route needs to be evaluated N times, once per RibOut.
2. Modify SchedulingGroup so that updates for different table partitions
can be generated in parallel by using one Task per partition. This will
create some contention when writing bgp/xmpp update messages to a peer
since multiple Tasks can write at the same time. However, the contention
should be fairly low with a large number of peers.
Review in progress for https://review.opencontrail.org/23744
Submitter: Nischal Sheth (<email address hidden>)