Comment 11 for bug 1531436

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17036
Committed: http://github.org/Juniper/contrail-controller/commit/9cfbc9d1109434e957ece0193cbaf01f9e9df5ac
Submitter: Zuul
Branch: R2.22.x

commit 9cfbc9d1109434e957ece0193cbaf01f9e9df5ac
Author: Manish <email address hidden>
Date: Wed Jan 6 11:54:13 2016 +0530

Agent fabric replication tree is empty.

Problem:
On switchover of the mcast control node (because of lower IP), the agent fabric
tree was getting removed. To explain further, assume there are two control
nodes, C1 and C2. C2 is stable and came up first. The agent subscribes to the
fabric replication tree through C2. Now C1 comes up, and before it can
participate meaningfully in replication it becomes unusable. The agent sees C1
coming up and removes the fabric tree received from C2, as it plans to use C1
as its multicast node. However, when C1 goes down, the agent goes back to C2
and sends a subscribe again.
(Note: when C1 took over from C2, the agent had triggered a walk to unsubscribe
from C2, but that walk may not have completed or even been scheduled.) Now C2
starts the same multicast walk for subscription, in turn cancelling the old
unsubscribe walk.
The control node, on seeing a subscription request again (since the unsubscribe
was never sent) with no change in the Olist or other info, may decide to ignore
the request. Hence the agent will never get the fabric tree.
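
The sequence above can be replayed with a toy model. This is only a sketch under
assumptions: the names (ToyControlNode, ToyAgent, switch_to) are hypothetical
and not taken from contrail-controller, the control node is assumed to ignore a
subscribe that repeats its existing state, and the deferred unsubscribe walk is
assumed never to run before it is cancelled.

```python
class ToyControlNode:
    """Hypothetical control node: sends a tree only for a new subscription."""

    def __init__(self, name, usable=True):
        self.name = name
        self.usable = usable
        self.subscribed = False

    def subscribe(self):
        if not self.usable:
            return None  # node went away before it could serve a tree
        if self.subscribed:
            return None  # no unsubscribe seen, nothing changed: request ignored
        self.subscribed = True
        return "tree-from-" + self.name

    def unsubscribe(self):
        self.subscribed = False


class ToyAgent:
    """Hypothetical agent that leaves the unsubscribe to a deferred walk."""

    def __init__(self):
        self.mcast_node = None
        self.fabric_tree = None
        self.pending_unsubscribe = set()  # nodes with a queued unsubscribe walk

    def switch_to(self, node):
        old = self.mcast_node
        if old is not None and old is not node:
            self.fabric_tree = None            # tree from the old node is dropped now
            self.pending_unsubscribe.add(old)  # unsubscribe is only queued, not sent
        # The subscribe walk for this node cancels its queued unsubscribe walk.
        self.pending_unsubscribe.discard(node)
        self.mcast_node = node
        tree = node.subscribe()
        if tree is not None:
            self.fabric_tree = tree


def replay_switchover():
    c1 = ToyControlNode("C1", usable=False)  # comes up, then becomes unusable
    c2 = ToyControlNode("C2")
    agent = ToyAgent()
    agent.switch_to(c2)   # C2 came up first
    first_tree = agent.fabric_tree
    agent.switch_to(c1)   # C1 is picked as mcast node; C2's tree is removed
    agent.switch_to(c2)   # C1 goes down; agent falls back to C2
    return first_tree, agent.fabric_tree
```

In this model the agent holds a tree after the first subscribe and ends with
none after falling back to C2, because C2 still considers it subscribed.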

Solution:
Firstly, remove the logic of lower peer selection.
Secondly, on every subscribe sent because a new control node has become the
mcast controller for the agent, check whether the unsubscribe has been sent. If
it has not, explicitly send an unsubscribe before subscribing.
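
The second step can be sketched as follows. This is a minimal illustration, not
the agent's actual code: FixedToyAgent, on_new_mcast_node and unsubscribe_owed
are hypothetical names, and the control node is again assumed to ignore a
subscribe that repeats its existing state. The removal of lower peer selection
is not modelled here.

```python
class ToyControlNode:
    """Hypothetical control node: sends a tree only for a new subscription."""

    def __init__(self, name, usable=True):
        self.name = name
        self.usable = usable
        self.subscribed = False

    def subscribe(self):
        if not self.usable or self.subscribed:
            return None  # unusable node, or duplicate subscribe that is ignored
        self.subscribed = True
        return "tree-from-" + self.name

    def unsubscribe(self):
        self.subscribed = False


class FixedToyAgent:
    """Hypothetical agent that tracks whether an unsubscribe was really sent."""

    def __init__(self):
        self.fabric_tree = None
        self.unsubscribe_owed = set()  # subscribed nodes with no unsubscribe sent yet

    def on_new_mcast_node(self, node):
        self.fabric_tree = None
        if node in self.unsubscribe_owed:
            # The earlier unsubscribe never went out: send it explicitly first.
            node.unsubscribe()
            self.unsubscribe_owed.discard(node)
        tree = node.subscribe()
        self.unsubscribe_owed.add(node)
        if tree is not None:
            self.fabric_tree = tree


def replay_with_fix():
    c1 = ToyControlNode("C1", usable=False)  # comes up, then becomes unusable
    c2 = ToyControlNode("C2")
    agent = FixedToyAgent()
    agent.on_new_mcast_node(c2)  # initial subscribe to C2
    agent.on_new_mcast_node(c1)  # C1 takes over; unsubscribe to C2 is never sent
    agent.on_new_mcast_node(c2)  # fall back to C2: unsubscribe, then subscribe
    return agent.fabric_tree
```

With the explicit unsubscribe, C2 treats the following subscribe as new and the
agent ends up holding a tree from C2 again.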

Closes-bug: #1531436

Change-Id: I089e3137cdfbbd28897414c411b22b8c82260990
(cherry picked from commit 56b847efbf1601446fe84c4ea0f62991842e149a)
(cherry picked from commit d822e9eae2679db21f4087bdce7df6680f058291)