Detect duplicate subscribe for routing-instance on CN

Bug #1431025 reported by Nischal Sheth
This bug affects 1 person
Affects              Status        Importance  Assigned to    Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.1               Fix Released  Wishlist    Nischal Sheth
  Trunk              Fix Released  Wishlist    Nischal Sheth

Bug Description

It would be useful to detect a duplicate subscribe on the CN, log an error
or warning message, and trigger a close of the XMPP connection so that
both sides can start with a clean slate. The duplicate subscribe could
be due to a bug on the agent side.

Another possibility would be to synthesize an unsubscribe and queue up
a deferred subscribe for the routing-instance in question. The impact is
more contained, but there's a chance that the two sides still won't be in
sync. Hence closing the connection is the preferred approach.
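
A minimal sketch of the preferred approach, in Python for brevity. This is an illustration only and not the contrail-controller code: the `Channel` class, its `subscribe` method, and the `close_connection` callback are hypothetical names assumed for the example.

```python
import logging

log = logging.getLogger("xmpp_channel_sketch")


class Channel:
    """Hypothetical stand-in for the CN side of one agent's XMPP channel."""

    def __init__(self, close_connection):
        # Routing instances the agent is currently subscribed to.
        self.subscribed = set()
        # Callback that tears down the XMPP connection (assumed to exist).
        self.close_connection = close_connection

    def subscribe(self, instance):
        """Record a subscribe; return False if it was a duplicate."""
        if instance in self.subscribed:
            # The two sides disagree about subscription state: log it and
            # reset the session rather than patching the state in place.
            log.warning("duplicate subscribe for routing-instance %s", instance)
            self.subscribed.clear()
            self.close_connection()
            return False
        self.subscribed.add(instance)
        return True


closed = []
channel = Channel(close_connection=lambda: closed.append(True))
first = channel.subscribe("blue")
second = channel.subscribe("blue")  # duplicate: triggers the close callback
```

Clearing the local state and closing the connection means the agent has to reconnect and resubscribe, which is what lets both sides start from a clean slate.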

Nischal Sheth (nsheth)
description: updated
description: updated
summary: - Detect duplicate subscribe for routing-instance on the CN
+ Detect duplicate subscribe for routing-instance on CN
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : R2.1

Review in progress for https://review.opencontrail.org/8148
Submitter: Nischal Sheth (<email address hidden>)

Nischal Sheth (nsheth) wrote :

@Vinay

This is mostly a defensive check to make the code more robust.
Since there's no known issue which requires this fix, I think we
should skip 2.1.

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/8148
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/8333
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8333
Committed: http://github.org/Juniper/contrail-controller/commit/2565e9ce5ccd35f2bf7dd4dac8b57c5d8f150353
Submitter: Zuul
Branch: master

commit 2565e9ce5ccd35f2bf7dd4dac8b57c5d8f150353
Author: Nischal Sheth <email address hidden>
Date: Fri Mar 6 11:28:22 2015 -0800

Handle duplicate subscribe by closing the xmpp connection

This protects against bugs and/or race conditions in agent code.
It allows both sides to recover cleanly, without getting the CN
stuck.

Change-Id: Icb7759fb2554987312add77e092741b9c5353580
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/8461
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8461
Committed: http://github.org/Juniper/contrail-controller/commit/8e473746ac867b6a9f2793ada02b0600f6b3e5d8
Submitter: Zuul
Branch: master

commit 8e473746ac867b6a9f2793ada02b0600f6b3e5d8
Author: Nischal Sheth <email address hidden>
Date: Tue Mar 17 10:19:58 2015 -0700

Add more checks to detect bad subscribe/unsubscribe

Change-Id: Ia3c6bd2308d58db78528858929297682d0474b71
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/8487
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/8490
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8487
Committed: http://github.org/Juniper/contrail-controller/commit/354e04230b8251d597af567560e5eb99e3ee1cec
Submitter: Zuul
Branch: master

commit 354e04230b8251d597af567560e5eb99e3ee1cec
Author: Nischal Sheth <email address hidden>
Date: Thu Mar 19 13:37:03 2015 -0700

Fix occasional failures in bgp_xmpp_deferq_test

Problem was that ResumePeerRibMembershipManager could get called too
soon in certain cases i.e. before all subscribes and unsubscribes for
the instance have been enqueued by the BgpXmppChannel. Fix by adding
checks to verify the number of instance subscribes/unsubscribes prior
to resuming the membership manager.

Change-Id: I9785b3002ddfbea2a7c68d848055253e356d4d03
Closes-Bug: 1431025
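
The gating described in this commit message can be illustrated with a small Python sketch. It is not the actual test code: `FakeChannel`, `wait_for`, and `resume_when_enqueued` are hypothetical names, and the real test is written against BgpXmppChannel. The idea shown is only that the test waits for the expected subscribe and unsubscribe counts before resuming.

```python
import threading
import time


def wait_for(predicate, timeout=1.0, interval=0.01):
    """Poll predicate until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


class FakeChannel:
    """Hypothetical channel that counts enqueued instance requests."""

    def __init__(self):
        self.subscribes = 0
        self.unsubscribes = 0
        self.resumed = False

    def enqueue_subscribe(self):
        self.subscribes += 1

    def enqueue_unsubscribe(self):
        self.unsubscribes += 1


def resume_when_enqueued(channel, want_subscribes, want_unsubscribes):
    """Resume only after the expected number of requests has been enqueued."""
    ready = wait_for(lambda: channel.subscribes == want_subscribes
                     and channel.unsubscribes == want_unsubscribes)
    if ready:
        channel.resumed = True  # stands in for resuming the membership manager
    return ready


channel = FakeChannel()


def agent():
    # The requests arrive some time after the test has started waiting.
    channel.enqueue_subscribe()
    channel.enqueue_unsubscribe()
    channel.enqueue_subscribe()


worker = threading.Timer(0.05, agent)
worker.start()
resumed = resume_when_enqueued(channel, want_subscribes=2, want_unsubscribes=1)
worker.join()
```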

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/8490
Committed: http://github.org/Juniper/contrail-controller/commit/ec6acf50d2bbf2e3883c97cdff51b09c2a9442e0
Submitter: Zuul
Branch: master

commit ec6acf50d2bbf2e3883c97cdff51b09c2a9442e0
Author: Nischal Sheth <email address hidden>
Date: Thu Mar 19 16:22:39 2015 -0700

Fix race condition in bgp_xmpp_deferq_test

We used to check that the session on agent is down after processing
a duplicate subscribe or a spurious unsubscribe at the CN. Some of
these tests fail once in a while because the session goes down and
comes back up before we check the session state.

Fix by checking the flap count on the agent instead of the state.

Change-Id: I434c2a334285e043193fe0f73fd783f69ebe7c18
Closes-Bug: 1431025
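
A small Python sketch of why the flap count is the more reliable check. `AgentSession` and its fields are hypothetical names assumed for illustration; they are not the agent or test-harness API.

```python
class AgentSession:
    """Hypothetical agent-side session with a state and a flap counter."""

    def __init__(self):
        self.established = True
        self.flap_count = 0

    def go_down(self):
        # Count a flap on each transition from established to down.
        if self.established:
            self.established = False
            self.flap_count += 1

    def come_up(self):
        self.established = True


session = AgentSession()
flaps_before = session.flap_count

# The CN closes the connection and the agent reconnects before the test
# gets a chance to look at the session.
session.go_down()
session.come_up()

state_check_saw_down = not session.established           # misses the bounce
flap_check_saw_down = session.flap_count > flaps_before  # still detects it
```

The state check only observes the current value, so it depends on timing; the counter records that the transition happened regardless of when the test reads it.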

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/8524
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8524
Committed: http://github.org/Juniper/contrail-controller/commit/363c93a57e58f4ef422366de4c87f857b7cb6ba1
Submitter: Zuul
Branch: master

commit 363c93a57e58f4ef422366de4c87f857b7cb6ba1
Author: Nischal Sheth <email address hidden>
Date: Fri Mar 20 14:16:44 2015 -0700

Add checks to detect bad subscribe/unsubscribe for deleted instance

Change-Id: I179ae9494905ff759dfebde84e35097506dd1fbb
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote : R2.1

Review in progress for https://review.opencontrail.org/9422
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/9423
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/9424
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/9425
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/9426
Submitter: Nischal Sheth (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9422
Committed: http://github.org/Juniper/contrail-controller/commit/d8b568a09b7a38c57dcfd1d088e3917390407431
Submitter: Zuul
Branch: R2.1

commit d8b568a09b7a38c57dcfd1d088e3917390407431
Author: Nischal Sheth <email address hidden>
Date: Fri Mar 6 11:28:22 2015 -0800

Handle duplicate subscribe by closing the xmpp connection

This protects against bugs and/or race conditions in agent code.
It allows both sides to recover cleanly, without getting the CN
stuck.

Change-Id: Icb7759fb2554987312add77e092741b9c5353580
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9423
Committed: http://github.org/Juniper/contrail-controller/commit/3ce6a90c54feededf77ea4efb5267656cba0fd76
Submitter: Zuul
Branch: R2.1

commit 3ce6a90c54feededf77ea4efb5267656cba0fd76
Author: Nischal Sheth <email address hidden>
Date: Tue Mar 17 10:19:58 2015 -0700

Add more checks to detect bad subscribe/unsubscribe

Change-Id: Ia3c6bd2308d58db78528858929297682d0474b71
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9424
Committed: http://github.org/Juniper/contrail-controller/commit/09a0d5623db1128e114878db5af39e4c518a307b
Submitter: Zuul
Branch: R2.1

commit 09a0d5623db1128e114878db5af39e4c518a307b
Author: Nischal Sheth <email address hidden>
Date: Thu Mar 19 13:37:03 2015 -0700

Fix occasional failures in bgp_xmpp_deferq_test

Problem was that ResumePeerRibMembershipManager could get called too
soon in certain cases i.e. before all subscribes and unsubscribes for
the instance have been enqueued by the BgpXmppChannel. Fix by adding
checks to verify the number of instance subscribes/unsubscribes prior
to resuming the membership manager.

Change-Id: I9785b3002ddfbea2a7c68d848055253e356d4d03
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9425
Committed: http://github.org/Juniper/contrail-controller/commit/c2cabe6c51f8327659e10c5dc4ca68af7e0082e2
Submitter: Zuul
Branch: R2.1

commit c2cabe6c51f8327659e10c5dc4ca68af7e0082e2
Author: Nischal Sheth <email address hidden>
Date: Thu Mar 19 16:22:39 2015 -0700

Fix race condition in bgp_xmpp_deferq_test

We used to check that the session on agent is down after processing
a duplicate subscribe or a spurious unsubscribe at the CN. Some of
these tests fail once in a while because the session goes down and
comes back up before we check the session state.

Fix by checking the flap count on the agent instead of the state.

Change-Id: I434c2a334285e043193fe0f73fd783f69ebe7c18
Closes-Bug: 1431025

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/9426
Committed: http://github.org/Juniper/contrail-controller/commit/515ac8b42289bc2740a163111cdb824274258f79
Submitter: Zuul
Branch: R2.1

commit 515ac8b42289bc2740a163111cdb824274258f79
Author: Nischal Sheth <email address hidden>
Date: Fri Mar 20 14:16:44 2015 -0700

Add checks to detect bad subscribe/unsubscribe for deleted instance

Change-Id: I179ae9494905ff759dfebde84e35097506dd1fbb
Closes-Bug: 1431025

Nischal Sheth (nsheth)
information type: Proprietary → Public