Stabilize control-node Graceful Restart unit tests

Bug #1733446 reported by Ananth Suryanarayana
Affects            Status                   Importance  Assigned to           Milestone
Juniper Openstack  Status tracked in Trunk
  R3.2             In Progress              Undecided   Ananth Suryanarayana
  R4.0             In Progress              Undecided   Ananth Suryanarayana
  R4.1             In Progress              Undecided   Ananth Suryanarayana
  R5.0             In Progress              Undecided   Ananth Suryanarayana
  Trunk            In Progress              Undecided   Ananth Suryanarayana

Bug Description

GR tests are quite flaky. Stabilize them as soon as possible. While it may not be possible to make them perfect in one attempt, we should set that as a goal.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/37705
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/37706
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.0

Review in progress for https://review.opencontrail.org/37707
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/37708
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/37706
Committed: http://github.com/Juniper/contrail-controller/commit/6a6156f0dad8d66e0764de99dae3ce45de41683a
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 6a6156f0dad8d66e0764de99dae3ce45de41683a
Author: Ananth Suryanarayana <email address hidden>
Date: Mon Nov 20 16:33:40 2017 -0800

Stabilize control-node graceful-restart unit tests

o Checking that peers do not flip after simulating a cold reboot is not reliable
-- If the peer tries to send a message and gets a RST back, its session
flips. Hence such a check must be done as soon as the peer is brought down
on one side.

o Do not start SandeshServer by default. It sometimes hangs during TearDown().

Change-Id: Ie45bbdc5f1ed46903b693d32f5edf1f1d8d9c212
Partial-Bug: 1733446
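
The timing issue in the first bullet can be illustrated with a small model. This is only a sketch under stated assumptions, not code from the contrail-controller test suite: FakePeer, Send() and NoFlapRightAfterDown() are hypothetical names, and the RST is modeled simply as a flap counter increment.

```cpp
#include <cassert>

// Hypothetical stand-in for a peer in a unit test (not a contrail class).
struct FakePeer {
    bool remote_up = true;  // whether the other side is still running
    int flap_count = 0;     // number of session flips seen so far

    // Sending to a remote that has gone away is modeled as getting a RST
    // back, which resets the local session and counts as a flap.
    void Send() {
        if (!remote_up) ++flap_count;
    }
};

// Simulate a cold reboot of the remote side and check for flaps right away,
// before the peer has had any chance to send a message and receive a RST.
inline bool NoFlapRightAfterDown(FakePeer &peer) {
    const int before = peer.flap_count;
    peer.remote_up = false;
    return peer.flap_count == before;
}
```

In this model the check passes only because it runs before any Send(); a check made after a later Send() would observe a flap, which is the inconsistency the commit message describes.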

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/37707
Committed: http://github.com/Juniper/contrail-controller/commit/e68db1cef197ded8f0babace3d16305936a22026
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit e68db1cef197ded8f0babace3d16305936a22026
Author: Ananth Suryanarayana <email address hidden>
Date: Mon Nov 20 16:33:40 2017 -0800

Stabilize control-node graceful-restart unit tests

o Checking that peers do not flip after simulating a cold reboot is not reliable
-- If the peer tries to send a message and gets a RST back, its session
flips. Hence such a check must be done as soon as the peer is brought down
on one side.

o Do not start SandeshServer by default. It sometimes hangs during TearDown().

Change-Id: Ie45bbdc5f1ed46903b693d32f5edf1f1d8d9c212
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/37708
Committed: http://github.com/Juniper/contrail-controller/commit/c73cc7452247fdd3b460b8068dc59d1dabc95b21
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit c73cc7452247fdd3b460b8068dc59d1dabc95b21
Author: Ananth Suryanarayana <email address hidden>
Date: Mon Nov 20 16:33:40 2017 -0800

Stabilize control-node graceful-restart unit tests

o Checking that peers do not flip after simulating a cold reboot is not reliable
-- If the peer tries to send a message and gets a RST back, its session
flips. Hence such a check must be done as soon as the peer is brought down
on one side.

o Do not start SandeshServer by default. It sometimes hangs during TearDown().

Change-Id: Ie45bbdc5f1ed46903b693d32f5edf1f1d8d9c212
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/37705
Committed: http://github.com/Juniper/contrail-controller/commit/a0b12b57d7ee4fbf2b595454ce3fe11d4836958c
Submitter: Zuul (<email address hidden>)
Branch: master

commit a0b12b57d7ee4fbf2b595454ce3fe11d4836958c
Author: Ananth Suryanarayana <email address hidden>
Date: Mon Nov 20 16:33:40 2017 -0800

Stabilize control-node graceful-restart unit tests

o Checking that peers do not flip after simulating a cold reboot is not reliable
-- If the peer tries to send a message and gets a RST back, its session
flips. Hence such a check must be done as soon as the peer is brought down
on one side.

o Do not start SandeshServer by default. It sometimes hangs during TearDown().

Change-Id: Ie45bbdc5f1ed46903b693d32f5edf1f1d8d9c212
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/38875
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/38876
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.0

Review in progress for https://review.opencontrail.org/38877
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/38878
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/38877
Committed: http://github.com/Juniper/contrail-controller/commit/36a2c5177ddd075f972fe9a58b6ce75f597ab6cd
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 36a2c5177ddd075f972fe9a58b6ce75f597ab6cd
Author: Ananth Suryanarayana <email address hidden>
Date: Fri Jan 12 14:49:41 2018 -0800

Do not bring up peers when GR process is under progress

Peer sessions are not allowed to come up while GR is in progress. Once the state
enters GR_TIMER_WAIT or LLGR_TIMER_WAIT, sessions are allowed to come up.
However, the GR timers can fire just as the peers are about to come up and
become established. In this case, to keep the design simple, we do not want the
peers to come up.

In such a rare scenario (which was exposed by some of the GR unit tests), flip
the peer gracefully.

Note that in the normal GR state machine, most of the time is spent in the
GR_TIMER_WAIT/LLGR_TIMER_WAIT states, which last on the order of several minutes.
Hence it is uncommon for a peer session to come up at exactly the moment the
GR timers expire.

Tests marked as flaky still need to be monitored before some, if not all, of
them are unmarked.

Change-Id: I497a3e84ed6222ae5cfc1cb80679c0a962e0ead9
Partial-Bug: 1733446
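
The race described in this commit message can be sketched as a small state model. This is an illustration under assumptions, not the actual contrail-controller implementation: only GR_TIMER_WAIT and LLGR_TIMER_WAIT are taken from the message, while GrCloseModel, NONE, IN_PROGRESS, SessionAction and all method names are hypothetical.

```cpp
#include <cassert>

// Only GR_TIMER_WAIT and LLGR_TIMER_WAIT are named in the commit message;
// NONE and IN_PROGRESS are placeholder states for this sketch.
enum class GrState { NONE, IN_PROGRESS, GR_TIMER_WAIT, LLGR_TIMER_WAIT };

enum class SessionAction { ACCEPT, FLIP_GRACEFULLY };

// Hypothetical model of the GR close logic for a single peer.
class GrCloseModel {
public:
    GrState state() const { return state_; }

    // Peer went down: GR processing starts, sessions may not come up yet.
    void PeerDown() { state_ = GrState::IN_PROGRESS; }

    // Processing reached the point where we wait for the peer to return.
    void StartTimerWait(bool long_lived) {
        state_ = long_lived ? GrState::LLGR_TIMER_WAIT
                            : GrState::GR_TIMER_WAIT;
    }

    // Timer expired: the wait window closes and GR processing resumes.
    void TimerFired() {
        if (InTimerWait()) state_ = GrState::IN_PROGRESS;
    }

    // Decision taken at the moment a session is about to become established.
    SessionAction OnSessionEstablished() const {
        if (state_ == GrState::NONE || InTimerWait())
            return SessionAction::ACCEPT;
        return SessionAction::FLIP_GRACEFULLY;
    }

private:
    bool InTimerWait() const {
        return state_ == GrState::GR_TIMER_WAIT ||
               state_ == GrState::LLGR_TIMER_WAIT;
    }

    GrState state_ = GrState::NONE;
};

// Walks the rare race: the timer fires just before the session would have
// become established, so the session is flipped instead of accepted.
inline SessionAction RaceScenario() {
    GrCloseModel m;
    m.PeerDown();
    m.StartTimerWait(false);
    assert(m.OnSessionEstablished() == SessionAction::ACCEPT);
    m.TimerFired();
    return m.OnSessionEstablished();
}
```

In this sketch, RaceScenario() yields FLIP_GRACEFULLY: the session was acceptable while the model sat in GR_TIMER_WAIT, but once the timer fired the only remaining choice is to flip the peer rather than add a special case for a half-established session.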

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/38875
Committed: http://github.com/Juniper/contrail-controller/commit/5c7ec2c442b0a870320eacf2671c417d70f3e306
Submitter: Zuul (<email address hidden>)
Branch: master

commit 5c7ec2c442b0a870320eacf2671c417d70f3e306
Author: Ananth Suryanarayana <email address hidden>
Date: Fri Jan 12 14:49:41 2018 -0800

Do not bring up peers when GR process is under progress

Peer sessions are not allowed to come up while GR is in progress. Once the state
enters GR_TIMER_WAIT or LLGR_TIMER_WAIT, sessions are allowed to come up.
However, the GR timers can fire just as the peers are about to come up and
become established. In this case, to keep the design simple, we do not want the
peers to come up.

In such a rare scenario (which was exposed by some of the GR unit tests), flip
the peer gracefully.

Note that in the normal GR state machine, most of the time is spent in the
GR_TIMER_WAIT/LLGR_TIMER_WAIT states, which last on the order of several minutes.
Hence it is uncommon for a peer session to come up at exactly the moment the
GR timers expire.

Tests marked as flaky still need to be monitored before some, if not all, of
them are unmarked.

Change-Id: I497a3e84ed6222ae5cfc1cb80679c0a962e0ead9
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/38876
Committed: http://github.com/Juniper/contrail-controller/commit/e4eae55d388101fbb1ab18a56fa6c759256e451c
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit e4eae55d388101fbb1ab18a56fa6c759256e451c
Author: Ananth Suryanarayana <email address hidden>
Date: Fri Jan 12 14:49:41 2018 -0800

Do not bring up peers when GR process is under progress

Peer sessions are not allowed to come up while GR is in progress. Once the state
enters GR_TIMER_WAIT or LLGR_TIMER_WAIT, sessions are allowed to come up.
However, the GR timers can fire just as the peers are about to come up and
become established. In this case, to keep the design simple, we do not want the
peers to come up.

In such a rare scenario (which was exposed by some of the GR unit tests), flip
the peer gracefully.

Note that in the normal GR state machine, most of the time is spent in the
GR_TIMER_WAIT/LLGR_TIMER_WAIT states, which last on the order of several minutes.
Hence it is uncommon for a peer session to come up at exactly the moment the
GR timers expire.

Tests marked as flaky still need to be monitored before some, if not all, of
them are unmarked.

Change-Id: I497a3e84ed6222ae5cfc1cb80679c0a962e0ead9
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/38878
Committed: http://github.com/Juniper/contrail-controller/commit/f4fcf242107ed44fe1748dacdcc6ec92fad923f8
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit f4fcf242107ed44fe1748dacdcc6ec92fad923f8
Author: Ananth Suryanarayana <email address hidden>
Date: Fri Jan 12 14:49:41 2018 -0800

Do not bring up peers when GR process is under progress

Peer sessions are not allowed to come up while GR is in progress. Once the state
enters GR_TIMER_WAIT or LLGR_TIMER_WAIT, sessions are allowed to come up.
However, the GR timers can fire just as the peers are about to come up and
become established. In this case, to keep the design simple, we do not want the
peers to come up.

In such a rare scenario (which was exposed by some of the GR unit tests), flip
the peer gracefully.

Note that in the normal GR state machine, most of the time is spent in the
GR_TIMER_WAIT/LLGR_TIMER_WAIT states, which last on the order of several minutes.
Hence it is uncommon for a peer session to come up at exactly the moment the
GR timers expire.

Tests marked as flaky still need to be monitored before some, if not all, of
them are unmarked.

Change-Id: I497a3e84ed6222ae5cfc1cb80679c0a962e0ead9
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/41143
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/41143
Committed: http://github.com/Juniper/contrail-controller/commit/3886c7ca0bf859a1e52973b946436aaebf22c8ab
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 3886c7ca0bf859a1e52973b946436aaebf22c8ab
Author: Ananth Suryanarayana <email address hidden>
Date: Wed Mar 28 10:36:57 2018 -0700

Mark graceful_restart_flap_some_test8 as flaky

Change-Id: Id29bcd00ae24bbc67f01c6c55d5b77af9c017044
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/46370
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R5.0

Review in progress for https://review.opencontrail.org/46371
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/46371
Committed: http://github.com/Juniper/contrail-controller/commit/c6c38f6595c96571c4ce8ce6424f2dc1e31068a7
Submitter: Zuul v3 CI (<email address hidden>)
Branch: R5.0

commit c6c38f6595c96571c4ce8ce6424f2dc1e31068a7
Author: Ananth Suryanarayana <email address hidden>
Date: Thu Sep 20 11:30:08 2018 -0700

Mark graceful_restart_flap_some_test3, graceful_restart_flap_some_test9 flaky

Until these tests are stabilized, they are marked as flaky so that other
reviews in CI are not affected unnecessarily

Change-Id: Iae85e098dddacdf42c63bf3635236f0b43dcde5d
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/46370
Committed: http://github.com/Juniper/contrail-controller/commit/ac228da66923c51f9d1aaa4d12cb6a54214811b7
Submitter: Vinay Vithal Mahuli (<email address hidden>)
Branch: master

commit ac228da66923c51f9d1aaa4d12cb6a54214811b7
Author: Ananth Suryanarayana <email address hidden>
Date: Thu Sep 20 11:30:08 2018 -0700

Mark graceful_restart_flap_some_test3, graceful_restart_flap_some_test9 flaky

Until these tests are stabilized, they are marked as flaky so that other
reviews in CI are not affected unnecessarily

Change-Id: Iae85e098dddacdf42c63bf3635236f0b43dcde5d
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/48494
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R5.0

Review in progress for https://review.opencontrail.org/48495
Submitter: Ananth Suryanarayana (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/48494
Committed: http://github.com/Juniper/contrail-controller/commit/8d9ac0235528f20175ce9d1c8020f274ec204397
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 8d9ac0235528f20175ce9d1c8020f274ec204397
Author: Ananth Suryanarayana <email address hidden>
Date: Sat Jan 5 19:45:32 2019 -0800

Mark graceful_restart_flap_all_test7 as flaky

Change-Id: I5d76b81a4019c6c126c0538699d2c9c1f07241d9
Partial-Bug: 1733446

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/48549
Submitter: Arun RS (<email address hidden>)
