PDT/Scale: Took 20 minutes to converge traffic after restart ovsdb-server/Contrail2.2

Bug #1456776 reported by Anoop Kumar Sahu
This bug affects 1 person
Affects              Status                    Importance   Assigned to         Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed             High         Hari Prasad Killi
  Trunk              Fix Committed             High         Hari Prasad Killi

Bug Description

We were told that this problem would go away with 2.2, but it is still seen.

Gnats:
1087763-1 edit PDT/Scale: Took 20 minutes to converge traffic after restart ovsdb-server/Contrail2.2

=========copied from Gnats ====

Please refer to the attached topology and Spirent chart. Restarted ovsdb-server on Leaf L1, which is running bidirectional traffic with S4 (VNI 50-950).

      S4 ....OVSDB ....L1

Mac *03:05* *05:01*

Traffic stopped completely, then fluctuated before converging after approximately 20 minutes. Detailed logs and the ovsdb dump can be found at /volume/labcores/PR/PR-#

While no traffic was being received on L1, it was observed that S4 did not have the OVSDB remote MACs:

root@vdc-vcf-s4# run show ovsdb mac remote | match 05:01 | count
Count: 3 lines

{master:0}[edit]
root@vdc-vcf-s4# run show ethernet-switching table | match 05:01 | count
Count: 5 lines
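
The two counts above can be compared mechanically when the check is repeated during a convergence test. The following is a minimal sketch only: it assumes the CLI output has already been captured as text, and the helper names `parse_count` and `count_gap` are hypothetical, not part of any Junos or Contrail tooling. Treating the gap as "MACs not yet in the OVSDB remote table" is likewise an assumption based on the output shown here.

```python
import re

# Matches the summary line printed by "| count", e.g. "Count: 3 lines".
COUNT_RE = re.compile(r"^Count:\s*(\d+)\s+lines?\s*$", re.MULTILINE)


def parse_count(cli_output: str) -> int:
    """Return N from a 'Count: N lines' line in captured CLI output."""
    match = COUNT_RE.search(cli_output)
    if match is None:
        raise ValueError("no 'Count: N lines' line found")
    return int(match.group(1))


def count_gap(ovsdb_output: str, switching_output: str) -> int:
    """Difference between the ethernet-switching count and the OVSDB remote count."""
    return parse_count(switching_output) - parse_count(ovsdb_output)


if __name__ == "__main__":
    # Sample text copied from the outputs in this report.
    ovsdb = "Count: 3 lines\n"
    switching = "Count: 5 lines\n"
    print("gap:", count_gap(ovsdb, switching))
```

A non-zero gap would correspond to the mismatch reported above; polling until the gap reaches zero is one possible way to time convergence.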

Tags: blocker bms qfx
information type: Proprietary → Public
chhandak (chhandak) wrote :

Tried restarting ovsdb-server via the CLI as well as killing the ovsdb process with kill -9 on the QFX.
Measured broadcast and unicast traffic loss and observed the following, respectively:

Broadcast: 14-17 sec (ARP traffic, tried 3 times)
Unicast: 6-7 sec (ICMP traffic between BMS and Contrail VM, tried 3 times)

Hari Prasad Killi (haripk) wrote :

Chhandak's observations above were with build 29.

The following were fixed by build 29:
1. The SSL link between TOR-Agent and TOR was flapping due to an errored packet sent from TOR-Agent. This was fixed.
2. A few TOR-Agent crash fixes.

With these fixes, the connection was stable and convergence is seen to be in the order of a few seconds.

Anoop Kumar Sahu (anoops) wrote :

The issue is still there; we see traffic converge only after 4-5 minutes.

OpenContrail Admin (ci-admin-f) wrote :

The convergence time will remain at this level for high scale. For lower scale, this has been fixed as part of
https://bugs.launchpad.net/juniperopenstack/+bug/1456284

Please retry.
