[R2.20.29] TOR Scale: After supervisor-vrouter restart it is taking more than 1 hour to resume traffic

Bug #1458880 reported by chhandak
Affects: Juniper Openstack (status tracked in Trunk)

Series  Status         Importance  Assigned to  Milestone
R2.20   Fix Committed  Medium      Tapan Karwa  -
Trunk   Fix Committed  Medium      Tapan Karwa  -

Bug Description

In a TOR scale setup, after restarting supervisor-vrouter on the TSN/TOR Agent, it takes more than 1 hour (precisely 3729 seconds) for traffic to resume. Most of that time is spent downloading config from the control node. The setup has only one TSN and 2 control nodes.

Scale Profile:
VNI: 8K
VMI: 32K
LIF: 16K

Traffic Profile
----------------------
ARP traffic from the BMS, querying for one of the VMs in the same VN, so the TSN is proxying the ARP replies.

service supervisor-vrouter restart
root@nodei9:~# date
Tue May 26 16:45:20 IST 2015 >>>>>>>>>>>>>>>>> Restarted supervisor-vrouter

VXLAN Table

 VNID NextHop
----------------
Error: No such file or directory
root@nodei9:~# date
Tue May 26 16:51:50 IST 2015>>>>>>>>>>>>>>>>>>>>> Still VXLAN info is unavailable in TSN

root@nodei9:~# vxlan --get 10550
VXLAN Table

 VNID NextHop
----------------
10550 13011
root@nodei9:~# date
Tue May 26 17:06:23 IST 2015>>>>>>>>>>>>>>>>>>>>>>VXLAN info available
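
The manual check above (rerunning `vxlan --get` until the VNID shows up) could be scripted. The following is a minimal sketch, not part of the original report: it only parses the two output shapes pasted in this bug, and the function name `vxlan_nexthop` is hypothetical.

```python
import re

def vxlan_nexthop(output, vnid):
    """Return the next-hop ID listed for `vnid`, or None if no data row matches.

    Assumes the `vxlan --get` output looks like the samples in this report:
    a header, a dashed separator, then either a "<vnid> <nexthop>" row or an
    error line.
    """
    for line in output.splitlines():
        match = re.fullmatch(r"\s*(\d+)\s+(\d+)\s*", line)
        if match and int(match.group(1)) == vnid:
            return int(match.group(2))
    return None

# Output captured at 16:51:50 in this report (entry not yet programmed).
NOT_READY = """VXLAN Table

 VNID NextHop
----------------
Error: No such file or directory
"""

# Output captured at 17:06:23 in this report (entry programmed).
READY = """VXLAN Table

 VNID NextHop
----------------
10550 13011
"""

print(vxlan_nexthop(NOT_READY, 10550))
print(vxlan_nexthop(READY, 10550))
```

Feeding this function the output of `vxlan --get 10550` in a polling loop would give a timestamp for when the VXLAN entry is programmed, rather than relying on manual `date` calls.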

Even now the multicast route is not available; programming it accounts for the rest of the time.
root@nodei9:~# vxlan --get 10550
VXLAN Table

 VNID NextHop
----------------
10550 13011
root@nodei9:~# nh --get 13011
Id:13011 Type:Vrf_Translate Fmly: AF_INET Flags:Valid, Vxlan, Rid:0 Ref_cnt:2 Vrf:2502
              Vrf:2502

root@nodei9:~# rt --dump 2502 --family bridge | grep ff:ff
239132 ff:ff:ff:ff:ff:ff L 10550 13021
root@nodei9:~# nh --get 13021
Id:13021 Type:Composite Fmly:AF_BRIDGE Flags:Valid, Multicast, L2, Rid:0 Ref_cnt:2 Vrf:2502
              Sub NH(label):

root@nodei9:~# nh --get 13021
Id:13021 Type:Composite Fmly:AF_BRIDGE Flags:Valid, Multicast, L2, Rid:0 Ref_cnt:2 Vrf:2502
              Sub NH(label):
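
To put the timestamps in the console output into perspective, the elapsed times can be worked out as below. This is a rough sketch: it assumes the 3729 seconds quoted in the description is measured from the 16:45:20 restart timestamp, and the `date` calls were made after each command, so the checkpoints are approximate.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

# Timestamps copied from the console output in this report (all on May 26, IST).
restart = datetime.strptime("16:45:20", FMT)
checkpoints = {
    "VXLAN entry still missing": datetime.strptime("16:51:50", FMT),
    "VXLAN entry present": datetime.strptime("17:06:23", FMT),
}

# Seconds elapsed since the restart at each checkpoint.
elapsed = {
    label: int((stamp - restart).total_seconds())
    for label, stamp in checkpoints.items()
}

# Total recovery time quoted in the bug description.
TOTAL_SECONDS = 3729

# Assumption: the total is counted from the restart timestamp.
traffic_resumed = (restart + timedelta(seconds=TOTAL_SECONDS)).strftime(FMT)

# Time left after the VXLAN entry appeared, i.e. the multicast programming phase.
remaining = TOTAL_SECONDS - elapsed["VXLAN entry present"]

for label, seconds in elapsed.items():
    print(f"{label}: {seconds} s after restart")
print(f"traffic resumed at about {traffic_resumed}")
print(f"time after VXLAN entry appeared: {remaining} s")
```

Under those assumptions, the VXLAN entry took about 21 minutes to appear, and the remaining roughly 41 minutes went to getting the multicast route programmed.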

chhandak (chhandak)
summary: - [R2.20.29] TOR Scale: After supervisor-vrouter restart is taking more
- than 1 hour to traffic resume
+ [R2.20.29] TOR Scale: After supervisor-vrouter restart it is taking more
+ than 1 hour to resume traffic
description: updated
chhandak (chhandak) wrote :
Changed in juniperopenstack:
importance: Undecided → Critical
tags: added: blocker
information type: Proprietary → Public
Changed in juniperopenstack:
assignee: nobody → Hari Prasad Killi (haripk)
tags: added: contrail-control
Ashish Ranjan (aranjan-n) wrote :

Per Chhandak, this has now come down to less than 3 minutes. We cannot improve further from here.
The next level of improvement will come when we implement GRES in the controller node and the Agent-OVSDB link. This is tracked separately.
