TOR Agent goes into initializing state for both TSNs
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenContrail | New | Undecided | Unassigned | | |
Bug Description
There have been numerous instances where the TOR agent for a particular TOR goes into the initializing state on both TSNs.
Steps to reproduce:
a) RE switchover on a TOR with scaled VXLAN config (3K VNI, 100,000+ MACs)
b) The TOR agent (TA) goes into the initializing state for 10-15 minutes and recovers afterwards
root@ubuntu-
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
down)
contrail-
contrail-
down)
contrail-
down)
contrail-
down)
contrail-
contrail-
contrail-
contrail-
root@ubuntu-tsn:~# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
down)
contrail-
down)
contrail-
contrail-
contrail-
contrail-
connection down)
contrail-
down)
contrail-
contrail-
The keepalive time on tor-agent is configured as 10s. In scaled configurations, the OVSDB server is busy when a new connection is established, so tor-agent receives no packets and no keepalive response. When the 10s timer expires, the connection is deleted and a new one is created. This cycle repeats, which prevents the connection from being set up.
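The reconnect loop described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not tor-agent code: the function name `simulate_reconnects` and the per-attempt busy times are hypothetical, and the model assumes the agent drops the connection as soon as the keepalive timer expires without a reply.

```python
def simulate_reconnects(keepalive_s, busy_per_attempt_s):
    """Toy model of the keepalive/reconnect cycle (hypothetical, not tor-agent code).

    keepalive_s        -- keepalive timeout in seconds
    busy_per_attempt_s -- assumed time the server needs before it answers,
                          one value per connection attempt
    Returns (attempt_number, elapsed_seconds) for the first attempt that
    gets a reply in time, or (None, elapsed_seconds) if none does.
    """
    elapsed = 0.0
    for attempt, busy_s in enumerate(busy_per_attempt_s, start=1):
        if busy_s <= keepalive_s:
            # The reply arrives before the timer expires: connection stays up.
            return attempt, elapsed + busy_s
        # Timer expires first: the connection is deleted and retried.
        elapsed += keepalive_s
    return None, elapsed


# Assumed busy times (seconds) for five successive attempts.
busy = [45, 45, 30, 20, 8]
short_timeout = simulate_reconnects(10, busy)
long_timeout = simulate_reconnects(60, busy)
```

With these made-up numbers, a 10s timeout burns four attempts before one succeeds, while a 60s timeout succeeds on the first attempt. The real busy times depend on the scale of the OVSDB contents and are not known from this report.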
The workaround is to increase the keepalive timeout on Contrail to a higher value (in /etc/contrail/contrail-tor-agent-<id>.conf) and restart tor-agent.
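As a rough sketch of applying the workaround programmatically, the snippet below rewrites one option in INI-style config text. The section name `TOR`, the option name `tor_keepalive_interval`, and the millisecond values in the example are assumptions and are not confirmed by this report, so check them against the actual file before use. The helper `set_option` is hypothetical. Note that `configparser` drops comments when it rewrites the text.

```python
import configparser
import io


def set_option(conf_text, section, option, value):
    """Return conf_text with section/option set to value (INI-style text)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, str(value))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Example input; section, option name, and units are assumptions.
sample = "[TOR]\ntor_ip=10.0.0.1\ntor_keepalive_interval=10000\n"
updated = set_option(sample, "TOR", "tor_keepalive_interval", 60000)
```

The function works on text rather than on the file so it can be tried safely. Writing the result back to the config file and restarting tor-agent remain manual steps.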