Hi Arvind,

Please find answers to your queries inline.

1) Are we having control and config on the same node or are they on different nodes?

Yes (attaching testbed.py for your reference).

2) Can you explain a bit about the script they are using to query analytics? We want to understand how the queries are being issued. Are they issuing individual GET requests for the VN and VMI and then timing the entire operation? Do they wait for individual requests to succeed and then issue the next one? Are they trying to determine the time taken to issue (300,600) GET requests vs (1500,3000) GET requests vs (3000,6000) GET requests? [If there are more API requests, more time will be taken, so can you clarify if my understanding of the issue is wrong?]

The customer has created a script, get_port_uve.py, to query the API servers. The customer runs the query with the following parameters:

python get_port_uve.py --api_ips 10.0.0.100:8081 10.0.0.124:9081 10.0.0.125:9081 10.0.0.126:9081 --search 0 0500_01

In the above example:

10.0.0.124:9081 sv-24 Collector
10.0.0.125:9081 sv-25 Collector
10.0.0.126:9081 sv-26 Collector

The customer has created dummy VNs and VMIs. Each VMI (port) is created with the following syntax:

vmi__

In the above command the customer is searching for VMIs ending with port 0500_01. This is a sequential search.

The output of the script and the issue are shown below. Two timestamps are used:

1) The last time when all 3 nodes respond with non-zero-length json contents.
2) The time when remaining 2 nodes start to respond with non-zero-length json contents stably.

For example, in case of large_sv-24_down.log, the timestamp of 1) before I shutdown sv-24 was 11:49:57.

----------------------------------------------------------------------------------------------------------
virtual-machine-interfaces at 2017/09/05 11:49:57
----------------------------------------------------------------------------------------------------------
node             object                                  status  result
---------------- --------------------------------------- ------- -----------------------------------------
10.0.0.100:8081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f

After shutting down sv-24 (10.0.0.124), requests to that collector node timed out. This is expected and normal. However, requests for some UVEs toward the VIP (10.0.0.100) or the remaining collectors (10.0.0.125/10.0.0.126) also timed out or resulted in a zero-length json body for a while. From the analytics API point of view, this response is not correct and I count this as downtime.

------------------------------------------------------------------------------------------------------------------------------
virtual-machine-interfaces at 2017/09/05 11:52:22
------------------------------------------------------------------------------------------------------------------------------
node             object                                  status  result
---------------- --------------------------------------- ------- -------------------------------------------------------------
10.0.0.100:8081  default-domain:mock6:vmi_sv-39_0500_01  200     {} <<<<< This is not expected.
10.0.0.100:8081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock6:vmi_sv-39_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock3:vmi_sv-36_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock4:vmi_sv-37_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock1:vmi_sv-34_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock2:vmi_sv-35_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock5:vmi_sv-38_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.125:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {} <<<<< This is not expected.
10.0.0.125:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {} <<<<< This is not expected.
10.0.0.126:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f

So, the timestamp of 2) in this case is 11:52:27 and the downtime is 11:52:27 - 11:49:57 = 00:02:30.

------------------------------------------------------------------------------------------------------------------------------
virtual-machine-interfaces at 2017/09/05 11:52:27
------------------------------------------------------------------------------------------------------------------------------
node             object                                  status  result
---------------- --------------------------------------- ------- -------------------------------------------------------------
10.0.0.100:8081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.100:8081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.124:9081  default-domain:mock6:vmi_sv-39_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock3:vmi_sv-36_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock4:vmi_sv-37_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock1:vmi_sv-34_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock2:vmi_sv-35_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.124:9081  default-domain:mock5:vmi_sv-38_0500_01  n/a     Connection to 10.0.0.124:9081 timed out. (connect timeout=2)
10.0.0.125:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.125:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock6:vmi_sv-39_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock3:vmi_sv-36_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock4:vmi_sv-37_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock1:vmi_sv-34_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock2:vmi_sv-35_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f
10.0.0.126:9081  default-domain:mock5:vmi_sv-38_0500_01  200     {"UveVMInterfaceAgent": {"ip6_active": f

3) Do they give time between each control-node shutdown and issuing the query against analytics? This can be checked by looking at redis.log on the analytics node and making sure no deletes are happening before issuing the query against analytics.

What they are looking for is a response with 200 OK that has actual output in it, not an empty one like the below:

10.0.0.100:8081  default-domain:mock6:vmi_sv-39_0500_01  200     {} <<<<< This is not expected.

The query keeps on running, and in the above example we can see the connection timeout, which indicates that the server is down.

4) Also, the messages you have reported are unrelated to the issue reported by the customer. You are experiencing connectivity issues from analytics to redis. Before trying to issue queries, make sure contrail-status on analytics is ok.

I was not able to replicate the issue in our lab. My setup was on a VM and couldn't scale up to the number of VNs and VMIs.

5) contrail-version

3.1.3.0-75

Scripts attached here.

Best Regards,
Vijay Kumar
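
P.S. In case it helps when reading the logs, here is a minimal sketch of how the downtime under question 2 could be derived from the polling results. This is not the customer's get_port_uve.py; the names has_content, downtime and poll, and the stand-in response bodies, are my own assumptions for illustration. It treats a response as good only if it is 200 with a non-empty json body, takes timestamp 1) as the last poll where every node was good, and timestamp 2) as the first poll of the final unbroken run where the surviving nodes are good.

```python
import json
from datetime import datetime

def has_content(status, body):
    """True only for HTTP 200 with a non-empty JSON body; '{}' and timeouts fail."""
    if status != 200:
        return False
    try:
        return bool(json.loads(body))
    except (TypeError, ValueError):
        return False

def downtime(samples, down_node):
    """samples: time-ordered list of (timestamp, {node: [(status, body), ...]})."""
    def node_ok(results):
        return all(has_content(s, b) for s, b in results)

    last_all_ok = None   # timestamp 1): last poll where every node answered
    recovered_at = None  # timestamp 2): start of the stable run on survivors
    for ts, per_node in samples:
        survivors_ok = all(node_ok(r) for n, r in per_node.items() if n != down_node)
        if survivors_ok and node_ok(per_node[down_node]):
            last_all_ok, recovered_at = ts, None
        elif last_all_ok is not None:
            if not survivors_ok:
                recovered_at = None      # still unstable, restart the search
            elif recovered_at is None:
                recovered_at = ts        # first good poll of a new run
    if last_all_ok is None or recovered_at is None:
        return None
    return recovered_at - last_all_ok

# Illustrative data only: one result per node instead of six, and only the
# three polls quoted in this mail. The OK body is a stand-in, not real output.
OK = (200, '{"UveVMInterfaceAgent": {}}')
EMPTY = (200, "{}")
TIMEOUT = (None, None)
nodes = ["10.0.0.100:8081", "10.0.0.124:9081", "10.0.0.125:9081", "10.0.0.126:9081"]
down = "10.0.0.124:9081"

def poll(ts, survivor_result, down_result):
    return (datetime.strptime(ts, "%H:%M:%S"),
            {n: [down_result if n == down else survivor_result] for n in nodes})

samples = [
    poll("11:49:57", OK, OK),
    poll("11:52:22", EMPTY, TIMEOUT),
    poll("11:52:27", OK, TIMEOUT),
]
print(downtime(samples, down))  # 0:02:30
```

With the three polls above this gives the same 00:02:30 as the customer's calculation. If the surviving nodes never return to a stable run, the sketch returns None rather than guessing.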