Flow setup sometimes takes upto 50ms even on a setup with less flows
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
Juniper Openstack | Status tracked in Trunk | |||||
R3.0 |
Fix Committed
|
High
|
Praveen | |||
Trunk |
Fix Committed
|
High
|
Praveen |
Bug Description
R3.0.2.0 Build 26 with vrouter-agent fixes upto April-19
On this setup with 9 compute nodes, there are ~100 VMs across these 9 nodes.
A compute node nodel7 for instance has about 50 active flows.
A VM 10.1.1.4 on nodel7 initiates tcp hping3 traffic to 10.1.1.10 which is also on the same compute node.
hping3 -p 22 -S 10.1.1.10 -i u1000
Praveen observed that some flow ksync tasks could be causing this delay
------------------
Sample :
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77054 win=14600 rtt=34.3 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77055 win=14600 rtt=51.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77056 win=14600 rtt=50.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77057 win=14600 rtt=49.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77058 win=14600 rtt=48.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77059 win=14600 rtt=47.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77060 win=14600 rtt=46.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77061 win=14600 rtt=45.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77062 win=14600 rtt=44.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77063 win=14600 rtt=43.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77064 win=14600 rtt=42.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77065 win=14600 rtt=41.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77066 win=14600 rtt=40.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77067 win=14600 rtt=39.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77068 win=14600 rtt=38.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77069 win=14600 rtt=37.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77070 win=14600 rtt=36.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77071 win=14600 rtt=35.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77072 win=14600 rtt=34.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77073 win=14600 rtt=33.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77074 win=14600 rtt=32.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77075 win=14600 rtt=31.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77076 win=14600 rtt=37.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77077 win=14600 rtt=36.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77078 win=14600 rtt=35.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77079 win=14600 rtt=34.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77080 win=14600 rtt=33.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77081 win=14600 rtt=32.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77082 win=14600 rtt=31.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77083 win=14600 rtt=30.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77084 win=14600 rtt=29.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77085 win=14600 rtt=28.9 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77086 win=14600 rtt=27.9 ms
------------------
Normal flow setup rate is usually <10ms :
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=575 win=14600 rtt=1.0 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=576 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=577 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=578 win=14600 rtt=0.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=579 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=580 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=581 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=582 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=583 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=584 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=585 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=586 win=14600 rtt=1.2 ms
---------------
Tasktrace log :
2016-04-20 10:57:54.113 TaskTrace: 6 5 Run time(in usec) 23867 Proto work queue. Module 5 controller/
2016-04-20 10:58:01.780 TaskTrace: 2 3 Run time(in usec) 23225 Flow Delete Queue-3 controller/
2016-04-20 10:58:01.780 TaskTrace: 27 2 Run time(in usec) 23390 Flow Log Queue controller/
2016-04-20 10:58:02.331 TaskTrace: 6 5 Run time(in usec) 23828 Proto work queue. Module 5 controller/
2016-04-20 10:58:02.332 TaskTrace: 27 2 Run time(in usec) 26637 Flow Log Queue controller/
2016-04-20 10:58:02.332 TaskTrace: 2 3 Run time(in usec) 22125 Flow KSync Queue-3 controller/
2016-04-20 10:58:06.448 TaskTrace: 6 5 Run time(in usec) 27000 Proto work queue. Module 5 controller/
2016-04-20 10:58:10.560 TaskTrace: 6 5 Run time(in usec) 22942 Proto work queue. Module 5 controller/
2016-04-20 10:58:13.047 TaskTrace: 7 1 Run time(in usec) 21423 Agent Stats collector controller/
2016-04-20 10:58:13.066 TaskTrace: 2 3 Run time(in usec) 20541 Flow Delete Queue-3 controller/
2016-04-20 10:58:13.067 TaskTrace: 10 0 Run time(in usec) 20780 Function 1 controller/
2016-04-20 10:58:16.150 TaskTrace: 2 3 Run time(in usec) 33878 Flow Delete Queue-3 controller/
2016-04-20 10:58:16.150 TaskTrace: 27 2 Run time(in usec) 29346 Flow Log Queue controller/
2016-04-20 10:58:21.233 TaskTrace: 2 3 Run time(in usec) 133029 Flow KSync Queue-3 controller/
2016-04-20 10:58:21.233 TaskTrace: 2 3 Schedule delay(in usec) 133063 Flow Delete Queue-3 controller/
2016-04-20 10:58:21.233 TaskTrace: 2 3 TBB schedule time(in usec) 133087 Flow Delete Queue-3 controller/
2016-04-20 10:58:22.883 TaskTrace: 6 5 Run time(in usec) 20295 Proto work queue. Module 5 controller/
2016-04-20 10:58:35.199 TaskTrace: 6 5 Run time(in usec) 22592 Proto work queue. Module 5 controller/
2016-04-20 11:47:06.061 TaskTrace: 6 5 Run time(in usec) 100326 Proto work queue. Module 5 controller/
2016-04-20 11:55:08.878 TaskTrace: 2 0 Run time(in usec) 41437 Flow Event Queue-0 controller/
2016-04-20 11:55:08.878 TaskTrace: 2 3 Run time(in usec) 41653 Flow Event Queue-3 controller/
2016-04-20 11:55:08.878 TaskTrace: 2 1 Run time(in usec) 41778 Flow Event Queue-1 controller/
2016-04-20 11:55:08.878 TaskTrace: 11 0 Run time(in usec) 41413 Function 1 controller/
2016-04-20 11:55:41.105 TaskTrace: 2 3 Run time(in usec) 43590 Flow Event Queue-3 controller/
2016-04-20 11:55:41.106 TaskTrace: 2 1 Run time(in usec) 43765 Flow Event Queue-1 controller/
2016-04-20 11:55:41.106 TaskTrace: 2 0 Run time(in usec) 43832 Flow Event Queue-0 controller/
2016-04-20 11:55:41.106 TaskTrace: 11 0 Run time(in usec) 43913 Function 1 controller/
2016-04-20 11:57:49.006 TaskTrace: 10 0 Run time(in usec) 24894 Virtual-Machine UVE controller/
2016-04-20 12:55:31.033 TaskTrace: 6 5 Run time(in usec) 37324 Proto work queue. Module 5 controller/
2016-04-20 12:55:31.782 TaskTrace: 2 1 Run time(in usec) 24426 Flow KSync Queue-1 controller/
2016-04-20 13:42:49.087 TaskTrace: 2 1 Run time(in usec) 20218 Flow Event Queue-1 controller/
2016-04-20 13:42:49.090 TaskTrace: 11 0 Run time(in usec) 21785 Function 1 controller/
2016-04-20 13:42:49.090 TaskTrace: 27 2 Run time(in usec) 22409 Flow Log Queue controller/
tags: | added: performance |
Review in progress for https:/ /review. opencontrail. org/19595
Submitter: Praveen K V (<email address hidden>)