A few vrouter UVEs remained even after the vrouters were deleted on a ToR-scale setup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.20 | Fix Committed | High | Sundaresan Rajangam |
Trunk | Fix Committed | High | Sundaresan Rajangam |
Bug Description
R2.20 Build 30, Ubuntu 14.04, Juno, multi-node setup
In this scale setup, host5 and host6 each had 128 tor-agents in active-backup mode.
I then added another two nodes, host7 and host9, for ToRs 65-128, and then deleted the vrouter (tor-agent) objects 65-128 that had been created on host5 (nodei38) and host6 (nodei28).
After that, the UVEs for three vrouter objects (nodei38-107, nodei28-85, nodei38-68) were still left undeleted.
Sundar is looking at it. Per his initial analysis, during the bulk deletion around 27 May, 8:20:37 PM, the collector got a connection reset from the tor-agent, but that didn't trigger deletion of the tor-agents' UVEs.
env.roledefs = {
    'all': [host2, host3, host4, host5, host6, host7, host8, host9],
    'cfgm': [host2, host3, host4],
    'openstack': [host2, host3, host4],
    'webui': [host3],
    'control': [host2, host3, host4],
    'compute': [host5, host6, host7, host8, host9],
    'collector': [host2, host3, host4],
    'database': [host2, host3, host4],
    'toragent': [host5, host6, host7, host9],
    'tsn': [host5, host6, host7, host9],
    'build': [host_build],
}
env.hostnames = {
    'all': ['nodei34', 'nodei35', 'nodei36', 'nodei37', 'nodei38', 'nodei28', 'nodei27', 'nodei30']
}
Logs will be in http://
Using the contrail-logs utility, I captured the SandeshModuleTrace [with reset_time] for the generators [nodei28:Compute:contrail-tor-agent:85, nodei38:Compute:contrail-tor-agent:68 and nodei38:Compute:contrail-tor-agent:107], whose UVEs were not deleted from redis after session disconnect.
[contrail-logs o/p]
2015 May 27 20:20:30.886329 nodei36 [Analytics:contrail-collector:0:None] [INVALID] : SandeshModuleServerTrace: 184320 [ModuleServerState: name = nodei28:Compute:contrail-tor-agent:85, [generator_info: [hostname = nodei36, [GeneratorInfoAttr: connects = 1, connect_time = 1432662701172809, resets = 1, reset_time = 1432738230768822, in_clear = false]]]]
2015 May 27 20:20:32.166892 nodei36 [Analytics:contrail-collector:0:None] [INVALID] : SandeshModuleServerTrace: 184321 [ModuleServerState: name = nodei28:Compute:contrail-vrouter-agent:0, [generator_info: [hostname = nodei36, [GeneratorInfoAttr: connects = 3, connect_time = 1432662701653925, resets = 3, reset_time = 1432738218352851, in_clear = false]]]]
[redis log]
[23085] 27 May 20:20:18.353 * DelRequest for nodei28:Compute:contrail-vrouter-agent:0
....
....
[23085] 27 May 20:20:23.370 # Lua slow script detected: still in execution after 5017 milliseconds. You can try killing the script using the SCRIPT KILL command. <<<<<<
....
[23085] 27 May 20:20:32.119 * Delete Request for nodei28:Compute:contrail-vrouter-agent:0 successful
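The reset_time values in the trace are microseconds since the Unix epoch, while the redis log prints wall-clock time. A minimal sketch for lining the two up follows. It assumes the logs are in IST (UTC+5:30); that offset is inferred from the timestamps agreeing with each other and is not stated in the report. The helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the collector and redis logs use IST (UTC+5:30).
LOG_TZ = timezone(timedelta(hours=5, minutes=30))


def usec_to_log_time(usec, tz=LOG_TZ):
    """Convert microseconds since the Unix epoch to a log-style timestamp string."""
    seconds, micros = divmod(usec, 1_000_000)
    # Integer arithmetic keeps the microsecond part exact.
    moment = datetime.fromtimestamp(seconds, tz=tz) + timedelta(microseconds=micros)
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")


# reset_time values copied from the SandeshModuleServerTrace output
print(usec_to_log_time(1432738230768822))  # nodei28:Compute:contrail-tor-agent:85
print(usec_to_log_time(1432738218352851))  # nodei28:Compute:contrail-vrouter-agent:0
```

Under that timezone assumption, the vrouter-agent reset lands on the same second as the first DelRequest line in the redis log.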
From the redis log, it may be noted that deletion of [nodei28:Compute:contrail-vrouter-agent:0] took ~14 seconds [the default lua-time-limit is set to 5 seconds in redis.conf]. If a Lua script runs for longer than the configured time, redis returns an error to subsequent requests until the script finishes executing.
From the contrail-logs [see above], it may be observed that the deletion request for the generator [nodei28:Compute:contrail-tor-agent:85] was sent [reset_time - 27 May 2015, 20:20:30] while the deletion script for the generator [nodei28:Compute:contrail-vrouter-agent:0] was still running, and redis had already raised the red flag at 27 May 20:20:23.370 because of the time overrun.
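The timing argument can be checked with a little arithmetic. This is only a sketch of the reasoning: the timestamps are copied from the logs quoted in this report (the year 2015 is taken from the collector log, and the tor-agent reset_time 1432738230768822 is written in the log's local time), and the function name is hypothetical. Whether a request is actually rejected is decided by redis, not by this calculation.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"
LUA_TIME_LIMIT = timedelta(seconds=5)  # default lua-time-limit cited in the analysis

# Start and end of the vrouter-agent:0 deletion script, from the redis log
script_start = datetime.strptime("2015-05-27 20:20:18.353", FMT)
script_end = datetime.strptime("2015-05-27 20:20:32.119", FMT)
# reset_time of nodei28:Compute:contrail-tor-agent:85 in the log's local time
tor_agent_reset = datetime.strptime("2015-05-27 20:20:30.768822", FMT)


def in_busy_window(request_time, start, end, limit=LUA_TIME_LIMIT):
    """True if a request arrives after the time limit expired and before the script ended."""
    return start + limit <= request_time < end


duration = (script_end - script_start).total_seconds()
print(duration)  # script run time in seconds
print(in_busy_window(tor_agent_reset, script_start, script_end))
```

The same check can be repeated for the other two generators once the start and end of the corresponding deletion script are read from the redis log.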
The UVEs of the generators nodei38:Compute:contrail-tor-agent:68 and nodei38:Compute:contrail-tor-agent:107 were not deleted from redis for the same reason:
[contrail-logs o/p]
contrail-collector:0:None] [INVALID] : SandeshModuleServerTrace: 184364 [ModuleServerState: name = nodei38:Compute:contrail-tor-agent:68, [generator_info: [hostname = nodei36, [GeneratorInfoAttr: connects = 3, connect_time = 1432660322547729, resets = 3, reset_time = 1432738237680826, in_clear = false]]]]
contrail-collector:0:None] [INVALID] : SandeshModuleServerTrace: 184365 [ModuleServerState: name = nodei38:Compute:contrail-tor-agent:107, [generator_info: [hostname = nodei36, [GeneratorInfoAttr: connects = 3, connect_time = 1432696866885598, resets = 3, reset_time = 1432738237744824, in_clear = false]]]]
contrail-collector:0:None] [INVALID] : SandeshModuleServerTrace: 184366 [ModuleServerState: name = nodei38:Compute:contrail-vrouter-agent:0, [generator_info: [hos...
2015 May 27 20:20:46.432991 nodei36 [Analytics:
2015 May 27 20:20:46.507011 nodei36 [Analytics:
....
....
2015 May 27 20:21:07.713827 nodei36 [Analytics: