Inconsistency in reporting cpu/memory information in UVE

Bug #1467407 reported by Sundaresan Rajangam
This bug affects 2 people
Affects                Status           Importance   Assigned to      Milestone
Juniper Openstack (status tracked in Trunk)
  R3.0                 Fix Committed    High         Santosh Gupta
  Trunk                Fix Committed    High         Santosh Gupta

Bug Description

While reviewing https://review.opencontrail.org/#/c/11607/, I observed some inconsistencies in how different modules report cpu/memory info:

analytics-node
- collector, query-engine, analytics-api and alarm-gen send process cpu/memory info in ModuleCpuState and AnalyticsCpuState
- snmp-collector and topology don't send cpu/mem info

config-node
- api, svc-monitor and schema send process cpu/memory info in ModuleCpuState and ConfigCpuState
- discovery and device-manager don't send cpu/mem info
- should we send cpu/mem info for the ifmap service from config-nodemgr?

control-node
- control service sends process cpu/mem info in ControlCpuState

compute-node
- vrouter-agent and tor-agent send process and system cpu/mem info in ComputeCpuState and VRouterStatsAgent

1) ModuleCpuState is deprecated and should be removed

2) api sends build_info and config_node_ip in ModuleCpuState, which is not correct (see the marked fields).
From cfgm_cpuinfo.sandesh:
struct ModuleCpuState {
    1: string name (key="ObjectConfigNode")
    2: optional bool deleted
    3: optional list<ModuleCpuInfo> module_cpu_info (aggtype="union")
    4: optional string build_info <<<<<<<<
    5: optional list<string> config_node_ip <<<<<<<
}

3) cpu_info should be removed from VRouterStatsAgent

4) ComputeCpuState structure is inconsistent with other modules
struct VrouterCpuInfo {
    1: u32 mem_virt
    2: double cpu_share
    3: u32 used_sys_mem
    4: double one_min_cpuload
    5: u32 mem_res
}

struct ComputeCpuState {
    1: string name (key="ObjectVRouter")
    2: optional bool deleted
    3: optional list<VrouterCpuInfo> cpu_info (tags=".mem_virt,.cpu_share,.mem_res", aggtype="union")
}

Other modules follow the structure below:

struct XXXCpuState {
    1: string name (key="XYZ")
    2: optional bool deleted
    3: optional list<cpuinfo.ProcessCpuInfo> cpu_info (tags=".module_id,.mem_virt,.cpu_share,.mem_res",aggtype="union")
}

uve sandesh XXXCpuStateTrace {
    1: XXXCpuState data
}

VrouterCpuInfo mixes process-specific and system-specific memory and cpu info.
ComputeCpuState should use cpuinfo.ProcessCpuInfo to report the process cpu/mem info.
A new structure named SystemCpuInfo should be added to cpuinfo.sandesh, and ComputeCpuState should report the system cpu/mem info using this new structure.
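
The proposed split can be sketched in Python to make the field mapping concrete. This is only an illustration, not the sandesh definition: the ProcessCpuInfo fields are taken from the tags shown above, while the layout of SystemCpuInfo and the helper split_vrouter_cpu_info are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ProcessCpuInfo:
    # Fields taken from the tags used by the other modules' XXXCpuState.
    module_id: str
    mem_virt: int
    cpu_share: float
    mem_res: int


@dataclass
class SystemCpuInfo:
    # Hypothetical layout: the report only proposes the structure's name.
    used_sys_mem: int
    one_min_cpuload: float


def split_vrouter_cpu_info(module_id, vrouter_cpu_info):
    """Split a VrouterCpuInfo-like dict into process and system parts.

    Hypothetical helper; `vrouter_cpu_info` uses the five field names of
    the existing VrouterCpuInfo struct.
    """
    process = ProcessCpuInfo(
        module_id=module_id,
        mem_virt=vrouter_cpu_info["mem_virt"],
        cpu_share=vrouter_cpu_info["cpu_share"],
        mem_res=vrouter_cpu_info["mem_res"],
    )
    system = SystemCpuInfo(
        used_sys_mem=vrouter_cpu_info["used_sys_mem"],
        one_min_cpuload=vrouter_cpu_info["one_min_cpuload"],
    )
    return process, system
```

With this split, the process part has the same shape as what the other modules report, and the system part can be reused by any node type.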

5) analytics-api and alarm-gen use opserver/cpuinfo.py to get cpu and memory info.
    config/common/vnc_cpu_info.py - code to fetch cpu/mem info and send the UVE
    Presently, we are adding code in database-nodemgr to send cpu/mem info for cassandra, zookeeper and kafka.
    Therefore, we should move cpuinfo.py to a common directory, and all python modules should use it instead of defining or duplicating their own version. Should we create a common python package (say contrail-python-base) and add all the common modules there?
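
As a rough idea of what a single shared helper could look like, here is a minimal sketch that reads a process's virtual and resident memory from /proc on Linux. It is not the existing opserver/cpuinfo.py; the function names are hypothetical, and the values are in kB as reported by /proc, which may not match the units used in ProcessCpuInfo.

```python
import os

# Map /proc/<pid>/status keys to the field names used in the UVE structs.
_FIELDS = {"VmSize": "mem_virt", "VmRSS": "mem_res"}


def parse_proc_status(text):
    """Extract mem_virt and mem_res (in kB) from /proc/<pid>/status text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        name = _FIELDS.get(key.strip())
        if sep and name:
            # The value looks like "  204800 kB"; keep only the number.
            info[name] = int(value.split()[0])
    return info


def read_process_mem(pid=None):
    """Read memory info for `pid` (default: this process). Linux only."""
    pid = os.getpid() if pid is None else pid
    with open("/proc/%d/status" % pid) as f:
        return parse_proc_status(f.read())
```

Because the parser takes text rather than a pid, every python module (and a nodemgr reporting on behalf of cassandra, zookeeper or kafka) could share it and test it without a live process.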

6) Since all modules are expected to send cpu/mem info, should we consider moving the reporting of cpu/mem info into the sandesh library? That would be easy to maintain and would take minimal effort to ensure that newly created generators send cpu/memory info.
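
To illustrate the idea in 6), the sketch below puts the reporting in a base class so that a new generator gets it by inheritance. Everything here is hypothetical: GeneratorBase stands in for whatever the sandesh library would provide, `send` stands in for the UVE send call, and cpu_share is computed as a percentage of one CPU over the interval since the last report.

```python
import time


class GeneratorBase:
    """Hypothetical stand-in for a library-provided generator base class."""

    def __init__(self, module_id, send):
        self.module_id = module_id
        self._send = send  # callable that ships the report (assumption)
        self._last_wall = time.monotonic()
        self._last_cpu = time.process_time()

    def report_cpu(self):
        """Send the CPU share used by this process since the last report."""
        now_wall = time.monotonic()
        now_cpu = time.process_time()
        elapsed = now_wall - self._last_wall
        cpu_share = 0.0
        if elapsed > 0:
            cpu_share = 100.0 * (now_cpu - self._last_cpu) / elapsed
        self._last_wall, self._last_cpu = now_wall, now_cpu
        self._send({"module_id": self.module_id, "cpu_share": cpu_share})


class NewGenerator(GeneratorBase):
    """A newly created generator needs no cpu/mem code of its own."""
```

A real version would call report_cpu from a periodic timer owned by the library; the timer is left out here to keep the sketch small.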

Tags: analytics
Raj Reddy (rajreddy)
tags: added: quench
Raj Reddy (rajreddy)
tags: removed: quench
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Bug update]

bug update...

Raj Reddy (rajreddy) wrote :

There are 2 more requirements:
- add system cpu info for all node types
- add the number of CPUs, cores, hyperthreads, etc. for the computes
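
One possible way to collect the CPU topology is to parse the output of lscpu, sketched below. The output field names (sockets, cores_per_socket, threads_per_core) follow the commit message later in this report; the function names and the choice of lscpu as the source are assumptions, not what the fix necessarily does.

```python
# lscpu labels mapped to the field names used in the commit message.
_LABELS = {
    "Socket(s)": "sockets",
    "Core(s) per socket": "cores_per_socket",
    "Thread(s) per core": "threads_per_core",
}


def parse_lscpu(text):
    """Extract socket/core/thread counts from lscpu output text."""
    topology = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        field = _LABELS.get(label.strip())
        if sep and field:
            topology[field] = int(value.strip())
    return topology


def logical_cpus(topology):
    """Number of logical CPUs implied by the topology."""
    return (topology["sockets"]
            * topology["cores_per_socket"]
            * topology["threads_per_core"])
```

On a node, the text could come from something like subprocess.check_output(["lscpu"], text=True); since the topology does not change at runtime, it only needs to be gathered once at startup.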

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20388
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20389
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20437
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20538
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20437
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20685
Submitter: Santosh Gupta (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20437
Committed: http://github.org/Juniper/contrail-controller/commit/da7614fbc7dcbd652523543941e746705e7336ef
Submitter: Zuul
Branch: master

commit da7614fbc7dcbd652523543941e746705e7336ef
Author: Santosh Gupta <email address hidden>
Date: Thu May 19 11:27:12 2016 -0700

Closes-Bug: 1467407

Added system memory/cpu and process mem/cpu information under NodeStatus hierarchy.

1. In current code, analytics, config and vrouter modules have duplicate code to send system cpu/mem info.
Moved the functionality to common codebase in nodemgr.
Now system cpu/mem info is sent for all node types.
- NodeStatus >> system_mem_cpu_info
2. In current code, each process sends cpu/mem info individually.
Moved the functionality to common codebase in nodemgr to send cpu/mem info for all processes on the node.
- NodeStatus >> process_mem_cpu_info
3. Created new hierarchy under NodeStatus and added following fields for cpu info
sockets, cores_per_socket and threads_per_core.
These fields are sent only during nodemgr init for all node types.
- NodeStatus >> process_cpu_info
Obsolete hierarchy/duplicate code for the above will be deleted once UI moves to new hierarchy.

Change-Id: Ib28f22eb855413d88d928a26c4d502df9d975d2d

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20685
Committed: http://github.org/Juniper/contrail-controller/commit/6944d1732265cdbe6f9cb7e0eefb23525cba11e5
Submitter: Zuul
Branch: R3.0

commit 6944d1732265cdbe6f9cb7e0eefb23525cba11e5
Author: Santosh Gupta <email address hidden>
Date: Thu May 19 11:27:12 2016 -0700

Closes-Bug: 1467407

Added system memory/cpu and process mem/cpu information under NodeStatus hierarchy.

1. In current code, analytics, config and vrouter modules have duplicate code to send system cpu/mem info.
Moved the functionality to common codebase in nodemgr.
Now system cpu/mem info is sent for all node types.
- NodeStatus >> system_mem_cpu_info
2. In current code, each process sends cpu/mem info individually.
Moved the functionality to common codebase in nodemgr to send cpu/mem info for all processes on the node.
- NodeStatus >> process_mem_cpu_info
3. Created new hierarchy under NodeStatus and added following fields for cpu info
sockets, cores_per_socket and threads_per_core.
These fields are sent only during nodemgr init for all node types.
- NodeStatus >> process_cpu_info
Obsolete hierarchy/duplicate code for the above will be deleted once UI moves to new hierarchy.

Conflicts:
 src/nodemgr/analytics_nodemgr/analytics_event_manager.py
 src/nodemgr/config_nodemgr/config_event_manager.py
 src/nodemgr/control_nodemgr/control_event_manager.py
 src/nodemgr/database_nodemgr/database_event_manager.py

Change-Id: Ib28f22eb855413d88d928a26c4d502df9d975d2d
