Contrail 2.21-14: Exceptionally high vrouter agent CPU utilization and memory utilization

Bug #1586687 reported by Deepak Jeyaraman
This bug affects 2 people
Affects             Status      Importance  Assigned to  Milestone
Juniper Openstack   Incomplete  Undecided   Unassigned
R2.21.x             Incomplete  Undecided   Unassigned

Bug Description

Observing very high CPU and memory utilization by the vrouter agent with Contrail 2.21-14 on one of the compute nodes.

As a result, the instances on this compute (vSRX D40) are stuck at the db prompt or stalled (see attached screenshot).

root@ccra-23:~# top

top - 12:55:52 up 37 days, 17:53, 1 user, load average: 10.98, 11.05, 12.00
Tasks: 549 total, 3 running, 546 sleeping, 0 stopped, 0 zombie
%Cpu(s): 20.0 us, 7.7 sy, 0.0 ni, 72.1 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 26391625+total, 26318873+used, 727536 free, 4304 buffers
KiB Swap: 26832486+total, 43453732 used, 22487113+free. 56792 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 3251 root 20 0 0.273t 0.240t 39608 S 1107 97.6 28928:49 contrail-vroute <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
 7432 libvirt+ 20 0 8599748 1.389g 2092 S 100.1 0.6 742:38.36 qemu-system-x86
20507 libvirt+ 20 0 9080760 1.283g 11256 S 100.1 0.5 5:01.77 qemu-system-x86
29298 libvirt+ 20 0 8139464 846684 2320 S 7.0 0.3 327:26.59 qemu-system-x86
32272 libvirt+ 20 0 8459744 276488 6064 S 5.6 0.1 20:01.83 qemu-system-x86
 1042 root 39 19 0 0 0 S 2.3 0.0 105:35.26 kipmi0
46057 libvirt+ 20 0 8523708 925612 2032 S 1.7 0.4 1100:30 qemu-system-x86
 4847 libvirt+ 20 0 8931308 801564 2056 S 1.3 0.3 448:34.52 qemu-system-x86
29134 root 20 0 0 0 0 S 0.7 0.0 0:00.15 kworker/41:2
47032 root 20 0 23944 2060 1168 R 0.7 0.0 0:00.04 top
  120 root 20 0 0 0 0 S 0.3 0.0 3:54.53 ksoftirqd/3
 3249 root 20 0 233888 11724 2816 S 0.3 0.0 57:29.45 python
17047 root 20 0 0 0 0 S 0.3 0.0 4:48.36 kworker/38:0
22476 root 20 0 0 0 0 S 0.3 0.0 4:53.17 kworker/24:2
22878 root 20 0 0 0 0 S 0.3 0.0 0:00.16 kworker/23:0
32410 root 20 0 0 0 0 S 0.3 0.0 0:00.10 kworker/3:2
32539 nova 20 0 1762508 37512 4940 S 0.3 0.0 650:03.61 nova-compute
47161 root 20 0 0 0 0 S 0.3 0.0 0:01.07 kworker/9:1
    1 root 20 0 34464 2988 1284 S 0.0 0.0 10:55.67 init
    2 root 20 0 0 0 0 S 0.0 0.0 0:00.94 kthreadd
    3 root 20 0 0 0 0 S 0.0 0.0 1:50.99 ksoftirqd/0
    5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
    6 root 20 0 0 0 0 S 0.0 0.0 0:46.44 kworker/u96:0
    8 root 20 0 0 0 0 S 0.0 0.0 88:52.76 rcu_sched
    9 root 20 0 0 0 0 S 0.0 0.0 3:00.40 rcuos/0
   10 root 20 0 0 0 0 S 0.0 0.0 3:09.54 rcuos/1
   11 root 20 0 0 0 0 S 0.0 0.0 7:30.73 rcuos/2
   12 root 20 0 0 0 0 S 0.0 0.0 8:32.30 rcuos/3
   13 root 20 0 0 0 0 S 0.0 0.0 5:53.29 rcuos/4
   14 root 20 0 0 0 0 S 0.0 0.0 5:09.10 rcuos/5
   15 root 20 0 0 0 0 S 0.0 0.0 4:44.26 rcuos/6
   16 root 20 0 0 0 0 S 0.0 0.0 4:45.59 rcuos/7
   17 root 20 0 0 0 0 S 0.0 0.0 4:08.10 rcuos/8
   18 root 20 0 0 0 0 S 0.0 0.0 3:53.82 rcuos/9
   19 root 20 0 0 0 0 S 0.0 0.0 3:48.46 rcuos/10
   20 root 20 0 0 0 0 S 0.0 0.0 3:40.64 rcuos/11
   21 root 20 0 0 0 0 S 0.0 0.0 4:54.87 rcuos/12
   22 root 20 0 0 0 0 S 0.0 0.0 5:33.76 rcuos/13
   23 root 20 0 0 0 0 S 0.0 0.0 11:48.40 rcuos/14
   24 root 20 0 0 0 0 S 0.0 0.0 13:59.69 rcuos/15
   25 root 20 0 0 0 0 R 0.0 0.0 9:51.91 rcuos/16

root@ccra-23:/var/log/contrail# free -mh
             total used free shared buffers cached
Mem: 251G 250G 742M 308K 6.9M 82M
-/+ buffers/cache: 250G 831M
Swap: 255G 41G 214G
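
The top output above mixes units in the VIRT/RES columns: plain numbers, and values with g or t suffixes. A minimal Python sketch for putting them on one scale follows. It assumes top's usual convention that unsuffixed values are KiB and that m/g/t mean MiB/GiB/TiB. The helper names are made up for illustration and are not part of any Contrail tooling.

```python
# Multipliers relative to KiB. Assumption: top's default scaling suffixes.
_SUFFIX_TO_KIB = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def top_size_to_kib(field: str) -> float:
    """Convert a top memory field such as '39608', '1.389g' or '0.240t' to KiB."""
    field = field.strip().lower()
    if field and field[-1] in _SUFFIX_TO_KIB:
        return float(field[:-1]) * _SUFFIX_TO_KIB[field[-1]]
    return float(field)  # no suffix: value is already in KiB


def kib_to_gib(kib: float) -> float:
    """Convert KiB to GiB."""
    return kib / 1024 ** 2


if __name__ == "__main__":
    # RES values copied from the top output in this report.
    samples = [
        ("contrail-vroute", "0.240t"),
        ("qemu-system-x86", "1.389g"),
        ("nova-compute", "37512"),
    ]
    for name, res in samples:
        gib = kib_to_gib(top_size_to_kib(res))
        print(f"{name:16s} RES = {gib:8.2f} GiB")
```

Under these assumptions the agent's RES of 0.240t works out to roughly 246 GiB, which is in line with the 250G "used" figure that free reports on the same host.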

root@ccra-23:~# contrail-version
Package Version Build-ID | Repo | Package Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-fabric-utils 2.21.1-14 14
contrail-install-packages 2.21.1-14~icehouse 14
contrail-lib 2.21.1-14 14
contrail-nodemgr 2.21.1-14 14

Logs from contrail-vrouter

Openstack Nova Compute status unchanged at:PROCESS_STATE_RUNNING
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Sending UVE:NodeStatusUVE(_context='', _scope='', _category='', _send_queue_enabled=True, _seqnum=0, _versionsig=2778367443, _source='ccra-23', _instance_id='0', _client=None, _type=6, _hints=1, _http_server=None, _logger=None, _more=False, _node_type='Compute', data=NodeStatus(status=None, name='ccra-23', deleted=None, disk_usage_info=[DiskPartitionUsageStats(partition_space_available_1k=4705116040, partition_space_used_1k=10157884, partition_name='/dev/mapper/ccra--23--vg-root', partition_type='ext4'), DiskPartitionUsageStats(partition_space_available_1k=162173, partition_space_used_1k=66358, partition_name='/dev/sda1', partition_type='ext2')], process_status=None, all_core_file_list=None, _table='ObjectVRouter', process_info=None, description=None), _module='contrail-vrouter-nodemgr', _level=2147483647, _timestamp=1464465481835722, _client_context='', _connect_to_collector=True, _role=0)Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
wokeup and found a line
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Warning novarc not sourced
Openstack Nova Compute status unchanged at:PROCESS_STATE_RUNNING
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Sending UVE:NodeStatusUVE(_context='', _scope='', _category='', _send_queue_enabled=True, _seqnum=0, _versionsig=2778367443, _source='ccra-23', _instance_id='0', _client=None, _type=6, _hints=1, _http_server=None, _logger=None, _more=False, _node_type='Compute', data=NodeStatus(status=None, name='ccra-23', deleted=None, disk_usage_info=[DiskPartitionUsageStats(partition_space_available_1k=4705115780, partition_space_used_1k=10158144, partition_name='/dev/mapper/ccra--23--vg-root', partition_type='ext4'), DiskPartitionUsageStats(partition_space_available_1k=162173, partition_space_used_1k=66358, partition_name='/dev/sda1', partition_type='ext2')], process_status=None, all_core_file_list=None, _table='ObjectVRouter', process_info=None, description=None), _module='contrail-vrouter-nodemgr', _level=2147483647, _timestamp=1464465542433660, _client_context='', _connect_to_collector=True, _role=0)Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
wokeup and found a line
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Warning novarc not sourced
Openstack Nova Compute status unchanged at:PROCESS_STATE_RUNNING
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Sending UVE:NodeStatusUVE(_context='', _scope='', _category='', _send_queue_enabled=True, _seqnum=0, _versionsig=2778367443, _source='ccra-23', _instance_id='0', _client=None, _type=6, _hints=1, _http_server=None, _logger=None, _more=False, _node_type='Compute', data=NodeStatus(status=None, name='ccra-23', deleted=None, disk_usage_info=[DiskPartitionUsageStats(partition_space_available_1k=4705115524, partition_space_used_1k=10158400, partition_name='/dev/mapper/ccra--23--vg-root', partition_type='ext4'), DiskPartitionUsageStats(partition_space_available_1k=162173, partition_space_used_1k=66358, partition_name='/dev/sda1', partition_type='ext2')], process_status=None, all_core_file_list=None, _table='ObjectVRouter', process_info=None, description=None), _module='contrail-vrouter-nodemgr', _level=2147483647, _timestamp=1464465603004002, _client_context='', _connect_to_collector=True, _role=0)Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
wokeup and found a line
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Warning novarc not sourced
Openstack Nova Compute status unchanged at:PROCESS_STATE_RUNNING
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Sending UVE:NodeStatusUVE(_context='', _scope='', _category='', _send_queue_enabled=True, _seqnum=0, _versionsig=2778367443, _source='ccra-23', _instance_id='0', _client=None, _type=6, _hints=1, _http_server=None, _logger=None, _more=False, _node_type='Compute', data=NodeStatus(status=None, name='ccra-23', deleted=None, disk_usage_info=[DiskPartitionUsageStats(partition_space_available_1k=4705115256, partition_space_used_1k=10158668, partition_name='/dev/mapper/ccra--23--vg-root', partition_type='ext4'), DiskPartitionUsageStats(partition_space_available_1k=162173, partition_space_used_1k=66358, partition_name='/dev/sda1', partition_type='ext2')], process_status=None, all_core_file_list=None, _table='ObjectVRouter', process_info=None, description=None), _module='contrail-vrouter-nodemgr', _level=2147483647, _timestamp=1464465661654802, _client_context='', _connect_to_collector=True, _role=0)Exception AssertionError: AssertionError() in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
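
Note that the `_module` field in these lines is `contrail-vrouter-nodemgr`, so this excerpt appears to come from the node manager rather than from the agent process itself. To place the excerpt in time, the `_timestamp` fields can be decoded. The Python sketch below assumes they are microseconds since the Unix epoch; the log does not say so, but the values then decode to late May 2016. Function names are illustrative only.

```python
import re
from datetime import datetime, timedelta, timezone

# _timestamp values copied from the four NodeStatusUVE lines in this report.
UVE_TIMESTAMPS_US = [
    1464465481835722,
    1464465542433660,
    1464465603004002,
    1464465661654802,
]

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def extract_timestamps(log_text: str) -> list:
    """Pull every _timestamp=<digits> value out of a chunk of log text."""
    return [int(m) for m in re.findall(r"_timestamp=(\d+)", log_text)]


def to_utc(ts_us: int) -> datetime:
    """Assumption: the value is microseconds since 1970-01-01 UTC."""
    return EPOCH + timedelta(microseconds=ts_us)


def intervals_seconds(timestamps_us: list) -> list:
    """Seconds between consecutive timestamps."""
    return [(b - a) / 1e6 for a, b in zip(timestamps_us, timestamps_us[1:])]


if __name__ == "__main__":
    for ts in UVE_TIMESTAMPS_US:
        print(to_utc(ts).isoformat())
    print([round(s, 3) for s in intervals_seconds(UVE_TIMESTAMPS_US)])
```

Read this way, the UVEs are sent about once a minute, and the excerpt covers only around three minutes of runtime.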

===============

On another node that has been running for 80 days, the flow table is empty, yet the vrouter agent is still using a staggering 250 GB of memory (98%).

root@ccra-03:~# ps aux --sort -rss | head -5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3314 32.0 98.1 276662488 259146904 ? Sl Mar09 37006:26 /usr/bin/contrail-vrouter-agent
nova 3243 1.3 0.0 1763512 55868 ? Ssl Mar09 1540:16 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
root 44981 0.2 0.0 125076 43776 ? Ssl May23 15:56 /usr/bin/ruby /usr/bin/puppet agent
syslog 1642 0.0 0.0 256608 32052 ? Ssl Mar09 17:34 rsyslogd
root@ccra-03:~# free -mh
             total used free shared buffers cached
Mem: 251G 250G 1.0G 240K 29M 87M
-/+ buffers/cache: 250G 1.2G
Swap: 255G 14G 241G
root@ccra-03:~# flow -l
Flow table(size 34078720, entries 532480)

Entries: Created 1437849 Added 1203688 Processed 1437849
(Created Flows/CPU: 843953 7924 7788 7478 9330 9167 9493 8858 8756 8863 6834 11346 2439 2530 2698 697 2242 1613 2090 1316 2611 2399 2514 1839 1657 4514 3098 3004 2623 3807 2730 3737 2950 4858 426234 4031 1202 759 974 327 1099 698 598 904 909 855 687 816)(oflows 0)

Action:F=Forward, D=Drop N=NAT(S=SNAT, D=DNAT, Ps=SPAT, Pd=DPAT, L=Link Local Port)
 Other:K(nh)=Key_Nexthop, S(nh)=RPF_Nexthop

 Index Source:Port Destination:Port Proto(V)
-------------------------------------------------------------------------
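
To put the `flow -l` header in perspective, here is a small Python sketch that parses the header and the per-CPU "Created" counters pasted above. The output does not state the unit of `size`; treating it as bytes is an assumption. Taking each counter's position in the list as a CPU index is also an assumption. The function names are hypothetical.

```python
import re

# Copied from the `flow -l` output on ccra-03 in this report.
FLOW_OUTPUT = """\
Flow table(size 34078720, entries 532480)

Entries: Created 1437849 Added 1203688 Processed 1437849
(Created Flows/CPU: 843953 7924 7788 7478 9330 9167 9493 8858 8756 8863 6834 11346 2439 2530 2698 697 2242 1613 2090 1316 2611 2399 2514 1839 1657 4514 3098 3004 2623 3807 2730 3737 2950 4858 426234 4031 1202 759 974 327 1099 698 598 904 909 855 687 816)(oflows 0)
"""


def parse_flow_header(text: str) -> tuple:
    """Return (size, entries) from the 'Flow table(size N, entries M)' line."""
    m = re.search(r"Flow table\(size (\d+), entries (\d+)\)", text)
    if m is None:
        raise ValueError("flow table header not found")
    return int(m.group(1)), int(m.group(2))


def parse_created_per_cpu(text: str) -> list:
    """Return the per-CPU 'Created Flows' counters as a list of ints."""
    m = re.search(r"Created Flows/CPU:([\d ]+)\)", text)
    if m is None:
        raise ValueError("per-CPU counters not found")
    return [int(v) for v in m.group(1).split()]


if __name__ == "__main__":
    size, entries = parse_flow_header(FLOW_OUTPUT)
    per_cpu = parse_created_per_cpu(FLOW_OUTPUT)

    # Assumption: `size` is in bytes.
    print(f"table size: {size / 1024 ** 2:.1f} MiB, {size // entries} per entry")
    print(f"counters: {len(per_cpu)}, total created: {sum(per_cpu)}")

    # The two busiest positions in the counter list.
    top_two = sorted(enumerate(per_cpu), key=lambda p: p[1], reverse=True)[:2]
    share = sum(count for _, count in top_two) / sum(per_cpu)
    print(f"top two positions {[i for i, _ in top_two]} hold {share:.1%} of created flows")
```

If `size` is indeed bytes, the whole flow table is only about 32.5 MiB, which is negligible next to the agent's RSS of 259146904 KiB (about 247 GiB) in the ps output. The per-CPU counters add up to the reported Created total, and two positions in the list account for most of the flows ever created.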

Tags: vrouter
Deepak Jeyaraman (jdeepak) wrote :
information type: Proprietary → Public
Deepak Jeyaraman (jdeepak) wrote :
description: updated
tags: added: vrouter
Changed in juniperopenstack:
status: New → Incomplete