contrail-logs command sometimes does not complete in time and process hogs memory
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R3.2 | Fix Committed | High | mkheni |
Trunk | Fix Committed | High | mkheni |
Bug Description
R3.2 Build 30
The mail thread below captures the issue details:
From: Megh Bhatt <email address hidden>
Date: Thursday, March 16, 2017 at 11:18 PM
To: Vedamurthy Ananth Joshi <email address hidden>, Sundaresan Rajangam <email address hidden>
Cc: Raj Reddy <email address hidden>, Nikhil Bansal <email address hidden>, Sudheendra Rao <email address hidden>
Subject: Re: contrail-logs hogging memory and stuck
Added Sundar,
It is not really stuck
write(1, "es = 1, dequeues = 1, enqueue_fa"..., 4096) = 4096
stat("/
write(1, "tes_sent = 0, messages_received "..., 4096) = 4096
stat("/
mmap(NULL, 262144, PROT_READ|
stat("/
mmap(NULL, 262144, PROT_READ|
munmap(
mmap(NULL, 262144, PROT_READ|
munmap(
mmap(NULL, 262144, PROT_READ|
mmap(NULL, 262144, PROT_READ|
write(1, "_dropped = 0, messages_
stat("/
mmap(NULL, 262144, PROT_READ|
mmap(NULL, 262144, PROT_READ|
mmap(NULL, 262144, PROT_READ|
munmap(
mmap(NULL, 262144, PROT_READ|
mmap(NULL, 262144, PROT_READ|
mmap(NULL, 262144, PROT_READ|
^CProcess 4094 detached
root@nodec7:~#
It looks like contrail-logs has taken around 13 GB and stays in D state for extended periods, which suggests uninterruptible sleep, most likely while waiting on disk I/O when writing.
4094 root 20 0 25.189g 0.013t 2804 R 0.0 43.1 20:00.13 contrail-logs
4094 root 20 0 25.189g 0.013t 2804 D 0.0 43.1 20:00.33 contrail-logs
root@nodec7:~# ps aux | grep contrail-logs
root 997 0.0 0.0 10460 944 pts/12 R+ 23:11 0:00 grep --color=auto contrail-logs
root 4091 0.0 0.0 63664 1044 pts/0 Ss+ Mar15 0:00 sudo -S -p sudo password: /bin/bash -l -c /usr/bin/
root 4092 0.0 0.0 11132 680 pts/0 S+ Mar15 0:00 /bin/bash -l -c /usr/bin/
root 4094 1.2 44.1 26679096 14543776 pts/0 Dl+ Mar15 20:10 /usr/bin/python /usr/bin/
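The process state and resident size shown in the top and ps output above can also be read from `/proc/<pid>/status` on Linux. The following is a minimal sketch, not part of contrail-logs; the helper names `parse_proc_status` and `read_proc_status` are hypothetical and exist only for illustration.

```python
def parse_proc_status(text):
    """Return (state_letter, rss_kib) parsed from the text of /proc/<pid>/status.

    state_letter is e.g. "R", "S" or "D"; rss_kib is the VmRSS value in kB,
    or None when the field is absent (kernel threads have no VmRSS line).
    """
    state, rss_kib = None, None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "State" and value:
            # The field looks like "D (disk sleep)"; keep only the letter.
            state = value.split()[0]
        elif key == "VmRSS" and value:
            # The field looks like "14543776 kB"; keep only the number.
            rss_kib = int(value.split()[0])
    return state, rss_kib


def read_proc_status(pid):
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_proc_status(f.read())
```

Polling `read_proc_status(4094)` every few seconds would show whether the process keeps returning to D while its RSS grows, which is the pattern described above.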
We had changed contrail-logs to read all of the results first and then print them all at once, which might cause the D state. I think we will need to revert that change to avoid such issues, and/or add some sort of sleep.
Also, contrail-logs has an option to specify an output file; using that instead of redirecting to a file might help too.
Thanks
Megh
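The difference between "read everything, then print all at once" and writing results as they arrive can be sketched as below. This is an illustration only, not the contrail-logs code: `format_row`, `print_all_at_once`, `print_streaming` and the `flush_every` parameter are hypothetical names, and the rows are assumed to be plain dictionaries.

```python
import io


def format_row(row):
    # Hypothetical formatter: render one result row as "key = value" pairs.
    return " ".join("%s = %s" % (k, v) for k, v in sorted(row.items()))


def print_all_at_once(rows, out):
    # Buffers every formatted row before writing anything, so memory use
    # grows with the size of the whole result set.
    lines = [format_row(row) for row in rows]
    out.write("\n".join(lines) + "\n")


def print_streaming(rows, out, flush_every=1000):
    # Writes each row as soon as it is available and flushes periodically,
    # so only one row at a time is held in memory.
    for count, row in enumerate(rows, 1):
        out.write(format_row(row) + "\n")
        if count % flush_every == 0:
            out.flush()
    out.flush()


if __name__ == "__main__":
    sample = [{"enqueues": 1, "dequeues": 1}, {"bytes_sent": 0}]
    buf = io.StringIO()
    print_streaming(iter(sample), buf)
    print(buf.getvalue(), end="")
```

Both functions produce the same output; the streaming variant keeps memory flat when `rows` is a generator over a large query result, at the cost of more frequent small writes.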
On Mar 16, 2017, at 3:37 AM, Vedamurthy Ananth Joshi <email address hidden> wrote:
Megh,
It happened again on R3.2, on the same setup, and Nikhil couldn’t find out much from strace.
Can you please take a look as soon as possible? The node is the same: nodec7.
Sudhee has temporarily blocked the setup for you
Vedu
From: Megh Bhatt <email address hidden>
Date: Friday, February 10, 2017 at 9:47 AM
To: Vedamurthy Ananth Joshi <email address hidden>
Cc: Raj Reddy <email address hidden>, Nikhil Bansal <email address hidden>
Subject: Re: contrail-logs hogging memory and stuck
Can we do a strace on the process to check what it is doing?
Thanks
Megh
On Feb 9, 2017, at 7:34 PM, Vedamurthy Ananth Joshi <email address hidden> wrote:
Hmm… possibly it got killed somehow. What can I check next time it happens?
Including Nikhil..
From: Megh Bhatt <email address hidden>
Date: Thursday, February 9, 2017 at 7:08 PM
To: Vedamurthy Ananth Joshi <email address hidden>
Cc: Raj Reddy <email address hidden>
Subject: Re: contrail-logs hogging memory and stuck
I logged on but could not see the process. Were you redirecting to a file?
Thanks
Megh
On Feb 9, 2017, at 6:09 PM, Vedamurthy Ananth Joshi <email address hidden> wrote:
Could you please take a look? We need the setup as soon as possible.
Vedu
Begin forwarded message:
From: Vedamurthy Ananth Joshi <email address hidden>
Date: 10 February 2017 at 1:37:09 AM IST
To: Contrail Systems Analytics Team <email address hidden>
Cc: cf-test <email address hidden>
Subject: contrail-logs hogging memory and stuck
This is on node nodec7.
It has used up ~14G RSS already. Can you please help figure out what is happening?
%MEM PID %CPU RSS VSZ TIME COMMAND
45.1 1498 1.5 14870128 24964224 00:19:30 /usr/bin/python /usr/bin/
Vedu
Review in progress for https://review.opencontrail.org/30114
Submitter: mkheni (<email address hidden>)