need to limit the number of tor-agent crash files generated

Bug #1429781 reported by Vedamurthy Joshi
This bug affects 1 person
Affects            Status         Importance  Assigned to          Milestone
Juniper Openstack  Fix Committed  Medium      Sundaresan Rajangam
R2.20              Fix Committed  Medium      Sundaresan Rajangam

Bug Description

R2.1 Build 39 Ubuntu 14.04 Multi-node Icehouse setup

On one of the compute nodes, due to bug 1426513, tor-agents were crashing repeatedly.

No limit seems to be set on the number of such crash files generated.
At one point, I saw over 5,000 contrail-tor-agent crash files (5,450 in the listing below, using 222G in /var/crashes).

root@nodei38:/var/crashes# ls -ltr core.contrail-tor-ag.* | wc -l
5450
root@nodei38:/var/crashes# du -sh
222G .
root@nodei38:/var/crashes#

-rw------- 1 root root 152674304 Mar 9 15:44 core.contrail-tor-ag.6174.nodei38.1425896046
-rw------- 1 root root 153546752 Mar 9 15:44 core.contrail-tor-ag.6987.nodei38.1425896046
-rw------- 1 root root 153243648 Mar 9 15:44 core.contrail-tor-ag.7632.nodei38.1425896047
-rw------- 1 root root 161746944 Mar 9 15:44 core.contrail-tor-ag.7744.nodei38.1425896047
-rw------- 1 root root 153489408 Mar 9 15:44 core.contrail-tor-ag.8748.nodei38.1425896047
-rw------- 1 root root 161288192 Mar 9 15:44 core.contrail-tor-ag.9046.nodei38.1425896048
-rw------- 1 root root 152850432 Mar 9 15:44 core.contrail-tor-ag.9426.nodei38.1425896048
-rw------- 1 root root 153800704 Mar 9 15:44 core.contrail-tor-ag.12844.nodei38.1425896049
-rw------- 1 root root 154791936 Mar 9 15:44 core.contrail-tor-ag.12592.nodei38.1425896049
-rw------- 1 root root 153653248 Mar 9 15:44 core.contrail-tor-ag.13107.nodei38.1425896049
-rw------- 1 root root 154775552 Mar 9 15:44 core.contrail-tor-ag.13832.nodei38.1425896050
-rw------- 1 root root 154062848 Mar 9 15:44 core.contrail-tor-ag.14534.nodei38.1425896050
-rw------- 1 root root 162439168 Mar 9 15:44 core.contrail-tor-ag.15053.nodei38.1425896051
-rw------- 1 root root 161218560 Mar 9 15:44 core.contrail-tor-ag.15079.nodei38.1425896051
-rw------- 1 root root 154025984 Mar 9 15:44 core.contrail-tor-ag.16092.nodei38.1425896052
-rw------- 1 root root 153772032 Mar 9 15:44 core.contrail-tor-ag.17388.nodei38.1425896052
-rw------- 1 root root 153833472 Mar 9 15:44 core.contrail-tor-ag.17570.nodei38.1425896053
-rw------- 1 root root 153997312 Mar 9 15:44 core.contrail-tor-ag.17893.nodei38.1425896053
-rw------- 1 root root 162721792 Mar 9 15:44 core.contrail-tor-ag.19543.nodei38.1425896054
-rw------- 1 root root 161718272 Mar 9 15:44 core.contrail-tor-ag.19474.nodei38.1425896054

Tags: analytics
Sundaresan Rajangam (srajanga) wrote :

nodemgr limits the number of core files per process to 4. Due to a bug, the old core files are not being removed.
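
For illustration only, a per-process core file cap of this kind could look roughly like the sketch below. This is not the nodemgr code: the function names, the logger name, and the choice to delete the oldest files first are assumptions, and the file name layout (core.<name>.<pid>.<host>.<epoch>) is inferred from the listing in the bug description.

```python
import logging
import os

log = logging.getLogger("core_pruner")  # hypothetical logger name


def core_timestamp(filename):
    """Return the trailing epoch field of core.<name>.<pid>.<host>.<epoch>."""
    try:
        return int(filename.rsplit(".", 1)[1])
    except (IndexError, ValueError):
        return 0  # unparsable names sort first and are pruned first


def prune_core_files(directory, process_prefix, max_cores=4):
    """Keep at most max_cores newest core files for one process.

    Returns the list of file names that were removed (oldest first).
    """
    pattern = "core." + process_prefix + "."
    cores = sorted(
        (f for f in os.listdir(directory) if f.startswith(pattern)),
        key=lambda f: (core_timestamp(f), f),
    )
    # Everything except the newest max_cores entries is excess.
    excess = cores[:-max_cores] if max_cores > 0 else cores
    removed = []
    for name in excess:
        path = os.path.join(directory, name)
        try:
            os.remove(path)
            removed.append(name)
        except OSError as exc:
            # Log the failure instead of silently leaving the file behind.
            log.error("could not remove %s: %s", path, exc)
    return removed
```

With a cap like this in effect, a call such as prune_core_files("/var/crashes", "contrail-tor-ag") would leave at most four contrail-tor-agent core files in the directory.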

tags: added: analytics
removed: vrouter
Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Sundaresan Rajangam (srajanga)
Changed in juniperopenstack:
status: New → In Progress
OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10239
Submitter: Sundaresan Rajangam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10240
Submitter: Sundaresan Rajangam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10240
Committed: http://github.org/Juniper/contrail-controller/commit/883ba5714a61bbda104f23426234e858899435ac
Submitter: Zuul
Branch: R2.20

commit 883ba5714a61bbda104f23426234e858899435ac
Author: Sundaresan Rajangam <email address hidden>
Date: Tue May 12 06:41:38 2015 -0700

Fix issue with deletion of core file in nodemgr

When the number of cores for a process exceeds max_cores, then the
old core file is removed using os.remove(). Due to missing import
of os module, os.remove(core_file) fails and the exception is not
being handled/logged.

Change-Id: Ica434ef76f9d840a5f73e0fa7244de91ad7f8629
Closes-Bug: #1429781
(cherry picked from commit 7cead511508684cef1a124a08e6a17b0cb424e22)

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/10239
Committed: http://github.org/Juniper/contrail-controller/commit/7cead511508684cef1a124a08e6a17b0cb424e22
Submitter: Zuul
Branch: master

commit 7cead511508684cef1a124a08e6a17b0cb424e22
Author: Sundaresan Rajangam <email address hidden>
Date: Tue May 12 06:41:38 2015 -0700

Fix issue with deletion of core file in nodemgr

When the number of cores for a process exceeds max_cores, then the
old core file is removed using os.remove(). Due to missing import
of os module, os.remove(core_file) fails and the exception is not
being handled/logged.

Change-Id: Ica434ef76f9d840a5f73e0fa7244de91ad7f8629
Closes-Bug: #1429781
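
The failure mode described in the commit message can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the nodemgr source: it defines a function that calls os.remove() in a namespace where os was never imported, then calls it. Python raises NameError before any file is touched, so the core file stays on disk; if the caller neither handles nor logs that exception, nothing points at the cause.

```python
import logging

log = logging.getLogger("nodemgr_sketch")  # hypothetical logger name

# Hypothetical module source: it uses os.remove() but never imports os.
BROKEN_SOURCE = """
def remove_core(core_file):
    os.remove(core_file)
"""


def call_broken_remove(core_file):
    """Run the broken function and return the exception it raises, if any."""
    namespace = {}
    exec(BROKEN_SOURCE, namespace)  # defines remove_core without 'os' in scope
    try:
        namespace["remove_core"](core_file)
    except Exception as exc:
        # Logging the exception is what makes the failure visible.
        log.exception("failed to remove %s", core_file)
        return exc
    return None
```

Calling call_broken_remove() with any path returns a NameError for the undefined name os. Adding the missing import to the module source is the kind of change the commit message describes.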

Changed in juniperopenstack:
status: In Progress → Fix Committed