Cassandra exited with error 'Too many open files in system'

Bug #1582100 reported by Sandip Dey
This bug affects 2 people
Affects              Status                   Importance  Assigned to  Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed            High        Megh Bhatt
  R2.21.x            Fix Committed            High        Megh Bhatt
  R2.22.x            Fix Committed            High        Megh Bhatt
  R3.0               Fix Committed            High        Megh Bhatt
  Trunk              Fix Committed            High        Megh Bhatt

Bug Description

Logs saved at: http://10.204.216.50/Docs/bugs/<bug-id>

Build:3.02 35 kilo

The setup is described below.

The setup was running around 140 VMs with traffic.

Cassandra exited on all 3 database nodes with the error below:

org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files in system

Logs
====

WARN [Thread-13] 2016-05-14 21:41:20,331 CustomTThreadPoolServer.java:122 - Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files in system
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:108) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:36) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:60) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:110) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ThriftServer$ThriftServerThread.run(ThriftServer.java:137) [apache-cassandra-2.1.13.jar:2.1.13]
Caused by: java.net.SocketException: Too many open files in system
        at java.net.PlainSocketImpl.socketAccept(Native Method) ~[na:1.7.0_95]
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398) ~[na:1.7.0_95]
        at java.net.ServerSocket.implAccept(ServerSocket.java:530) ~[na:1.7.0_95]
        at java.net.ServerSocket.accept(ServerSocket.java:498) ~[na:1.7.0_95]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:102) ~[apache-cassandra-2.1.13.jar:2.1.13]
        ... 4 common frames omitted
root@nodei27:~# uptime
 12:31:58 up 5 days, 2:33, 1 user, load average: 3.08, 2.11, 1.87
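
The message "Too many open files in system" is the Linux ENFILE error, which refers to the system-wide file handle limit (fs.file-max) rather than a per-process ulimit. A minimal sketch for checking how close a node is to that limit, assuming a Linux host that exposes /proc/sys/fs/file-nr; the function names below are illustrative and not part of Contrail or Cassandra:

```python
from collections import namedtuple

# The three counters the kernel reports in /proc/sys/fs/file-nr:
# allocated handles, allocated-but-unused handles, system-wide maximum.
FileNr = namedtuple("FileNr", ["allocated", "unused", "maximum"])


def parse_file_nr(text):
    """Parse the whitespace-separated contents of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def usage_ratio(file_nr):
    """Return the fraction of the system-wide limit currently in use."""
    return (file_nr.allocated - file_nr.unused) / float(file_nr.maximum)


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_file_nr(handle.read())
```

Calling usage_ratio(read_file_nr()) on a database node gives a value between 0 and 1; a value approaching 1 would be consistent with the accept() failures in the trace above.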

setup
======
host1 = 'root@10.204.217.139'
host2 = 'root@10.204.217.140'
host3 = 'root@10.204.217.147'
host4 = 'root@10.204.217.144'
host5 = 'root@10.204.217.147'
host6 = 'root@10.204.217.148'
host7 = 'root@10.204.217.149'
host8 = 'root@10.204.217.150'
host9 = 'root@10.204.217.210'
host10 = 'root@10.204.217.217'
host11 = 'root@10.204.217.218'
host12 = 'root@10.204.217.220'
host13 = 'root@10.204.217.247'
host14 = 'root@10.204.217.248'
host15 = 'root@10.204.217.249'
host16 = 'root@10.204.217.118'
host17 = 'root@10.204.217.119'
host18 = 'root@10.204.217.120'
host19 = 'root@10.204.217.121'
host20 = 'root@10.204.217.122'
host21 = 'root@10.204.217.123'
host22 = 'root@10.204.217.124'
host23 = 'root@10.204.217.131'

ext_routers = [('blr-mx2', '10.204.216.245')]
router_asn = 64512
public_vn_rtgt = 30001
#public_vn_subnet = "10.204.219.72/29"

host_build = 'vjoshi@10.204.216.56'

env.roledefs = {
    'all': [host1, host2, host3, host4, host5, host6, host7, host8, host9, host10, host11, host12, host13, host14, host15, host16, host17, host18, host19, host20, host21, host22, host23],
    'cfgm': [host1, host2, host3],
    'openstack': [host4, host5, host6],
    'webui': [host1, host2, host3],
    'control': [host1, host2, host3],
    'compute': [host7, host8, host9, host10, host11, host12, host13, host14, host15, host16, host17, host18, host19, host20, host21, host22, host23],
    'collector': [host1, host2, host3],
    'database': [host1, host2, host3],
    'build': [host_build],
}

env.hostnames = {
    'all': ['nodei27', 'nodei28', 'nodei35', 'nodei32', 'nodei35', 'nodei36', 'nodei37', 'nodei38', 'nodel4', 'nodel7', 'nodel8', 'nodel9', 'nodel10', 'nodel11', 'nodel12', 'nodei6', 'nodei7', 'nodei8', 'nodei9', 'nodei10', 'nodei11', 'nodei12', 'nodei19']
}

Raj Reddy (rajreddy)
Changed in juniperopenstack:
assignee: Raj Reddy (rajreddy) → Megh Bhatt (meghb)
importance: Undecided → High
tags: added: blocker
Megh Bhatt (meghb)
Changed in juniperopenstack:
milestone: none → r3.1.0.0-fcs
no longer affects: juniperopenstack/r2.0
tags: added: releasenote
information type: Proprietary → Public
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20495
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20495
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/12c4cd17e3a77feb82af67bcee976ebd86c09108
Submitter: Zuul
Branch: R3.0

commit 12c4cd17e3a77feb82af67bcee976ebd86c09108
Author: Megh Bhatt <email address hidden>
Date: Fri May 20 23:10:33 2016 -0700

Increase system wide fd limit for controller to 165k

Change-Id: I474d1956ea821ebd4420cb0962aa6f24fb586300
Closes-Bug: #1582100
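
The commit contents are not reproduced in this report. As a rough sketch of what raising the system-wide limit involves, the helper below sets a key in sysctl.conf-style text. It assumes the limit in question is the Linux fs.file-max sysctl and reads "165k" as 165000; both the helper name and that exact value are assumptions, not taken from the commit:

```python
def set_sysctl_value(conf_text, key, value):
    """Return conf_text with `key = value` set.

    An existing uncommented entry for `key` is replaced in place;
    otherwise a new entry is appended at the end.
    """
    new_line = "%s = %s" % (key, value)
    lines = conf_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and anything that is not a key = value pair.
        if stripped.startswith(("#", ";")) or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[index] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical usage: 165000 is an assumed reading of "165k".
updated = set_sysctl_value("fs.file-max = 65536\n", "fs.file-max", 165000)
```

After writing the updated text to /etc/sysctl.conf, running `sysctl -p` loads it without a reboot.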

Raj Reddy (rajreddy) wrote :

Megh, please commit the same change in puppet too.

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20539
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20540
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20541
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20542
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22-dev

Review in progress for https://review.opencontrail.org/20543
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20544
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20545
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20547
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/20549
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/20550
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20539
Committed: http://github.org/Juniper/contrail-puppet/commit/19029f4c56433b3c1f45367f6a658aabd6d56a8e
Submitter: Zuul
Branch: R3.0

commit 19029f4c56433b3c1f45367f6a658aabd6d56a8e
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 15:13:22 2016 -0700

Increase system fd limit to 165K for database node

Change-Id: I095470ee0746a48f4867076712fbce895dcf5165
Closes-Bug: #1582100

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20545
Committed: http://github.org/Juniper/contrail-puppet/commit/2b33a94b42a0037076be56d965f7b14f61e90175
Submitter: Zuul
Branch: R2.20

commit 2b33a94b42a0037076be56d965f7b14f61e90175
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 16:15:05 2016 -0700

Increase system fd limit to 165K for database node

Closes-Bug: #1582100

Conflicts:
 contrail/environment/modules/contrail/manifests/database.pp

Change-Id: If7560b46f662b9ef5f2deae2f3196025813664a1

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20542
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/17e47f2fcbec767b5fdbc428a0b1f073dd1cf619
Submitter: Zuul
Branch: R2.20

commit 17e47f2fcbec767b5fdbc428a0b1f073dd1cf619
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 15:25:29 2016 -0700

Increase system wide fd limit for controller to 165k

Change-Id: I8f1e837ddfdb5f68e9f4658c9309ffc816562696
Closes-Bug: #1582100

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20549
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/4aa6010f490e0d9774ef65a894e65ef4ca7f9b73
Submitter: Zuul
Branch: R2.21.x

commit 4aa6010f490e0d9774ef65a894e65ef4ca7f9b73
Author: Megh Bhatt <email address hidden>
Date: Fri May 20 23:10:33 2016 -0700

Increase system wide fd limit for controller to 165k

Change-Id: I474d1956ea821ebd4420cb0962aa6f24fb586300
Closes-Bug: #1582100
(cherry picked from commit 12c4cd17e3a77feb82af67bcee976ebd86c09108)

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20550
Committed: http://github.org/Juniper/contrail-puppet/commit/60e8a27ad07072683b47442d3ae65254f0377c4d
Submitter: Zuul
Branch: R2.21.x

commit 60e8a27ad07072683b47442d3ae65254f0377c4d
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 16:15:05 2016 -0700

Increase system fd limit to 165K for database node

Closes-Bug: #1582100

Conflicts:
 contrail/environment/modules/contrail/manifests/database.pp

Change-Id: If7560b46f662b9ef5f2deae2f3196025813664a1

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20544
Committed: http://github.org/Juniper/contrail-puppet/commit/50ddd46acf458455b71210e75398f78d78180c81
Submitter: Zuul
Branch: R2.22.x

commit 50ddd46acf458455b71210e75398f78d78180c81
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 16:15:05 2016 -0700

Increase system fd limit to 165K for database node

Change-Id: If7560b46f662b9ef5f2deae2f3196025813664a1
Closes-Bug: #1582100

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20547
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/bdf91b85c3df5b74c679193898313858e000a959
Submitter: Zuul
Branch: R2.22.x

commit bdf91b85c3df5b74c679193898313858e000a959
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 15:25:29 2016 -0700

Increase system wide fd limit for controller to 165k

Change-Id: I8f1e837ddfdb5f68e9f4658c9309ffc816562696
Closes-Bug: #1582100

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20541
Committed: http://github.org/Juniper/contrail-puppet/commit/04bd15f562e91928567429989236fb9559e68d92
Submitter: Zuul
Branch: master

commit 04bd15f562e91928567429989236fb9559e68d92
Author: Megh Bhatt <email address hidden>
Date: Mon May 23 15:13:22 2016 -0700

Increase system fd limit to 165K for database node

Change-Id: I095470ee0746a48f4867076712fbce895dcf5165
Closes-Bug: #1582100

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20540
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/06344720dfe62ee520a65a351d764984537c6b8c
Submitter: Zuul
Branch: master

commit 06344720dfe62ee520a65a351d764984537c6b8c
Author: Megh Bhatt <email address hidden>
Date: Fri May 20 23:10:33 2016 -0700

Increase system wide fd limit for controller to 165k

Change-Id: I474d1956ea821ebd4420cb0962aa6f24fb586300
Closes-Bug: #1582100
(cherry picked from commit 12c4cd17e3a77feb82af67bcee976ebd86c09108)
