Cassandra does not come up/goes down with "Too many open files" error

Bug #1582829 reported by Megh Bhatt
Affects            | Status | Importance | Assigned to | Milestone
Juniper Openstack  | New    | High       | Unassigned  |
R3.0               | New    | High       | Unassigned  |

Bug Description

On scaled setups with a compaction backlog, Cassandra fails to come up or goes down, with a backtrace reporting "Too many open files in system". The cassandra user's file descriptor limit is set to 100000, but the system-wide limit (fs.file-max) is only 65535.

sysctl -a | grep file-max
fs.file-max = 65535
root@nodei28:/var/log/cassandra#

On database nodes, we need to raise the system-wide file descriptor limit above 100K (say, to 200K) as part of provisioning and SM.
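
The check and the proposed setting could be sketched roughly as follows. This is only an illustration, not the actual provisioning change: the function names and the REQUIRED_FILE_MAX variable are hypothetical, and 200000 simply follows the "say, to 200K" suggestion above.

```shell
#!/bin/sh
# Sketch: compare the system-wide fd limit with the value we want on database nodes.
# REQUIRED_FILE_MAX is an assumed value taken from the 200K suggestion in this report.
REQUIRED_FILE_MAX=200000

# Succeeds (exit 0) when the current limit ($1) is below the required limit ($2).
file_max_too_low() {
    [ "$1" -lt "$2" ]
}

# Prints the sysctl.conf line that would persist the given limit.
file_max_conf_line() {
    printf 'fs.file-max = %s\n' "$1"
}

# Read the current limit; fall back to 0 if /proc is not available.
current=$(cat /proc/sys/fs/file-max 2>/dev/null || echo 0)

if file_max_too_low "$current" "$REQUIRED_FILE_MAX"; then
    echo "fs.file-max is $current; suggested setting:"
    file_max_conf_line "$REQUIRED_FILE_MAX"
    # To apply at runtime (as root):  sysctl -w fs.file-max=200000
    # To persist: add the line above to /etc/sysctl.conf, then run: sysctl -p
fi
```

The script only reports; the commands that change the limit are left as comments because they require root.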

INFO [main] 2016-05-17 12:22:32,392 CassandraDaemon.java:643 - No gossip backlog; proceeding
INFO [main] 2016-05-17 12:22:32,487 Server.java:155 - Netty using native Epoll event loop
INFO [main] 2016-05-17 12:22:32,534 Server.java:193 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c]
INFO [main] 2016-05-17 12:22:32,534 Server.java:194 - Starting listening for CQL clients on /192.168.1.3:9042...
INFO [main] 2016-05-17 12:22:32,610 ThriftServer.java:119 - Binding thrift service to /192.168.1.3:9160
INFO [Thread-12] 2016-05-17 12:22:32,618 ThriftServer.java:136 - Listening for thrift clients...
WARN [Thread-12] 2016-05-17 12:23:43,695 CustomTThreadPoolServer.java:122 - Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files in system
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:108) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:36) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:60) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:110) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ThriftServer$ThriftServerThread.run(ThriftServer.java:137) [apache-cassandra-2.1.13.jar:2.1.13]
Caused by: java.net.SocketException: Too many open files in system
        at java.net.PlainSocketImpl.socketAccept(Native Method) ~[na:1.7.0_95]
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398) ~[na:1.7.0_95]
        at java.net.ServerSocket.implAccept(ServerSocket.java:530) ~[na:1.7.0_95]
        at java.net.ServerSocket.accept(ServerSocket.java:498) ~[na:1.7.0_95]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:102) ~[apache-cassandra-2.1.13.jar:2.1.13]
        ... 4 common frames omitted
WARN [Thread-12] 2016-05-17 12:23:43,696 CustomTThreadPoolServer.java:122 - Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files in system
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:108) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:36) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:60) ~[libthrift-0.9.2.jar:0.9.2]
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:110) ~[apache-cassandra-2.1.13.jar:2.1.13]
        at org.apache.cassandra.thrift.ThriftServer$ThriftServerThread.run(ThriftServer.java:137) [apache-cassandra-2.1.13.jar:2.1.13]
Caused by: java.net.SocketException: Too many open files in system
        at java.net.PlainSocketImpl.socketAccept(Native Method) ~[na:1.7.0_95]
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398) ~[na:1.7.0_95]
        at java.net.ServerSocket.implAccept(ServerSocket.java:530) ~[na:1.7.0_95]
        at java.net.ServerSocket.accept(ServerSocket.java:498) ~[na:1.7.0_95]
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:102) ~[apache-cassandra-2.1.13.jar:2.1.13]
        ... 4 common frames omitted
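
The "in system" suffix in the message above corresponds to ENFILE (the kernel-wide file table is full), as opposed to the per-process EMFILE error, which reads just "Too many open files". A rough way to see how close a node is to the system-wide limit is sketched below; the function name is hypothetical, and the script assumes a Linux /proc filesystem.

```shell
#!/bin/sh
# Sketch: report how much of the system-wide file handle limit is in use.

# Prints the percentage of the limit in use (integer division).
# $1 = allocated handles, $2 = allocated but unused handles, $3 = maximum
file_handle_usage_pct() {
    echo $(( ($1 - $2) * 100 / $3 ))
}

# /proc/sys/fs/file-nr holds three numbers: allocated, unused, maximum.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused maximum < /proc/sys/fs/file-nr
    echo "file handles: $allocated allocated, limit $maximum"
    echo "usage: $(file_handle_usage_pct "$allocated" "$unused" "$maximum")%"
fi
```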

Tags: analytics
Megh Bhatt (meghb)
Changed in juniperopenstack:
milestone: none → r3.1.0.0-fcs
Jeba Paulaiyan (jebap)
information type: Proprietary → Public