Power-cycling the control nodes leads to Cassandra corruption, and the service stops.
We are hitting the following open Cassandra bug:
https://issues.apache.org/jira/browse/CASSANDRA-10534
Megh had a look at the issue and gave the following workaround.
Workaround:
Modify the Cassandra disk failure policy in cassandra.yaml on the failed nodes, then start the service and scrub:
1) Set disk_failure_policy to best_effort
2) service contrail-database start
3) nodetool scrub
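The three workaround steps could be scripted roughly as follows. This is only a sketch under assumptions: the helper names set_disk_failure_policy and apply_workaround are made up for illustration, the path /etc/cassandra/cassandra.yaml is a guess at where the config lives on these nodes (override it with CASSANDRA_YAML), and sed -i assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the disk_failure_policy line in a
# cassandra.yaml file, keeping a backup copy next to it.
set_disk_failure_policy() {
    conf="$1"
    policy="$2"
    # Keep the original so the policy can be restored after recovery.
    cp -p "$conf" "$conf.bak"
    # Replace only the disk_failure_policy setting; other lines are untouched.
    sed -i "s/^disk_failure_policy:.*/disk_failure_policy: $policy/" "$conf"
}

# Hypothetical wrapper for the three steps; run as root on each failed node.
apply_workaround() {
    # ASSUMPTION: config path; adjust for the actual installation.
    conf="${CASSANDRA_YAML:-/etc/cassandra/cassandra.yaml}"
    set_disk_failure_policy "$conf" best_effort   # step 1
    service contrail-database start               # step 2
    nodetool scrub                                # step 3
}
```

Sourcing the file and calling apply_workaround performs the steps in order. The backup at cassandra.yaml.bak makes it easy to put the policy back to its previous value once the node is healthy again.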
Logs:
INFO [SSTableBatchOpen:1] 2016-02-25 11:48:00,194 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/schema_triggers-0359bc7171233ee19a4ab9dfb11fc125/system-sc
INFO [main] 2016-02-25 11:48:01,772 AutoSavingCache.java:146 - reading saved cache /var/lib/cassandra/saved_caches/system-schema_triggers-0359bc7171233ee19a4ab9dfb11fc125-K
INFO [main] 2016-02-25 11:48:02,068 ColumnFamilyStore.java:363 - Initializing system.compaction_history
INFO [SSTableBatchOpen:3] 2016-02-25 11:48:02,148 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system
bytes)
INFO [SSTableBatchOpen:2] 2016-02-25 11:48:02,155 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system
tes)
INFO [SSTableBatchOpen:1] 2016-02-25 11:48:02,210 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system
ytes)
ERROR [SSTableBatchOpen:2] 2016-02-25 11:48:02,332 FileUtils.java:447 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_85]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_85]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_85]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_85]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
Caused by: java.io.EOFException: null
at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_85]
at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_85]
at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_85]
at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
... 14 common frames omitted
A Cassandra upgrade is needed.
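To tell whether another node is hitting the same failure, its log can be searched for the two signature lines from the trace above. A minimal sketch, assuming the helper name has_corrupt_sstable_exit (hypothetical) and that the caller knows where the Cassandra system log is; the path in the usage line is an assumption, not taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given log file contains
# both the forced-exit message and the CorruptSSTableException seen above.
has_corrupt_sstable_exit() {
    log="$1"
    grep -q 'Exiting forcefully due to file system exception on startup' "$log" &&
        grep -q 'CorruptSSTableException' "$log"
}
```

Example use (log path is an assumption): has_corrupt_sstable_exit /var/log/cassandra/system.log && echo "node affected"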