Comment 3 for bug 1768360

mkheni (mkheni) wrote :

contrail-query-engine.log contains many occurrences of the errors below:

2018-05-08 Tue 04:19:02:013.662 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139828687521536, Pid 23133]: Db_GetMultiRow:controller/src/database/cassandra/cql/cql_if.cc:2019: SELECT FROM Table: MessageTable Partition Key: [27bd89fe-ff11-401a-beb3-26ab144622f4] FAILED
2018-05-08 Tue 04:19:02:017.881 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139829287302912, Pid 23133]: ExecuteQuerySyncInternal:controller/src/database/cassandra/cql/cql_if.cc:899: SyncQuery: FAILED: Operation timed out - received only 0 responses.
2018-05-08 Tue 04:19:02:017.928 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139829287302912, Pid 23133]: Db_GetMultiRow:controller/src/database/cassandra/cql/cql_if.cc:2019: SELECT FROM Table: MessageTable Partition Key: [1f4568ed-52a3-4626-b459-8f96cc766d31] FAILED
2018-05-08 Tue 04:19:02:019.111 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830897915648, Pid 23133]: ExecuteQuerySyncInternal:controller/src/database/cassandra/cql/cql_if.cc:899: SyncQuery: FAILED: Operation timed out - received only 0 responses.
2018-05-08 Tue 04:19:02:019.139 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830897915648, Pid 23133]: Db_GetMultiRow:controller/src/database/cassandra/cql/cql_if.cc:2019: SELECT FROM Table: MessageTable Partition Key: [814b3ac3-f487-4af8-bc1e-f17a8f6a7d3e] FAILED
2018-05-08 Tue 04:19:02:020.258 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830876923648, Pid 23133]: ExecuteQuerySyncInternal:controller/src/database/cassandra/cql/cql_if.cc:899: SyncQuery: FAILED: Operation timed out - received only 0 responses.
2018-05-08 Tue 04:19:02:020.299 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830876923648, Pid 23133]: Db_GetMultiRow:controller/src/database/cassandra/cql/cql_if.cc:2019: SELECT FROM Table: MessageTable Partition Key: [fdaa1479-2f22-43b8-ac2d-44391a90387d] FAILED
2018-05-08 Tue 04:19:02:028.785 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830776293120, Pid 23133]: ExecuteQuerySyncInternal:controller/src/database/cassandra/cql/cql_if.cc:899: SyncQuery: FAILED: Operation timed out - received only 0 responses.
2018-05-08 Tue 04:19:02:028.883 UTC zalp1bcnal03.alp1b.cci.att.com [Thread 139830776293120, Pid 23133]: Db_GetMultiRow:controller/src/database/cassandra/cql/cql_if.cc:2019: SELECT FROM Table: MessageTable Partition Key: [db1c7608-5a01-4e99-b86d-47ad7eea2989] FAILED

The "Operation timed out" error means that either Cassandra was under very heavy load or there were networking issues.
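
To see how widespread the timeouts are, and whether they are confined to one table, the log can be summarized with a short script. The following is only a rough sketch: the regular expressions are inferred from the lines pasted above, and the `summarize` helper is a hypothetical name, not part of Contrail.

```python
import re
from collections import Counter

# Line layout assumed from the excerpts above:
# "<date> <weekday> <HH:MM:SS>:<msec>.<usec> UTC <host> [Thread N, Pid N]: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) \w{3} "
    r"(?P<time>\d{2}:\d{2}:\d{2}):\d{3}\.\d{3} UTC "
    r"(?P<host>\S+) \[Thread (?P<thread>\d+), Pid (?P<pid>\d+)\]: "
    r"(?P<msg>.*)$"
)

# Matches the "SELECT FROM Table: ... FAILED" messages seen in the excerpts.
SELECT_FAILED_RE = re.compile(
    r"SELECT FROM Table: (?P<table>\S+) Partition Key: \[(?P<key>[^\]]+)\] FAILED"
)

TIMEOUT_MARKER = "SyncQuery: FAILED: Operation timed out"


def summarize(lines):
    """Return (timeouts per second, failed SELECTs per table) as Counters."""
    timeouts_per_second = Counter()
    failed_selects_per_table = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if TIMEOUT_MARKER in msg:
            second = f"{match.group('date')} {match.group('time')}"
            timeouts_per_second[second] += 1
            continue
        select = SELECT_FAILED_RE.search(msg)
        if select is not None:
            failed_selects_per_table[select.group("table")] += 1
    return timeouts_per_second, failed_selects_per_table
```

Passing an open file handle for contrail-query-engine.log to `summarize` returns two counters. A burst of timeouts concentrated in a short window would be consistent with a load spike, while a steady rate over a long period might point more toward a connectivity problem; either way, the Cassandra logs are still needed to confirm.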

Could you also attach contrail-collector.log and the Cassandra logs from all the nodes, covering the time of the failure?

Also, if the system is still in the failed state, could you set up a debugging session? That would make it easier to figure out why Cassandra is in a bad state.

-Miraj