async_task_executor should handle exceptions and requeue tasks if applicable

Bug #1412576 reported by Charles Wang
This bug affects 1 person
Affects: MagnetoDB
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

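async_task_executor needs to handle exceptions and, where possible, recover by requeuing the failed tasks. The Cassandra (C*) connection is often only temporarily unavailable, or the schema is briefly in disagreement, and the operation recovers on retry. In those cases the failed tasks should be requeued so that they are processed again. The log excerpts below show both conditions, ending with the unhandled ClusterIsNotConnectedException raised during message handling.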

WARNING cassandra.cluster [-] [control connection] Error connecting to 192.168.19.241:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cassandra/cluster.py", line 1770, in _reconnect_internal
    return self._try_connect(host)
  File "/usr/lib/python2.7/dist-packages/cassandra/cluster.py", line 1787, in _try_connect
    connection = self._cluster.connection_factory(host.address, is_control_connection=True)
  File "/usr/lib/python2.7/dist-packages/cassandra/cluster.py", line 645, in connection_factory
    return self.connection_class.factory(address, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/magnetodb/common/cassandra/io/eventletreactor.py", line 59, in factory
    conn = cls(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/magnetodb/common/cassandra/io/eventletreactor.py", line 95, in __init__
    [a[4] for a in addresses], sockerr.strerror)
error: [Errno 111] Tried connecting to [('192.168.19.241', 9042)]. Last error: ECONNREFUSED

WARNING cassandra.cluster [-] Node 192.168.19.242 is reporting a schema disagreement: {UUID('7f098817-75bf-34e0-955e-992a7a884418'): ['192.168.19.236'], UUID('0ca792c2-048b-3204-bf64-29e7cbb65cef'): ['192.168.19.238', '192.168.19.237', '192.168.19.241', '192.168.19.232', '192.168.19.243'], UUID('e5956b79-912e-3f0c-a800-55f8f4fd93ff'): ['192.168.19.234', '192.168.19.235', '192.168.19.239', '192.168.19.240', '192.168.19.233'], UUID('7ff47210-445a-31f1-824f-75a24367fb6d'): [u'192.168.19.242']}
2014-12-30 10:16:02.316 3518 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
    incoming.message))
  File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
    result = getattr(endpoint, method)(ctxt, **new_args)
  File "/usr/bin/magnetodb-async-task-executor", line 67, in create
    table_info = self._table_info_repo.get(context, table_name)
  File "/usr/lib/python2.7/dist-packages/magnetodb/storage/table_info_repo/cassandra_impl.py", line 76, in get
    self.__refresh(context, table_info)
  File "/usr/lib/python2.7/dist-packages/magnetodb/storage/table_info_repo/cassandra_impl.py", line 127, in __refresh
    "".join(query_builder), consistent=True
  File "/usr/lib/python2.7/dist-packages/magnetodb/common/cassandra/cluster_handler.py", line 142, in execute_query
    raise ClusterIsNotConnectedException()
ClusterIsNotConnectedException
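
A minimal sketch of the requested behaviour, written for Python 3 and not taken from the MagnetoDB code base: a worker loop that catches a transient error and puts the task back on the queue, giving up after a bounded number of attempts. Everything here is hypothetical and for illustration only: the `process_tasks` and `make_flaky_handler` helpers, the dict-based task format, the `max_attempts` limit, and the local `ClusterIsNotConnectedException` class, which merely stands in for the exception seen in the traceback above. How a requeue would map onto the real messaging layer is not addressed.

```python
import queue


class ClusterIsNotConnectedException(Exception):
    """Local stand-in for the exception raised in the traceback (assumption)."""


# Assumption: which errors count as transient is a policy decision.
TRANSIENT_ERRORS = (ClusterIsNotConnectedException,)


def process_tasks(tasks, handler, max_attempts=3):
    """Drain `tasks`, requeuing any task whose handler hits a transient error.

    Returns (done, failed): tasks that succeeded, and tasks that were
    given up on after `max_attempts` tries.
    """
    done, failed = [], []
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return done, failed
        task["attempts"] = task.get("attempts", 0) + 1
        try:
            handler(task)
        except TRANSIENT_ERRORS:
            if task["attempts"] < max_attempts:
                tasks.put(task)  # requeue so the task is processed again
            else:
                failed.append(task)  # bounded retries: stop requeuing
        else:
            done.append(task)


def make_flaky_handler(failures):
    """Build a handler that raises for its first `failures` calls, then succeeds."""
    state = {"calls": 0}

    def handler(task):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ClusterIsNotConnectedException()

    return handler
```

With a queue holding one task and `make_flaky_handler(2)`, the task fails twice, is requeued each time, and succeeds on its third attempt. The attempt limit keeps a permanently failing task from cycling through the queue forever; a real executor would probably also wait between attempts and treat non-transient errors differently.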
