Ceilometer uses Tooz for agent coordination, and configurable connection retries would help build resilience against random connection failures.
For example, I see this in the notification agent logs:
(kazoo.client): 2015-09-11 18:49:35,331 DEBUG connection _submit Sending request(xid=2): Create(path=u'/tooz/ceilometer.notification/b279f2ed-fe04-4113-b374-4627745c711c', data='\xc4\x00', acl=[ACL(perms=31, acl_list=['ALL'], id=Id(scheme='world', id='anyone'))], flags=1)
(kazoo.client): 2015-09-11 18:49:38,485 Level 5 connection _submit Sending request(xid=-2): Ping()
(kazoo.client): 2015-09-11 18:49:41,450 WARNING connection _connect_attempt Connection dropped: outstanding heartbeat ping not received
(kazoo.client): 2015-09-11 18:49:41,450 WARNING connection _connect_attempt Transition to CONNECTING
(kazoo.client): 2015-09-11 18:49:41,450 INFO client _session_callback Zookeeper connection lost
(ceilometer.openstack.common.threadgroup): 2015-09-11 18:49:41,463 ERROR threadgroup wait
Traceback (most recent call last):
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/ceilometer/openstack/common/threadgroup.py", line 145, in wait
    x.wait()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/ceilometer/openstack/common/threadgroup.py", line 47, in wait
    return self.thread.wait()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
    return self._exit_event.wait()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
    return hubs.get_hub().switch()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
    return self.greenlet.switch()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/ceilometer/openstack/common/service.py", line 491, in run_service
    service.start()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/ceilometer/notification.py", line 143, in start
    self.partition_coordinator.join_group(self.group_id)
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/ceilometer/coordination.py", line 125, in join_group
    join_req.get()
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/tooz/drivers/zookeeper.py", line 427, in get
    return self._handler(self._kazoo_async_result, timeout, **self._kwargs)
  File "/opt/stack/venv/ceilometer-20150911T173109Z/lib/python2.7/site-packages/tooz/drivers/zookeeper.py", line 137, in _join_group_handler
    raise coordination.ToozError(utils.exception_message(e))
ToozError
(kazoo.client): 2015-09-11 18:49:41,550 WARNING connection zk_loop Failed connecting to Zookeeper within the connection retry policy.
(kazoo.client): 2015-09-11 18:49:41,551 INFO client _session_callback Zookeeper session lost, state: CLOSED
(kazoo.client): 2015-09-11 18:49:41,551 Level 5 connection zk_loop Connection stopped
(oslo_messaging._drivers.impl_rabbit): 2015-09-11 18:49:42,333 ERROR impl_rabbit _error_callback Failed to consume message from queue:
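
To illustrate the request, here is a minimal sketch of what a configurable retry around the group join might look like. It is not Ceilometer or Tooz code: the helper name `call_with_retries`, its parameters (`max_retries`, `initial_delay`, `backoff`) and the `CoordinationError` stand-in for `ToozError` are all hypothetical, and in a real change the values would presumably come from configuration options.

```python
import time


class CoordinationError(Exception):
    """Hypothetical stand-in for a transient coordination failure
    such as the ToozError seen in the traceback."""


def call_with_retries(func, max_retries=3, initial_delay=1.0, backoff=2.0,
                      retry_on=(CoordinationError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    max_retries counts the retries made after the first attempt. Once they
    are exhausted, the last exception is re-raised to the caller.
    """
    delay = initial_delay
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_retries:
                raise
            attempt += 1
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # grow the wait for each further retry


# Usage: a fake join that fails twice before succeeding.
attempts = []


def flaky_join_group():
    attempts.append(1)
    if len(attempts) < 3:
        raise CoordinationError("connection dropped")
    return "joined"


result = call_with_retries(flaky_join_group, max_retries=5, initial_delay=0.01)
print(result)
```

With something along these lines, a single dropped heartbeat during `join_group` would cost a short delay instead of stopping the notification agent's service thread.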