Tooz coordination backend with redis not reachable - ToozConnectionError: Error 113 connecting to 172.16.2.5:6379. No route to host.

Bug #1553250 reported by Dan Prince
This bug affects 1 person
Affects    Status     Importance    Assigned to    Milestone
tripleo    Invalid    Medium        Unassigned

Bug Description

When using IPv4 network isolation I'm seeing the following aodh exceptions:

Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination [-] Error connecting to coordination backend.
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination Traceback (most recent call last):
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/aodh/coordination.py", line 104, in start
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination self._coordinator.start()
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 292, in start
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination self._start()
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 443, in _start
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination self._server_info = self._client.info()
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination self.gen.throw(type, value, traceback)
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 51, in _translate_failures
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination cause=e)
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 666, in raise_with_cause
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause'))
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination raise value
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination ToozConnectionError: Error 113 connecting to 172.16.2.5:6379. No route to host.
Mar 4 00:17:52 localhost aodh-evaluator: 2016-03-04 00:17:52.076 29045 ERROR aodh.coordination
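For reference, the "Error 113" in the last line of the traceback is the OS-level errno passed up through tooz. A minimal sketch for decoding such a number, assuming a Linux host (errno numbering is platform-specific, and `describe_errno` is a hypothetical helper, not part of tooz or aodh):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for an errno value."""
    # errno.errorcode maps numeric values to names such as "EHOSTUNREACH".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# On Linux, 113 is EHOSTUNREACH, which matches the "No route to host"
# text in the log above. Other platforms may number it differently.
print(describe_errno(113))
```

EHOSTUNREACH is reported by the network stack before any redis protocol exchange happens, so it points at routing or addressing rather than at the redis service itself.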

Dan Prince (dan-prince)
Changed in tripleo:
importance: Undecided → Medium
status: New → Triaged
Pradeep Kilambi (pkilambi) wrote :

I have seen the same with ceilometer coordination:

2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination Traceback (most recent call last):
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 113, in heartbeat
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination self._coordinator.heartbeat()
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 507, in heartbeat
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination value=self.STILL_ALIVE)
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination self.gen.throw(type, value, traceback)
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 51, in _translate_failures
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination cause=e)
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 666, in raise_with_cause
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause'))
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination raise value
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination ToozConnectionError: Error 113 connecting to 192.0.2.7:6379. No route to host.
2016-03-01 17:15:52.914 21350 ERROR ceilometer.coordination
2016-03-01 17:15:52.915 21350 WARNING oslo.service.loopingcall [req-0a608ba0-7974-4aec-bf73-5f3655da1e88 admin - - - -] Function 'ceilometer.coordination.PartitionCoordinator.heartbeat' run outlasted interval by 10.95 sec
2016-03-01 17:15:58.924 21350 ERROR ceilometer.coordination [req-0a608ba0-7974-4aec-bf73-5f3655da1e88 admin - - - -] Error connecting to coordination backend.

Could the tooz configuration be the issue here?

summary: - aodh-evaluator: ToozConnectionError: Error 113 connecting to
- 172.16.2.5:6379. No route to host.
+ Tooz coordination backend with redis not reachable -
+ ToozConnectionError: Error 113 connecting to 172.16.2.5:6379. No route
+ to host.
Pradeep Kilambi (pkilambi) wrote :

Adding some notes:

Configuration on the ceilometer side seems to be correct:

"ceilometer::agent::central::coordination_url": "redis://192.0.2.7:6379",

redis is bound to "redis::bind": "192.0.2.11"

The redis VIP is "redis_vip": "192.0.2.7"

From redis.log, the service is definitely up:

7074:M 01 Mar 17:08:24.692 # Server started, Redis version 3.0.6
7074:M 01 Mar 17:08:24.692 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
7074:M 01 Mar 17:08:24.692 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
7074:M 01 Mar 17:08:24.692 * The server is now ready to accept connections on port 6379

So for some reason the VIP is not reachable here?
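One way to narrow this down is to try a plain TCP connect to both addresses from the node running the agent, bypassing tooz and the redis client. The following is only a sketch: `probe_tcp` is a hypothetical helper, and it is written for Python 3 although the services in the logs run on Python 2.7.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt a bare TCP connect and return (ok, detail)."""
    try:
        # create_connection resolves the host and connects; the context
        # manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers timeouts, refused connections and "No route to host".
        return False, "%s (errno=%s)" % (exc, exc.errno)


# Example use with the addresses from the notes above:
#   probe_tcp("192.0.2.7", 6379)    # redis_vip
#   probe_tcp("192.0.2.11", 6379)   # redis::bind
```

If the bind address answers but the VIP does not, the problem would be with whatever is supposed to hold or route the VIP rather than with redis or the coordination_url setting.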

Pradeep Kilambi (pkilambi) wrote :

I think we can close this as this was due to connectivity issues rather than a bug.

Changed in tripleo:
status: Triaged → Invalid