ceilometer group partitioning coordination with tooz+redis+sentinel fails to failover to new master
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceilometer | Fix Released | Undecided | Unassigned |
tooz | Status tracked in Kilo | | |
Kilo | Fix Released | Medium | Chris Dent |
Liberty | Fix Released | Medium | Chris Dent |
Bug Description
When tooz is configured with multiple sentinels to coordinate group membership for the central (and other) agents, the coordinator fails to switch over to a new master redis server after a failover.
This appears to be happening because there's no retry logic when there is a ToozConnectionError which would (eventually) lead to tooz.driver.
There's a question about where the retry logic should go: in ceilometer or in tooz.
When the redis sentinel code was first created there was a belief (now proven to be mistaken) that ceilometer already had retry logic. However, since the sentinel handling is quite specific in the way it works, and tooz is a tool for lots of stuff besides ceilometer, the retry logic should probably go in tooz.
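To make the missing piece concrete, here is a minimal sketch of the kind of retry wrapper being discussed. It is not the fix that was merged; `call_with_retry` and `ConnectionLost` are hypothetical names, with `ConnectionLost` standing in for `ToozConnectionError` so the example runs without tooz installed. The attempt count and delay are arbitrary assumptions.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for tooz's ToozConnectionError."""


def call_with_retry(operation, retry_on=ConnectionLost, attempts=5,
                    delay=0.5, sleep=time.sleep):
    """Call operation(), retrying when retry_on is raised.

    The wait grows linearly with the attempt number, which gives the
    sentinels time to promote a new master. If every attempt fails,
    the last error is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: surface the connection error.
                raise
            sleep(delay * attempt)
```

In a real deployment the `operation` passed in would presumably need to ask the sentinels for the current master again before talking to redis; retrying against the old master's connection would just fail the same way.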
There are some (now out of date) notes that led to this discovery at: https:/
Changed in python-tooz: | |
status: | New → Triaged |
importance: | Undecided → Medium |
Changed in python-tooz: | |
milestone: | none → 0.13.2 |
status: | Fix Committed → Fix Released |
Changed in python-tooz: | |
milestone: | 0.13.2 → 0.14.0 |
no longer affects: | ceilometer/kilo |
Changed in python-tooz: | |
milestone: | 0.14.0 → 0.13.2 |
no longer affects: | ceilometer/liberty |
Changed in ceilometer: | |
status: | New → Fix Released |
Fix proposed to branch: master
Review: https://review.openstack.org/165890