Malformed 3 unit cluster (rabbitmq)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Landscape Server | Invalid | Undecided | Unassigned |
OpenStack RabbitMQ Server Charm | Fix Released | Critical | David Ames |
rabbitmq-server (Juju Charms Collection) | Invalid | Critical | David Ames |
Bug Description
juju 2.1b4
cs:xenial/
maas 2.1.3
I have an HA OpenStack deployment done by the autopilot in which the 3 rabbit units didn't cluster together. It looks like units 0 and 1 clustered, but unit 2 went ahead on its own (a split brain, I suppose). Also of note: unit 2 is the leader according to juju.
This was noticed when neutron services couldn't connect to rabbit, getting a 403 error back:
2017-01-17 12:30:22.929 32573 ERROR oslo_service.
This attempt can be confirmed in the rabbit/1 unit logs:
=ERROR REPORT==== 17-Jan-
closing AMQP connection <0.16879.0> (10.96.22.27:60030 -> 10.96.22.56:5672):
{handshake_
In fact, rabbit/0 and /1 show all sorts of refused logins because of invalid credentials.
Meanwhile, logs for rabbit/2 show that it is happily creating those users, like neutron:
=INFO REPORT==== 17-Jan-
Creating user 'neutron'
Note that there is suspicion that leader election in juju 2.1b4 broke or changed, see details in https:/
Attached are the logs for all 3 rabbit units, as well as the neutron "victim". I have logs of all nodes participating in this deployment if something else is needed.
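For anyone trying to confirm the same symptom (units 0 and 1 clustered, unit 2 on its own), a minimal sketch of the check is below. It assumes you have already collected, per unit, the set of nodes that unit believes are in its cluster, for example by reading `rabbitmqctl cluster_status` on each unit by hand. The node names, the `views` mapping, and the function names are hypothetical and not taken from the charm or the attached logs.

```python
from typing import Dict, FrozenSet, List, Set


def find_partitions(views: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Group nodes by the cluster membership each one reports.

    `views` maps a node name to the set of nodes that node believes are
    in its cluster. Each node is always counted as a member of its own
    view. Nodes reporting the same membership end up in the same group.
    """
    groups = {frozenset(members | {node}) for node, members in views.items()}
    # Sort for a stable, readable order.
    return sorted(groups, key=lambda group: sorted(group))


def is_split(views: Dict[str, Set[str]]) -> bool:
    """Return True when the units do not all agree on one membership set."""
    return len(find_partitions(views)) > 1


if __name__ == "__main__":
    # Hypothetical data mirroring the situation in this report.
    views = {
        "rabbit@unit-0": {"rabbit@unit-0", "rabbit@unit-1"},
        "rabbit@unit-1": {"rabbit@unit-0", "rabbit@unit-1"},
        "rabbit@unit-2": {"rabbit@unit-2"},
    }
    for group in find_partitions(views):
        print(sorted(group))
    print("split:", is_split(views))
```

This only compares what each unit reports; it does not say which side is "right", and it would not explain why the leader unit failed to join the others.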
Changed in rabbitmq-server (Juju Charms Collection):
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → David Ames (thedac)
milestone: none → 17.01
Changed in landscape:
milestone: none → 17.01
Changed in rabbitmq-server (Juju Charms Collection):
status: In Progress → Fix Committed
Changed in landscape:
milestone: 17.01 → 17.02
Changed in charm-rabbitmq-server:
assignee: nobody → David Ames (thedac)
importance: Undecided → Critical
status: New → Fix Committed
Changed in rabbitmq-server (Juju Charms Collection):
status: Fix Committed → Invalid
Changed in charm-rabbitmq-server:
milestone: none → 17.02
Changed in charm-rabbitmq-server:
status: Fix Committed → Fix Released
summary:
- Malformed 3 unit cluster
+ Malformed 3 unit cluster (rabbitmq)
/etc/* and /var/log/* from all 3 rabbit units