newer rabbitmq versions deprecate clusterer

Bug #1691759 reported by Vladislav Belogrudov
This bug affects 1 person
Affects         Status         Importance   Assigned to   Milestone
kolla           Fix Released   Undecided    Unassigned
Rocky           Fix Released   Undecided    Unassigned
kolla-ansible   Fix Released   Medium       Unassigned

Bug Description

Since the 3.6.7 release, clustering is managed by RabbitMQ core, and the formerly used clusterer plugin is deprecated. See https://github.com/rabbitmq/rabbitmq-clusterer/blob/master/README.md#project-status

Also, previous versions of RabbitMQ have CVEs that are fixed in later releases. We need to move back to the original (core) clustering.

Changed in kolla-ansible:
status: New → Confirmed
importance: Undecided → Wishlist
importance: Wishlist → Medium
Paul Bourke (pauldbourke) wrote :

I just spent a few hours looking at this in more detail. It seems that, at least for now, the migration path from rabbitmq-clusterer is not straightforward. Unfortunately, the plugin docs don't say much about why the plugin is deprecated, other than linking to the existing RabbitMQ clustering docs.

According to the docs, the plugin was created to solve two issues:

1) declarative cluster configuration
2) arbitrary node restart

From what I can tell, while 3.6.7 seems to have taken steps to address 1), 2) still seems to be lacking:

"When the entire cluster is brought down, the last node to go down must be the first node to be brought online. If this doesn't happen, the nodes will wait 30 seconds for the last disc node to come back online, and fail afterwards. If the last node to go offline cannot be brought back up, it can be removed from the cluster using the forget_cluster_node command - consult the rabbitmqctl manpage for more information." [0]

and,

"While not strictly necessary, it is a good idea to decide ahead of time which disc node will be the upgrader, stop that node last, and start it first. Otherwise changes to the cluster configuration that were made between the upgrader node stopping and the last node stopping will be lost." [1]

This kind of coordination (tracking which node stopped last and starting it first) is non-trivial in Ansible.
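To illustrate the ordering constraint in the two quotes above, here is a minimal Python sketch. It is not Kolla or RabbitMQ code: the function names `plan_stop` and `plan_restart` and the node names are hypothetical, and fully reversing the stop order is just one simple way to guarantee that the last node stopped is the first one started.

```python
from typing import List


def plan_stop(nodes: List[str], upgrader: str) -> List[str]:
    """Return a stop order that shuts the chosen upgrader node down last.

    `upgrader` is the disc node picked ahead of time (hypothetical input).
    """
    if upgrader not in nodes:
        raise ValueError(f"unknown upgrader node: {upgrader}")
    # Keep the relative order of the other nodes, then stop the upgrader last.
    return [n for n in nodes if n != upgrader] + [upgrader]


def plan_restart(stop_order: List[str]) -> List[str]:
    """Return a start order in which the last node stopped is started first."""
    if len(set(stop_order)) != len(stop_order):
        raise ValueError("each node must appear exactly once in stop_order")
    # Assumption: simply reverse the stop order. The quoted docs only
    # require that the last node to go down comes back up first.
    return list(reversed(stop_order))


# Hypothetical three-node cluster.
nodes = ["rabbit@ctl1", "rabbit@ctl2", "rabbit@ctl3"]
stop_order = plan_stop(nodes, upgrader="rabbit@ctl2")
start_order = plan_restart(stop_order)
```

The difficulty in practice is not computing the order but recording it reliably: the stop order has to survive across separate playbook runs or unplanned outages, which is what makes this awkward to express in Ansible.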

It seems further steps are being taken to address this in the upcoming 3.7.0, where more complete auto-clustering will be added to mimic that in rabbitmq-autocluster [2]. Given the importance of a stable RabbitMQ implementation for Kolla deployments, I suggest we stick with clusterer and reevaluate when 3.7.0 is released.

[0] http://www.rabbitmq.com/clustering.html
[1] http://www.rabbitmq.com/clustering.html#upgrading
[2] https://groups.google.com/forum/#!msg/rabbitmq-users/Jwh0y2PRSy4/Nb4v7OhfAwAJ

Mark Goddard (mgoddard) wrote :

Addressed by https://review.openstack.org/#/c/584426/ in the Rocky release.

Changed in kolla:
status: New → Fix Released
Changed in kolla-ansible:
status: Confirmed → Fix Released