kolla-ansible

Bug #1954925
Comment #4

John Garbutt (johngarbutt) wrote on 2021-12-15:

I also think we are using a bad HA setting. We should think about:
{"ha-mode":"exactly","ha-params":2}

The reference for that is this:
https://www.rabbitmq.com/ha.html#replication-factor

My theory is that, with this setting, the transient queues we create for the RPC call responses are less likely to be an issue, because RabbitMQ will have less load.
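To make the load argument concrete, here is a rough sketch of how many copies of each queue the two modes would keep. The function name replica_count and the 3-node cluster size are my own assumptions for illustration, and this is a simplified model of the documented ha-mode semantics, not anything kolla-ansible ships:

```python
# Simplified model of how many replicas (leader plus mirrors) a mirrored
# queue gets under a given HA policy. replica_count is a hypothetical
# helper, not a RabbitMQ or kolla-ansible API.

def replica_count(policy, cluster_size):
    """Return the number of queue copies for a policy on a cluster."""
    mode = policy["ha-mode"]
    if mode == "all":
        # Every node in the cluster holds a copy.
        return cluster_size
    if mode == "exactly":
        # At most ha-params copies; fewer if the cluster is smaller.
        return min(policy["ha-params"], cluster_size)
    raise ValueError("unsupported ha-mode in this sketch: %s" % mode)


if __name__ == "__main__":
    nodes = 3  # assumed cluster size, for illustration only
    print(replica_count({"ha-mode": "all"}, nodes))
    print(replica_count({"ha-mode": "exactly", "ha-params": 2}, nodes))
```

On a 3-node cluster that is one fewer copy to keep in sync for every mirrored queue, which is where I would expect the reduced load to come from.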

Interestingly, openstack-ansible does this:
https://github.com/openstack/openstack-ansible-rabbitmq_server/blob/34819a10ace4f800d20c2d36035bbfca3ab9671e/defaults/main.yml#L275
rabbitmq_openstack_policies:
  - name: "HA"
    pattern: '^(?!(amq\.)|(.*_fanout_)|(reply_)).*'
    tags: "ha-mode=all"
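A quick way to see which queues that pattern leaves unmirrored is to try it against some queue names. This is only a sketch: the queue names below are made up for illustration, and it uses Python's re module as a stand-in for RabbitMQ's own regex matching, which I am assuming treats this negative lookahead the same way:

```python
import re

# Pattern copied from the openstack-ansible policy above.
HA_PATTERN = re.compile(r'^(?!(amq\.)|(.*_fanout_)|(reply_)).*')


def is_mirrored(queue_name):
    """True if the HA policy pattern would apply to this queue name."""
    return HA_PATTERN.match(queue_name) is not None


if __name__ == "__main__":
    # Hypothetical queue names, chosen only to exercise each branch.
    examples = [
        "conductor",
        "notifications.info",
        "amq.gen-abc123",
        "scheduler_fanout_abc123",
        "reply_abc123",
    ]
    for name in examples:
        print(name, is_mirrored(name))
```

So the fanout and reply_ queues, which are the transient ones, are excluded from mirroring, along with the amq.* names.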

And TripleO does:
https://github.com/openstack/puppet-tripleo/blob/fdca31a2009a0aaf3f3ee9c5e30083ac59bf067f/manifests/profile/pacemaker/rabbitmq_bundle.pp#L344

ha-all ^(?!amq\.).* queues {"ha-mode":"exactly","ha-params":2,"ha-promote-on-shutdown":"always"} 0

I think following openstack-ansible is a good idea here. There is more on what they are doing in this commit:
https://github.com/openstack/openstack-ansible-rabbitmq_server/commit/52ad552129afc715dc978c61edf881090fcf48c0
And not just because the commit came from one of the creators of RabbitMQ :)

It turns out the fix was raised in an oslo meeting.