RabbitMQ cluster fault after Pacemaker OCF parameter change

Bug #1546286 reported by Andrii Petrenko
This bug affects 1 person
Affects             Status  Importance  Assigned to  Milestone
Mirantis OpenStack  New     High        Unassigned
6.1.x               New     Critical    Unassigned

Bug Description

After updating the OCF parameters of the RabbitMQ cluster with the command "crm configure edit p_rabbitmq-server"

from:

params node_port=5673 debug=false command_timeout="--signal=KILL" erlang_cookie=EOKOWXQREETZSHFNTPEY max_rabbitmqctl_timeouts=3 \

to:

params node_port=5673 debug=false command_timeout="--signal=KILL" erlang_cookie=EOKOWXQREETZSHFNTPEY max_rabbitmqctl_timeouts=5 \

The slaves of the RabbitMQ cluster were shut down by Pacemaker and never came back up.
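
To make the size of the change explicit, the two "params" lines above can be compared key by key. This is only a minimal sketch: the helper names parse_params and changed_params are hypothetical, and it assumes the line is a flat list of key=value pairs as quoted in this report.

```python
import shlex

# The two "params" lines quoted in the report; the trailing "\" is the
# crm line continuation and is stripped before parsing.
BEFORE = ('params node_port=5673 debug=false command_timeout="--signal=KILL" '
          'erlang_cookie=EOKOWXQREETZSHFNTPEY max_rabbitmqctl_timeouts=3 \\')
AFTER = ('params node_port=5673 debug=false command_timeout="--signal=KILL" '
         'erlang_cookie=EOKOWXQREETZSHFNTPEY max_rabbitmqctl_timeouts=5 \\')


def parse_params(line):
    """Parse a crm 'params key=value ...' line into a dict (hypothetical helper)."""
    tokens = shlex.split(line.rstrip("\\ \n"))
    if tokens and tokens[0] == "params":
        tokens = tokens[1:]
    # Split on the first "=" only, so values such as "--signal=KILL" stay intact.
    return dict(tok.split("=", 1) for tok in tokens)


def changed_params(before, after):
    """Return {key: (old, new)} for every parameter whose value differs."""
    old, new = parse_params(before), parse_params(after)
    return {key: (old.get(key), new.get(key))
            for key in sorted(old.keys() | new.keys())
            if old.get(key) != new.get(key)}


if __name__ == "__main__":
    print(changed_params(BEFORE, AFTER))
```

Only max_rabbitmqctl_timeouts differs between the two lines, which matches step 5 below ("change any value"): the fault does not appear to depend on which parameter is edited.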

How to reproduce:

1. install an environment in HA mode (3 controllers)
2. start the rabbitmq service via pacemaker
3. make sure the cluster is in a proper state: the command "rabbitmqctl cluster_status" shows 3 nodes in the cluster

root@node-1:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
 {running_nodes,['rabbit@node-2','rabbit@node-3','rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
...done.

4. run "crm configure edit p_rabbitmq-server" on a controller
5. change any value
6. save the config and exit the editor
7. after a minute, check the rabbitmq cluster status using "rabbitmqctl cluster_status"

root@node-1:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
 {running_nodes,['rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
...done.

8. after 2 minutes, check the rabbitmq cluster status with the command: pcs status

 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     p_rabbitmq-server (ocf::fuel:rabbitmq-server): FAILED node-2.domain.local
     p_rabbitmq-server (ocf::fuel:rabbitmq-server): FAILED node-3.domain.local

and then

 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-1.domain.local ]
     Stopped: [ node-2.domain.local node-3.domain.local ]
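
The two "pcs status" snapshots above can also be checked mechanically when reproducing the bug. The following is a rough sketch, not part of any Pacemaker tooling: the function names are hypothetical, and the parser assumes only the two line shapes that appear in this report (per-instance "FAILED" lines and grouped "Masters/Slaves/Stopped" lines).

```python
import re

FAILED_SNAPSHOT = """\
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     p_rabbitmq-server (ocf::fuel:rabbitmq-server): FAILED node-2.domain.local
     p_rabbitmq-server (ocf::fuel:rabbitmq-server): FAILED node-3.domain.local
"""

STOPPED_SNAPSHOT = """\
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-1.domain.local ]
     Stopped: [ node-2.domain.local node-3.domain.local ]
"""


def parse_master_slave_set(text):
    """Return {state: [nodes]} for a Master/Slave Set excerpt (hypothetical helper)."""
    states = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Grouped form: "Masters: [ node-1.domain.local ]"
        grouped = re.match(r"^(Masters|Slaves|Stopped):\s*\[(.*)\]$", line)
        if grouped:
            states.setdefault(grouped.group(1), []).extend(grouped.group(2).split())
            continue
        # Per-instance form: "<resource> (ocf::...): <STATE> <node>"
        single = re.match(r"^\S+\s+\(ocf::[^)]*\):\s+(\S+)\s+(\S+)$", line)
        if single:
            states.setdefault(single.group(1), []).append(single.group(2))
    return states


def unhealthy_nodes(text):
    """Nodes reported as FAILED or Stopped in the excerpt."""
    states = parse_master_slave_set(text)
    return sorted(states.get("FAILED", []) + states.get("Stopped", []))


if __name__ == "__main__":
    print(unhealthy_nodes(FAILED_SNAPSHOT))
    print(unhealthy_nodes(STOPPED_SNAPSHOT))
```

Both snapshots report node-2 and node-3 as unhealthy, first as FAILED and then as Stopped, while node-1 stays master.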

after restart

Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-1.domain.local ]
     Slaves: [ node-2.domain.local node-3.domain.local ]

root@node-1:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
 {running_nodes,['rabbit@node-2','rabbit@node-3','rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
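
For repeated reproduction runs, the "rabbitmqctl cluster_status" outputs above can be compared automatically by listing configured nodes that are absent from running_nodes. This is a minimal sketch under stated assumptions: the function name missing_nodes is hypothetical, and the regular expressions assume the Erlang-term layout shown in this report, with the nodes entry printed before running_nodes.

```python
import re

HEALTHY = """\
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
 {running_nodes,['rabbit@node-2','rabbit@node-3','rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
...done.
"""

DEGRADED = """\
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
 {running_nodes,['rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
...done.
"""


def missing_nodes(status_text):
    """Return configured nodes that are not listed in running_nodes."""
    # Everything between "{nodes," and "{running_nodes," holds the configured nodes.
    configured = re.search(r"\{nodes,(.*?)\{running_nodes,", status_text, re.S)
    running = re.search(r"\{running_nodes,\[(.*?)\]\}", status_text, re.S)
    if not configured or not running:
        raise ValueError("unrecognised cluster_status output")
    configured_names = re.findall(r"'([^']+)'", configured.group(1))
    running_names = re.findall(r"'([^']+)'", running.group(1))
    return sorted(set(configured_names) - set(running_names))


if __name__ == "__main__":
    print(missing_nodes(HEALTHY))
    print(missing_nodes(DEGRADED))
```

An empty result corresponds to the healthy state in step 3 and after the restart; a non-empty result corresponds to the degraded state seen in step 7.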

Andrii Petrenko (aplsms)
tags: added: customer-found support
Changed in mos:
importance: Undecided → High
Andrii Petrenko (aplsms)
no longer affects: mos/7.0.x