@Artem,
this issue is floating (intermittent), and I've just hit it on a bare-metal lab after a primary controller shutdown:
Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-29.mirantis.com ]
     Slaves: [ node-28.mirantis.com node-35.mirantis.com ]
     Stopped: [ node-30.mirantis.com ]
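For the record, the stale slave entry can be dropped and the resource re-probed from any live controller (crmsh syntax; resource and node names are the ones from the status above — this only clears the bogus state, it is not a fix for the root cause):

root@node-29:~# crm resource cleanup master_p_rabbitmq-server node-35.mirantis.com
root@node-29:~# crm_mon -1 | grep -A 3 rabbitmq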
Pacemaker says that RabbitMQ is running on node-35, but it's actually down:
root@node-35:~# ps auxfw | grep [r]abbit
rabbitmq  7332  0.0  0.0  90832 12956 ?  Ss  08:58  0:03 /usr/bin/python /usr/bin/rabbit-fence.py
root@node-35:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-35' ...
Error: unable to connect to node 'rabbit@node-35': nodedown
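To cross-check what the resource agent itself sees, the monitor action can be run by hand with a minimal OCF environment (the RA path below is the usual Fuel location — it's an assumption and may differ on this lab):

root@node-35:~# OCF_ROOT=/usr/lib/ocf OCF_RESOURCE_INSTANCE=p_rabbitmq-server /usr/lib/ocf/resource.d/fuel/rabbitmq-server monitor; echo rc=$?

rc=7 (OCF_NOT_RUNNING) here would mean the agent agrees the node is down, i.e. Pacemaker is just holding a stale status for it.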
rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-29' ...
[{nodes,[{disc,['rabbit@node-28','rabbit@node-29','rabbit@node-35']}]},
 {running_nodes,['rabbit@node-29']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
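Once the beam process on node-35 is up again, the standard manual rejoin sequence would be the following (the OCF agent normally does this on its own, so this is only a fallback sketch):

root@node-35:~# rabbitmqctl stop_app
root@node-35:~# rabbitmqctl reset
root@node-35:~# rabbitmqctl join_cluster rabbit@node-29
root@node-35:~# rabbitmqctl start_app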
There are no issues with server resources (most of the controllers have 16+ GB of RAM, 8 CPUs, and SSD drives): http://paste.openstack.org/show/472715/
Also, the fix https://review.openstack.org/#/c/223548 was merged to master (8.0) only; the patch for 7.0, https://review.openstack.org/#/c/223552/, is still in review.