Steps To Reproduce:
An environment built from ISO 157 (3 controllers, 1 compute) was deployed and left running overnight. After some unknown events, the RabbitMQ cluster split into several pieces:
2015-08-12T15:26:41.960518+00:00 err: ERROR: p_rabbitmq-server: get_monitor(): rabbit node is running out of the cluster
2015-08-12T15:26:42.044585+00:00 err: ERROR: p_rabbitmq-server: get_monitor(): get_status() returns generic error 1
2015-08-12T15:26:42.098906+00:00 info: INFO: p_rabbitmq-server: get_monitor(): ensuring this slave does not get promoted.
2015-08-12T15:27:00.428028+00:00 info: INFO: p_rabbitmq-server: get_monitor(): CHECK LEVEL IS: 0
2015-08-12T15:27:02.419380+00:00 info: INFO: p_rabbitmq-server: get_monitor(): get_status() returns 0.
2015-08-12T15:27:02.497602+00:00 info: INFO: p_rabbitmq-server: get_monitor(): also checking if we are master.
2015-08-12T15:27:04.253156+00:00 info: INFO: p_rabbitmq-server: get_monitor(): master attribute is 1
2015-08-12T15:27:05.208975+00:00 info: INFO: p_rabbitmq-server: get_monitor(): checking if rabbit app is running
2015-08-12T15:27:05.226972+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running. checking if we are the part of healthy cluster
2015-08-12T15:27:05.301620+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running. looking for master on node-3.domain.tld
2015-08-12T15:27:05.374315+00:00 info: INFO: p_rabbitmq-server: get_monitor(): fetched master attribute for node-3.domain.tld. attr value is 1
2015-08-12T15:27:05.408902+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running. looking for master on node-1.domain.tld
2015-08-12T15:27:05.514029+00:00 info: INFO: p_rabbitmq-server: get_monitor(): fetched master attribute for node-1.domain.tld. attr value is 0
2015-08-12T15:27:05.522496+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running. master is node-1.domain.tld
2015-08-12T15:27:08.024116+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running. looking for master on node-2.domain.tld
2015-08-12T15:27:08.075369+00:00 info: INFO: p_rabbitmq-server: get_monitor(): fetched master attribute for node-2.domain.tld. attr value is 1
2015-08-12T15:27:08.099939+00:00 err: ERROR: p_rabbitmq-server: get_monitor(): rabbit node is running out of the cluster
2015-08-12T15:27:08.220098+00:00 err: ERROR: p_rabbitmq-server: get_monitor(): get_status() returns generic error 1
The cluster did not heal itself, so the logs of other services contain many error messages.
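To make the excerpt easier to triage, the per-node master attribute values reported by get_monitor() can be tallied with a short script. This is only a sketch: the three sample lines are copied from the log above with timestamps trimmed, the function name master_attributes is hypothetical, and what an attribute value of 1 or 0 means to the resource agent is not stated in this report.

```python
import re

# Sample lines copied from the log excerpt above (timestamps and
# severity prefixes trimmed for brevity).
LOG = """\
get_monitor(): fetched master attribute for node-3.domain.tld. attr value is 1
get_monitor(): fetched master attribute for node-1.domain.tld. attr value is 0
get_monitor(): fetched master attribute for node-2.domain.tld. attr value is 1
"""

# Captures the node name and the numeric attribute value.
PATTERN = re.compile(r"fetched master attribute for (\S+)\. attr value is (\d+)")


def master_attributes(text):
    """Return {node: value} for each 'fetched master attribute' line.

    Hypothetical helper for reading the log; it is not part of the
    p_rabbitmq-server resource agent. Later lines overwrite earlier ones.
    """
    return {m.group(1): int(m.group(2)) for m in PATTERN.finditer(text)}


attrs = master_attributes(LOG)
for node, value in sorted(attrs.items()):
    print(node, value)
```

Running the same function over the full log from the snapshot would show whether the reported values change between monitor runs.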
Here is an environment snapshot https://drive.google.com/file/d/0BzU7h7sQOuiqTG14UG5oM2FNbmc/view?usp=sharing (size is 635 MB)
Anastasia, please try to reproduce the issue once more and, if it occurs, provide us with the environment.