Comment 39 for bug 1399272

Alexey Khivin (akhivin) wrote :

The reason for the behaviour shown in http://paste.openstack.org/show/145627/ is a corosync script.

Each time the primary controller is shut down, corosync tries to rebuild the RabbitMQ cluster, and it first kills RabbitMQ on the other nodes (or sometimes stops the rabbit application but not the beam process). As a result, the whole RabbitMQ cluster becomes unavailable for several minutes. After several experiments I saw that sometimes the RabbitMQ application was stopped by corosync on all nodes, and after that the whole RabbitMQ cluster became unavailable permanently (or for too long a period of time).
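
A minimal sketch of how the two states described above (RabbitMQ killed outright versus the rabbit application stopped while the beam process is still alive) could be told apart on a single node. It is not part of the corosync script; the function names and state labels are hypothetical, and it assumes that `pgrep` and `rabbitmqctl` are on the PATH and that `rabbitmqctl eval 'rabbit:is_running().'` prints `true` or `false` on the RabbitMQ version in use.

```python
import subprocess


def classify_node(beam_alive: bool, app_running: bool) -> str:
    """Map the two observations onto a (hypothetical) state label."""
    if not beam_alive:
        # No Erlang VM at all: RabbitMQ was killed or never started.
        return "killed"
    if not app_running:
        # Erlang VM is up, but the rabbit application inside it is stopped.
        return "app-stopped"
    return "running"


def beam_alive() -> bool:
    # pgrep exits with status 0 when at least one process matches.
    # Matching on "beam" is an assumption about the process name.
    result = subprocess.run(["pgrep", "-f", "beam"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def app_running() -> bool:
    # Ask the local node whether the rabbit application is running.
    # Assumes the command prints exactly "true" on success.
    result = subprocess.run(
        ["rabbitmqctl", "eval", "rabbit:is_running()."],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


if __name__ == "__main__":
    print(classify_node(beam_alive(), app_running()))
```

Running this on every controller after the primary goes down would show whether a node is in the "app-stopped" state, which is the case where the cluster did not recover in the experiments above.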