Comment 26 for bug 1472230

Artem Panchenko (apanchenko-8) wrote :

The user impact is as follows: after a controller node fails over, the RabbitMQ cluster may be rebuilt and run without some of the live controllers, which means that AMQP high availability can be broken. For example:

1) The cloud has 5 controller nodes
2) One controller node goes down
3) The RabbitMQ cluster re-assembles, but the service is running on only one controller
4) The controller node with the running RabbitMQ goes down

Result: AMQP messages are lost and some cloud operations fail
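
To make the scenario above concrete, here is a minimal Python sketch of the check an operator might run after a failover. It is only an illustration: the node names and both helper functions are hypothetical, and the membership lists are supplied by hand rather than read from a real cluster.

```python
def missing_members(live_controllers, rabbit_members):
    """Return live controllers that are not running in the RabbitMQ cluster."""
    return sorted(set(live_controllers) - set(rabbit_members))


def survives_one_more_failure(live_controllers, rabbit_members):
    """True if RabbitMQ still runs somewhere after any single further node failure."""
    running = set(live_controllers) & set(rabbit_members)
    return len(running) >= 2


# Hypothetical node names, following the numbered steps above.
controllers = ["node-1", "node-2", "node-3", "node-4", "node-5"]  # step 1
live = [c for c in controllers if c != "node-1"]                  # step 2
members = ["node-2"]                                              # step 3

print(missing_members(live, members))            # ['node-3', 'node-4', 'node-5']
print(survives_one_more_failure(live, members))  # False: step 4 takes AMQP down
```

With only one member left, any failure of that node leaves no running broker, which is the state reached in step 4.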

Diagnostic snapshot: https://drive.google.com/file/d/0BzaZINLQ8-xkanF2Z3cxYVljVVU/view?usp=sharing