Comment 13 for bug 1793269

Revision history for this message
Sathish Holla (sathishholla) wrote :

This looks like a case of rabbitMQ race condition during initialization.

As part of test case, the docker service was restarted on all 3 controller nodes.
After docker restart, When the rabbitmq service is trying to come up, there's a race condition between two rabbitMQ nodes and both end up being master nodes.
Due to this, there is a rabbitMQ cluster partition.

This is a known rabbitMQ bug and rabbitMQ proposes the following workaround to handle such cases:
https://github.com/rabbitmq/rabbitmq-server/issues/1202

To implement the above workaround in Contrail, we will need to update the current rabbitMQ version from 3.6 to 3.7.

Thanks,
Sathish