Openstack HA: rabbitmq-server fails when the network is isolated

Bug #1453241 reported by venu kolli
This bug affects 1 person
Affects              Status                     Importance  Assigned to    Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed              Medium      Sanju Abraham
  Trunk              Fix Committed              Medium      Sanju Abraham

Bug Description

rabbitmq-server fails after all nodes in the cluster are isolated from the network and then reconnected.

This issue is observed in 2.2 build 9.

Steps to reproduce:

1. Start with an Openstack HA cluster that is running fine.
2. Isolate all 3 openstack nodes from the network.
3. Bring the network back after 5 min.

Result: rabbitmq-server on one of the nodes went into a bad state. Supervisor tried to restart rabbitmq-server and gave up after repeated unsuccessful attempts.

Tags: ha
venu kolli (vkolli)
Changed in juniperopenstack:
assignee: nobody → Sanju Abraham (asanju)
importance: Undecided → Critical
information type: Proprietary → Public
OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10391
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/10392
Submitter: Sanju (<email address hidden>)

Sanju Abraham (asanju) wrote :

Fix provided by monitoring the RabbitMQ server and restarting it if supervisor has exhausted its retries and marked the service as EXITED / FATAL.
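
For readers who want a concrete picture of that approach, here is a minimal sketch of one monitoring pass. It is not the code from the merged change: the function names, the default program name "rabbitmq-server" as registered with supervisor, and the plain "supervisorctl" invocation (no socket or config options) are all assumptions for illustration. It only relies on the state column printed by "supervisorctl status" and on "supervisorctl start".

```python
import subprocess

# States in which supervisor is no longer trying to restart a process.
GAVE_UP_STATES = {"EXITED", "FATAL"}


def parse_state(status_output, program):
    """Return the state column for `program` from `supervisorctl status` output, or None."""
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == program:
            return fields[1]
    return None


def needs_restart(status_output, program="rabbitmq-server"):
    """True when supervisor lists the program as EXITED or FATAL."""
    return parse_state(status_output, program) in GAVE_UP_STATES


def check_and_restart(program="rabbitmq-server", supervisorctl=("supervisorctl",)):
    """Run one monitoring pass; return True if a start was requested.

    `supervisorctl` is the command prefix (hypothetical default); pass extra
    options such as a server URL there if the deployment needs them.
    """
    status = subprocess.run(
        [*supervisorctl, "status", program],
        capture_output=True, text=True, check=False,
    ).stdout
    if not needs_restart(status, program):
        return False
    # Ask supervisor to start the process again after it gave up.
    subprocess.run([*supervisorctl, "start", program], check=False)
    return True
```

A periodic job (cron or a small loop with a sleep) could call check_and_restart() on each node. Only the EXITED and FATAL states trigger a start, so a process that supervisor is still retrying (BACKOFF) or that an operator stopped on purpose (STOPPED) is left alone.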

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10393
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10392
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10393
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10393
Committed: http://github.org/Juniper/contrail-provisioning/commit/1bdfe3a7344c7e74eaf0ed1ba54202cf64491129
Submitter: Zuul
Branch: master

commit 1bdfe3a7344c7e74eaf0ed1ba54202cf64491129
Author: Sanju Abraham <email address hidden>
Date: Thu May 14 18:17:52 2015 -0700

Close-Bug: #1453241. Fix address the issue of supervisor marking RabbitMQ as EXITED / FATAL after restarts set number of retries. This is seen in network failure cases and rabbitmq sync failure

Change-Id: Idb910c7e91f1c7de73cccb99fb1fd06ddc9dd2a2

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/10392
Committed: http://github.org/Juniper/contrail-provisioning/commit/8a9cdd519821df922e476a1a00f304b6113d2cc3
Submitter: Zuul
Branch: R2.20

commit 8a9cdd519821df922e476a1a00f304b6113d2cc3
Author: Sanju Abraham <email address hidden>
Date: Thu May 14 18:10:32 2015 -0700

Close-Bug: #1453241. Fix address the issue of supervisor marking RabbitMQ as EXITED / FATAL after restarts set number of retries. This is seen in network failure cases and rabbitmq sync failure

Change-Id: I97bb71f26355726278beaef0019d4f267fae4f48
