The RabbitMQ failover test case is broken

Bug #1435254 reported by Bogdan Dobrelya
This bug affects 3 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Committed  High        Bogdan Dobrelya
5.1.x               Fix Committed  High        Bogdan Dobrelya
6.0.x               Fix Committed  High        Bogdan Dobrelya
6.1.x               Fix Committed  High        Bogdan Dobrelya

Bug Description

Related bug https://bugs.launchpad.net/fuel/+bug/1435250 provides test cases for the "3-1" failover scenario.
Test case #4 from that bug, powering on the destroyed node (old master), node-3, fails:

Expected: no master re-election and failover without complete downtime; at least node-1 should keep processing AMQP connections while the cluster reassembles.
Actual (FAILED): master re-elected, node-5; failover took 2.5 minutes with complete downtime, during which no node could process AMQP connections.

The OCF script logic that causes this should be fixed.
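
For anyone re-running the test case, here is a minimal sketch of how the "complete downtime" window could be measured from periodic probe results. It is not part of the Fuel test suite; the `Sample` layout and the `longest_full_outage` name are assumptions made for illustration, and the probe itself (whatever tries to open an AMQP connection to each node) is left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical probe log entry:
# (seconds since test start, {node name: did the node accept an AMQP connection?})
Sample = Tuple[float, Dict[str, bool]]


def longest_full_outage(samples: List[Sample]) -> float:
    """Return the longest span, in seconds, in which no node accepted connections.

    The span runs from the first sample where every node failed to the first
    later sample where at least one node succeeded, so the result is only as
    precise as the probe interval.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, status in samples:
        if any(status.values()):
            # At least one node is serving: close any open outage window.
            if outage_start is not None:
                longest = max(longest, timestamp - outage_start)
                outage_start = None
        elif outage_start is None:
            # No node is serving and no window is open yet: open one.
            outage_start = timestamp
    # An outage still open at the last sample is counted up to that sample.
    if outage_start is not None and samples:
        longest = max(longest, samples[-1][0] - outage_start)
    return longest
```

With this measure, the expected behaviour corresponds to a result of 0 (some node is always serving), while the failed run described above would report roughly 150 seconds.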

Tags: ha rabbitmq
Changed in fuel:
importance: Undecided → High
assignee: nobody → Fuel Library Team (fuel-library)
milestone: none → 6.1
status: New → Confirmed
Alexey Khivin (akhivin)
tags: added: ha rabbitmq
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bogdan Dobrelya (bogdando)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/169291

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/169291
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=2f20ebd0ae36ee7a58892da3aaa981b68ed6f7c7
Submitter: Jenkins
Branch: master

commit 2f20ebd0ae36ee7a58892da3aaa981b68ed6f7c7
Author: Bogdan Dobrelya <email address hidden>
Date: Mon Mar 30 17:27:51 2015 +0200

    Do not re-elect RabbitMQ multistate resource master

    The current monitor action for the multistate RabbitMQ resource
    in Pacemaker may re-elect the existing master when there is
    no need to do so. That is a problem because master elections
    introduce full cluster downtime, and we don't want
    additional elections.

    The fix is to check whether the given node's uptime is equal
    to the other nodes' uptime and to drop this node from the list
    of candidates if a master already exists. Note that there is a
    special 'rabbit-master' attribute in the CIB for this.

    Closes-bug: #1435254

    Change-Id: I6f2484fb8f3284f76461c184148d27aa86a5d4b6
    Signed-off-by: Bogdan Dobrelya <email address hidden>
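
The candidate check described in the commit message can be sketched roughly as follows. This is an illustration in Python, not the actual OCF shell script: the function name, the `uptimes` mapping, and the rule that the longest-running node is preferred are assumptions, and `current_master` stands in for whatever the 'rabbit-master' CIB attribute reports.

```python
from typing import Dict, Optional


def should_claim_master(
    node: str,
    uptimes: Dict[str, int],
    current_master: Optional[str],
) -> bool:
    """Decide whether `node` should stay in the list of master candidates.

    uptimes: hypothetical mapping of node name to uptime in seconds.
    current_master: node currently flagged as master, or None if there is none.
    """
    if current_master == node:
        # The existing master keeps its role; nothing to re-elect.
        return True

    my_uptime = uptimes[node]
    other_uptimes = [up for name, up in uptimes.items() if name != node]

    # Assumption: a node that has been running longer is the better candidate.
    if any(up > my_uptime for up in other_uptimes):
        return False

    # The fix: on an uptime tie, step aside if a master already exists,
    # so that the tie does not trigger another election.
    if current_master is not None and any(up == my_uptime for up in other_uptimes):
        return False

    return True
```

For example, with `uptimes = {"node-1": 100, "node-5": 100}` and `current_master = "node-5"`, node-1 drops out of the candidates and node-5 stays master; with no master set, the tied node remains a candidate and the choice is left to the cluster manager.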

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.0)

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/171192

Bogdan Dobrelya (bogdando) wrote :

Let's make a Hard Code Freeze (HCF) exception for the 6.0.1 backport.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.1)

Fix proposed to branch: stable/5.1
Review: https://review.openstack.org/171193

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.0)

Reviewed: https://review.openstack.org/171192
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=78edb8cf6f0524a9618d4e1e45790369f8dda16e
Submitter: Jenkins
Branch: stable/6.0

commit 78edb8cf6f0524a9618d4e1e45790369f8dda16e
Author: Bogdan Dobrelya <email address hidden>
Date: Mon Mar 30 17:27:51 2015 +0200


OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/171193
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=8b0de765435b7f3c73778ab3e10a98ac0f5c7e0c
Submitter: Jenkins
Branch: stable/5.1

commit 8b0de765435b7f3c73778ab3e10a98ac0f5c7e0c
Author: Bogdan Dobrelya <email address hidden>
Date: Mon Mar 30 17:27:51 2015 +0200

