Ensure public/management VIP is running on node where HAproxy is working

Bug #1320183 reported by Bartosz Kupidura
This bug affects 1 person

Affects: Fuel for OpenStack — Status: Fix Committed, Importance: High, Assigned to: Bartosz Kupidura
Affects: Fuel for OpenStack 4.1.x — Status: Fix Committed, Importance: High, Assigned to: Bartosz Kupidura

Bug Description

Currently, if HAProxy dies, the VIP is not moved to another node in the cluster.
A simple way to reproduce this (HAProxy can die after a segfault, a broken config,
an uninstalled package...):
# echo deadbeef >> /etc/haproxy/haproxy.cfg
# /etc/init.d/haproxy stop

What happens:
- Corosync cannot start HAProxy
- Corosync will NOT move the VIP to another node
- ALL connections to the VIPs get 'connection refused'

What should happen:
- Corosync cannot start HAProxy
- Corosync will move the VIP to another node

Currently ocf:mirantis:haproxy only checks whether haproxy is running; in the future we can
implement more sophisticated health checks (backend timeouts, current connection limits...).
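For reference, a "check only if haproxy is running" monitor typically amounts to a pidfile-plus-process check. The sketch below is illustrative only, not the actual fuel-library agent: the function name, `PIDFILE` default, and structure are assumptions following common OCF resource agent conventions, while the return codes (0 = OCF_SUCCESS, 7 = OCF_NOT_RUNNING) are standard OCF exit codes.

```shell
#!/bin/sh
# Hedged sketch of a minimal OCF-style monitor action for haproxy.
# The real ocf:mirantis:haproxy agent may differ in detail.

OCF_SUCCESS=0       # resource is running
OCF_NOT_RUNNING=7   # resource is cleanly stopped

# Path is an assumption; real agents take this from resource parameters.
PIDFILE="${PIDFILE:-/var/run/haproxy.pid}"

haproxy_monitor() {
    # No pidfile: haproxy was never started (or was cleanly stopped)
    [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$PIDFILE")
    # Stale pidfile: the recorded process is gone
    kill -0 "$pid" 2>/dev/null || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}
```

Pacemaker calls the monitor action periodically; a non-zero return here is what lets the cluster count a failure and, with the fix in this bug, move the VIP away from the broken node.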

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :
Changed in fuel:
milestone: none → 5.0
importance: Undecided → High
assignee: nobody → Bartosz Kupidura (zynzel)
status: New → In Progress
tags: added: ha
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/93884
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=fae4b414d1d831a9cedbcad51feb87e566758852
Submitter: Jenkins
Branch: master

commit fae4b414d1d831a9cedbcad51feb87e566758852
Author: Bartosz Kupidura <email address hidden>
Date: Fri May 16 11:52:28 2014 +0200

    Ensure public/management VIP is running on host where HAproxy is working.

    After 3 failures on a given node, HAProxy will be stopped on that node.
    After 120s Corosync will reset the fail-count for this resource, and
    the administrator can manually start the resource again.

    If no manual intervention is made, the cluster will start this resource
    after 15m (cluster-recheck-interval).

    Related-Bug: 1320183
    Change-Id: I0a236326a06bac79c91d8bab26298d7dccfb418f
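The thresholds the commit message describes map onto standard Pacemaker settings. The fragment below is a hedged sketch in crm shell syntax showing how those numbers (3 failures, 120s fail-count reset, 15m recheck) are typically expressed; the resource names (`p_haproxy`, `vip__management`) are illustrative and may not match what fuel-library actually configures.

```shell
# Hedged sketch: Pacemaker settings matching the behavior described in
# the commit. Resource names are assumptions, not from fuel-library.

# Stop HAProxy on a node after 3 monitor failures there;
# reset the fail-count after 120s.
crm configure primitive p_haproxy ocf:mirantis:haproxy \
    op monitor interval=20 timeout=10 \
    meta migration-threshold=3 failure-timeout=120

# Keep the VIP only on nodes where HAProxy is running.
crm configure colocation vip-with-haproxy inf: vip__management p_haproxy

# Cluster re-evaluates (and may restart the resource) every 15 minutes.
crm configure property cluster-recheck-interval=15min
```

The colocation constraint is the core of the fix: once the VIP is colocated with the HAProxy resource, stopping HAProxy on a node forces Pacemaker to move the VIP elsewhere.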

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

backports-4.1.1?

tags: added: backports-4.1.1
Changed in fuel:
milestone: 5.0 → 4.1.1
status: Fix Committed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/4.1)

Related fix proposed to branch: stable/4.1
Review: https://review.openstack.org/94168

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/4.1)

Reviewed: https://review.openstack.org/94168
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=66ea19ed507e188742e4a77697e8b9d4ba84b25a
Submitter: Jenkins
Branch: stable/4.1

commit 66ea19ed507e188742e4a77697e8b9d4ba84b25a
Author: Bartosz Kupidura <email address hidden>
Date: Fri May 16 11:52:28 2014 +0200

    Ensure public/management VIP is running on host where HAproxy is working.

    After 3 failures on a given node, HAProxy will be stopped on that node.
    After 120s Corosync will reset the fail-count for this resource, and
    the administrator can manually start the resource again.

    If no manual intervention is made, the cluster will start this resource
    after 15m (cluster-recheck-interval).

    Related-Bug: 1320183
    Change-Id: I0a236326a06bac79c91d8bab26298d7dccfb418f

Changed in fuel:
status: In Progress → Fix Committed
no longer affects: fuel/5.1.x
Changed in fuel:
milestone: 4.1.1 → 5.0