Ensure public/management VIP is running on node where HAproxy is working

Bug #1320183 reported by Bartosz Kupidura on 2014-05-16
This bug affects 1 person
Affects             Status  Importance  Assigned to       Milestone
Fuel for OpenStack          High        Bartosz Kupidura
4.1.x                       High        Bartosz Kupidura

Bug Description

Currently, if HAproxy dies, the VIP is not moved to another node in the cluster.
A simple way to reproduce this (HAproxy can die after a segfault, a broken config,
an uninstalled package...):
# echo deadbeef >> /etc/haproxy/haproxy.cfg
# /etc/init.d/haproxy stop

What happens:
- Corosync cannot start HAproxy
- Corosync does NOT move the VIP to another node
- ALL connections to the VIPs get 'connection refused'

What should happen:
- Corosync cannot start HAproxy
- Corosync moves the VIP to another node

At present ocf:mirantis:haproxy checks only whether haproxy is running. In the future we can
implement more sophisticated health checks (backend timeouts, current connection limits...).
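
For illustration, a liveness-only check of the kind described above could look roughly like the sketch below. It is not the code of ocf:mirantis:haproxy: the function name haproxy_monitor and the default pidfile path are assumptions. Only the return codes (0 for OCF_SUCCESS, 7 for OCF_NOT_RUNNING) follow the standard OCF resource agent convention.

```shell
#!/bin/sh
# Sketch of a liveness-only monitor in the style of an OCF resource agent.
# The function name and the default pidfile path are assumptions; they are
# not taken from ocf:mirantis:haproxy.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

haproxy_monitor() {
    pidfile="${1:-/var/run/haproxy.pid}"
    # No readable pidfile: treat the service as stopped.
    [ -r "$pidfile" ] || return "$OCF_NOT_RUNNING"
    pid=$(head -n 1 "$pidfile")
    # Reject empty or non-numeric pidfile content.
    case "$pid" in
        ''|*[!0-9]*) return "$OCF_NOT_RUNNING" ;;
    esac
    # kill -0 delivers no signal; it only tests that the process exists
    # (and that we may signal it, so this is meant to run as root).
    if kill -0 "$pid" 2>/dev/null; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}
```

A check like this says nothing about whether HAproxy is actually serving traffic, which is why richer health checks are mentioned as future work.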

Bogdan Dobrelya (bogdando) wrote :
Changed in fuel:
milestone: none → 5.0
importance: Undecided → High
assignee: nobody → Bartosz Kupidura (zynzel)
status: New → In Progress
tags: added: ha

Reviewed: https://review.openstack.org/93884
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=fae4b414d1d831a9cedbcad51feb87e566758852
Submitter: Jenkins
Branch: master

commit fae4b414d1d831a9cedbcad51feb87e566758852
Author: Bartosz Kupidura <email address hidden>
Date: Fri May 16 11:52:28 2014 +0200

    Ensure public/management VIP is running on host where HAproxy is working.

    After 3 failures on given node, HAproxy will be stopped on that node.
    After 120s corosync will reset fail-count for this resource, and
    administrator can manually start resource again.

    If no manually intervention is made, cluster will start this resource
    after 15m (cluster-recheck-interval).

    Related-Bug: 1320183
    Change-Id: I0a236326a06bac79c91d8bab26298d7dccfb418f
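
For readers unfamiliar with Pacemaker, the behaviour described in the commit message corresponds roughly to the settings sketched below. This is an illustration under assumptions, not the content of the actual change: the resource name p_haproxy is hypothetical, and the change itself may set these values by other means. The script only prints the crm shell commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints crm shell commands unless APPLY=1 is set.
# RESOURCE is a hypothetical resource name; adjust it to the real cluster.
RESOURCE="${RESOURCE:-p_haproxy}"

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

configure_haproxy_failover() {
    # Move the resource away from a node after 3 failures on that node.
    run crm resource meta "$RESOURCE" set migration-threshold 3
    # Let the fail-count expire 120 seconds after the last failure.
    run crm resource meta "$RESOURCE" set failure-timeout 120
    # Re-evaluate the cluster state (including expired failures) every 15 minutes.
    run crm configure property cluster-recheck-interval=15min
}

configure_haproxy_failover
```

With these values, an expired fail-count only takes effect at the next cluster recheck, which matches the 15-minute delay mentioned in the commit message.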

Changed in fuel:
status: In Progress → Fix Committed
Bogdan Dobrelya (bogdando) wrote :

backports-4.1.1?

tags: added: backports-4.1.1
Changed in fuel:
milestone: 5.0 → 4.1.1
status: Fix Committed → In Progress

Reviewed: https://review.openstack.org/94168
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=66ea19ed507e188742e4a77697e8b9d4ba84b25a
Submitter: Jenkins
Branch: stable/4.1

commit 66ea19ed507e188742e4a77697e8b9d4ba84b25a
Author: Bartosz Kupidura <email address hidden>
Date: Fri May 16 11:52:28 2014 +0200

    Ensure public/management VIP is running on host where HAproxy is working.

    After 3 failures on given node, HAproxy will be stopped on that node.
    After 120s corosync will reset fail-count for this resource, and
    administrator can manually start resource again.

    If no manually intervention is made, cluster will start this resource
    after 15m (cluster-recheck-interval).

    Related-Bug: 1320183
    Change-Id: I0a236326a06bac79c91d8bab26298d7dccfb418f

Changed in fuel:
status: In Progress → Fix Committed
no longer affects: fuel/5.1.x
Changed in fuel:
milestone: 4.1.1 → 5.0