keepalived haproxy check script too simple
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack-Ansible | Fix Released | Medium | Chenjun Shen |
Bug Description
location: playbooks/
This part:
keepalived_scripts:
  haproxy_check_script:
    check_script: "killall -0 haproxy"
We found that "killall -0 haproxy" is too simple and might cause problems.
For example:
In our environments, we provide Load Balancer as a Service (LBaaS) in neutron. In the neutron-agent container, there is also a haproxy process running. Even when we stop haproxy on the control nodes, the haproxy process in the container still shows up in the ps output on the control nodes.
Then "killall -0 haproxy" will still give exit code = 0, which makes keepalived not recognize that haproxy process is down.
We suggest changing the check script as below; it only checks the service running locally and is not influenced by the process in the container:
keepalived_scripts:
  haproxy_check_script:
    check_script: "service haproxy status"
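For reference, keepalived runs a check like this as a vrrp_script; the rendered keepalived.conf entry would look roughly like the sketch below (the interval and fall values are illustrative, not taken from the role's defaults):

    vrrp_script haproxy_check_script {
        script "service haproxy status"   # exit code 0 = haproxy healthy
        interval 2                        # illustrative: run the check every 2 seconds
        fall 2                            # illustrative: failures before marking down
    }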
If this is accepted, I can also provide a patch or whatever else is needed.
Best regards,
Chenjun Shen
Changed in openstack-ansible:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Jean-Philippe Evrard (jean-philippe-evrard)
tags: added: low-hanging-fruit
Changed in openstack-ansible:
assignee: Jean-Philippe Evrard (jean-philippe-evrard) → Chenjun Shen (cshen)
status: Confirmed → In Progress
If haproxy processes are collocated, that could indeed be a problem (which is going to happen with neutron metadata or LBaaS).
We could move keepalived and haproxy into containers, or change the check script to be namespace-aware. I haven't checked how "service haproxy status" behaves under systemd; that is probably worth investigating too.
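As a rough sketch of those alternatives (the pidfile path below is the Debian/Ubuntu default and is an assumption here, not something verified against the role):

    # systemd-native check: asks systemd whether the local unit is active
    systemctl is-active --quiet haproxy

    # pidfile-based check: only matches the haproxy started on this host,
    # never a same-named process inside a container
    kill -0 "$(cat /var/run/haproxy.pid)"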