Masking haproxy.service makes the haproxy RA unable to detect failures

Bug #1853443 reported by Andrea Ieri
This bug affects 1 person
Affects                      Status   Importance  Assigned to  Milestone
OpenStack Neutron API Charm  Triaged  High        Unassigned

Bug Description

On a recently deployed cloud using the 19.10 charms, I paused the neutron-api unit holding the VIP. This masked and stopped the haproxy service, but since the haproxy resource agent (RA) did not report the failure, Pacemaker took no corrective action and left the VIP in place. As a consequence, Neutron API commands could no longer be submitted.

This can also be easily reproduced manually:

$ /etc/init.d/haproxy status &>/dev/null; echo $?
0
$ systemctl status haproxy &>/dev/null; echo $?
0
$ systemctl mask haproxy
Created symlink /etc/systemd/system/haproxy.service → /dev/null.
$ /etc/init.d/haproxy status &>/dev/null; echo $?
0
$ systemctl status haproxy &>/dev/null; echo $?
0
$ systemctl stop haproxy
$ /etc/init.d/haproxy status &>/dev/null; echo $?
0
$ systemctl status haproxy &>/dev/null; echo $?
3

Digging further, the problem comes from /lib/lsb/init-functions.d/40-systemd:

$ bash -x /etc/init.d/haproxy status
+ PATH=/sbin:/usr/sbin:/bin:/usr/bin
+ BASENAME=haproxy
+ PIDFILE=/var/run/haproxy.pid
+ CONFIG=/etc/haproxy/haproxy.cfg
+ HAPROXY=/usr/sbin/haproxy
+ RUNDIR=/run/haproxy
+ EXTRAOPTS=
+ test -x /usr/sbin/haproxy
+ '[' -e /etc/default/haproxy ']'
+ . /etc/default/haproxy
++ ENABLED=1
+ test -f /etc/haproxy/haproxy.cfg
+ '[' -f /etc/default/rcS ']'
+ . /lib/lsb/init-functions
+++ run-parts --lsbsysinit --list /lib/lsb/init-functions.d
++ for hook in $(run-parts --lsbsysinit --list /lib/lsb/init-functions.d 2>/dev/null)
++ '[' -r /lib/lsb/init-functions.d/20-left-info-blocks ']'
++ . /lib/lsb/init-functions.d/20-left-info-blocks
++ for hook in $(run-parts --lsbsysinit --list /lib/lsb/init-functions.d 2>/dev/null)
++ '[' -r /lib/lsb/init-functions.d/40-systemd ']'
++ . /lib/lsb/init-functions.d/40-systemd
+++ _use_systemctl=0
+++ '[' -d /run/systemd/system ']'
+++ prog=haproxy
+++ service=haproxy.service
++++ systemctl -p LoadState --value show haproxy.service
+++ state=masked
+++ '[' masked = masked ']'
+++ exit 0

root@juju-27733d-1-lxd-13:~# grep -n '"masked"' /lib/lsb/init-functions.d/40-systemd
13: [ "$state" = "masked" ] && exit 0

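The code above makes the haproxy RA non-LSB-compliant [0]: its status action exits 0 ("running") for a masked unit without checking whether the service is actually running. If the RA decides not to investigate a masked service, it should, I believe, at least return 4 (status unknown).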
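
To illustrate what a compliant status result could look like, here is a minimal sketch. The function name lsb_status_from_active_state is hypothetical and is not part of lsb-base or the haproxy package. It assumes the caller has already read the unit's ActiveState from systemd, and it deliberately does not special-case masked units, since masking alone says nothing about whether the process is running.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: translate a systemd
# ActiveState value into an LSB "status" exit code.
lsb_status_from_active_state() {
    active_state=$1

    case "$active_state" in
        active|reloading)
            return 0 ;;   # LSB 0: program is running
        inactive|failed)
            return 3 ;;   # LSB 3: program is not running
    esac

    # Empty, transitional or unrecognised state: report "status unknown"
    # instead of claiming the service is healthy.
    return 4
}
```

On a systemd host it could be fed with something like lsb_status_from_active_state "$(systemctl -p ActiveState --value show haproxy.service)". With the unit masked and stopped, as in the reproduction above, this should yield 3 rather than 0, which is what Pacemaker needs to see in order to react.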

I think that either the haproxy initscript should be made LSB-compliant, or we should switch to a different resource agent altogether (e.g. an ocf one).

This bug has been opened for neutron-api, but will affect any charm using the haproxy LSB resource agent.

Please also note that the API failure described initially would still occur even if this bug were to be resolved, due to LP#1810918. [1]

[0] https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
[1] https://bugs.launchpad.net/charm-neutron-api/+bug/1810918

Alex Kavanagh (ajkavanagh) wrote :

TRIAGE: High because stopping a service using Pause actually breaks the system.

description: updated
Changed in charm-neutron-api:
assignee: nobody → Alex Kavanagh (ajkavanagh)
importance: Undecided → Medium
status: New → Triaged
importance: Medium → High
Changed in charm-neutron-api:
assignee: Alex Kavanagh (ajkavanagh) → nobody
description: updated