check_crm doesn't get standby nodes anymore

Bug #1971182 reported by Gabriel Cocenza
This bug affects 1 person
Affects: OpenStack HA Cluster Charm
Status: Fix Committed
Importance: Undecided
Assigned to: Gabriel Cocenza
Milestone: (none listed)

Bug Description

LP#1880576 (https://bugs.launchpad.net/charm-hacluster/+bug/1880576) added the ability to suppress alerts for paused units by using the -s flag.

Deploying an OpenStack charm with hacluster, such as keystone, and then pausing a hacluster unit once again produces a critical alert because the node is stopped.

While debugging why this regressed after the bug fix, I discovered that the regex used to find standby units no longer works, probably because Pacemaker changed how crm_mon displays its output. The output now looks like this:

$ sudo crm_mon -1rf
Cluster Summary:
  * Stack: corosync
  * Current DC: juju-91747d-haproxy-1 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Mon May 2 17:01:04 2022
  * Last change: Mon May 2 16:25:14 2022 by root via crm_attribute on juju-91747d-haproxy-0
  * 3 nodes configured
  * 4 resource instances configured

Node List:
  * Node juju-91747d-haproxy-0: standby
  * Node juju-91747d-haproxy-1: standby
  * Online: [ juju-91747d-haproxy-2 ]

Full List of Resources:
  * Clone Set: cl_ks_haproxy [res_ks_haproxy]:
    * Started: [ juju-91747d-haproxy-2 ]
    * Stopped: [ juju-91747d-haproxy-0 juju-91747d-haproxy-1 ]
  * res_ks_8402e19_vip (ocf::heartbeat:IPaddr2): Started juju-91747d-haproxy-2

Migration Summary:

The regex currently looks like this: $line =~ m/^node\s+(\S.*):\s*standby/i
It expects the line to start with "node", which is no longer the case (the line now begins with "  * Node"), so standby nodes are not detected and false alerts are raised again.

Changing the regex to $line =~ m/\s*node\s+(\S.*):\s*standby/i might solve the problem.
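
To illustrate the difference between the two patterns, here is a minimal sketch that applies both to the "Node List" section of the crm_mon output above. It is written in Python rather than the Perl used by check_crm, so the patterns are translations and the helper name standby_nodes is hypothetical. It is not the actual check_crm code.

```python
import re

# Python translations of the two Perl patterns quoted in this report.
# Perl's "=~ m/.../i" is an unanchored, case-insensitive search, so
# re.search with re.IGNORECASE is the closest equivalent.
OLD_STANDBY = re.compile(r"^node\s+(\S.*):\s*standby", re.IGNORECASE)
NEW_STANDBY = re.compile(r"\s*node\s+(\S.*):\s*standby", re.IGNORECASE)

# "Node List" section copied from the crm_mon -1rf output in this report.
NODE_LIST = """\
Node List:
  * Node juju-91747d-haproxy-0: standby
  * Node juju-91747d-haproxy-1: standby
  * Online: [ juju-91747d-haproxy-2 ]
"""


def standby_nodes(pattern, text):
    """Return the node names captured by `pattern`, one line at a time.

    Hypothetical helper for illustration only.
    """
    nodes = []
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            nodes.append(match.group(1))
    return nodes


old_result = standby_nodes(OLD_STANDBY, NODE_LIST)
new_result = standby_nodes(NEW_STANDBY, NODE_LIST)
print("old pattern:", old_result)
print("new pattern:", new_result)
```

With this sample, the old pattern finds no standby nodes because each line now begins with "  * " before "Node", while the proposed pattern finds both juju-91747d-haproxy-0 and juju-91747d-haproxy-1. Note that the proposed pattern is unanchored, so it would match "node ...: standby" anywhere in a line.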

Changed in charm-hacluster:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (master)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/840227
Committed: https://opendev.org/openstack/charm-hacluster/commit/a0b419519cd438affb24ff80c0221cc33d884c9a
Submitter: "Zuul (22348)"
Branch: master

commit a0b419519cd438affb24ff80c0221cc33d884c9a
Author: Gabriel Cocenza <email address hidden>
Date: Mon May 2 19:17:36 2022 -0300

    Fix standby node regex for check_crm

    Pacemaker has changed the output format of crm_mon and this broke
    the regex to catch nodes that are on standby mode. This change
    updates the regex for not alerting on paused units.

    Change-Id: I137acad076bff58506fea6e1618a00765adacd9b
    Closes-Bug: #1971182
    Related-Bug: #1880576

Changed in charm-hacluster:
status: In Progress → Fix Committed
Changed in charm-hacluster:
assignee: nobody → Gabriel Angelo Sgarbi Cocenza (gabrielcocenza)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-hacluster (stable/focal)

Fix proposed to branch: stable/focal
Review: https://review.opendev.org/c/openstack/charm-hacluster/+/841589

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-hacluster (stable/jammy)

Fix proposed to branch: stable/jammy
Review: https://review.opendev.org/c/openstack/charm-hacluster/+/841590

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (stable/jammy)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/841590
Committed: https://opendev.org/openstack/charm-hacluster/commit/da1e0ff22b9e3960acf77de13f3637fe63873b3a
Submitter: "Zuul (22348)"
Branch: stable/jammy

commit da1e0ff22b9e3960acf77de13f3637fe63873b3a
Author: Gabriel Cocenza <email address hidden>
Date: Mon May 2 19:17:36 2022 -0300

    Fix standby node regex for check_crm

    Pacemaker has changed the output format of crm_mon and this broke
    the regex to catch nodes that are on standby mode. This change
    updates the regex for not alerting on paused units.

    Change-Id: I137acad076bff58506fea6e1618a00765adacd9b
    Closes-Bug: #1971182
    Related-Bug: #1880576
    (cherry picked from commit a0b419519cd438affb24ff80c0221cc33d884c9a)

tags: added: in-stable-jammy
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (stable/focal)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/841589
Committed: https://opendev.org/openstack/charm-hacluster/commit/7eba2bfeb0059fcaeb8cca3bd9526cb18debc2c5
Submitter: "Zuul (22348)"
Branch: stable/focal

commit 7eba2bfeb0059fcaeb8cca3bd9526cb18debc2c5
Author: Gabriel Cocenza <email address hidden>
Date: Mon May 2 19:17:36 2022 -0300

    Fix standby node regex for check_crm

    Pacemaker has changed the output format of crm_mon and this broke
    the regex to catch nodes that are on standby mode. This change
    updates the regex for not alerting on paused units.

    Change-Id: I137acad076bff58506fea6e1618a00765adacd9b
    Closes-Bug: #1971182
    Related-Bug: #1880576
    (cherry picked from commit a0b419519cd438affb24ff80c0221cc33d884c9a)

tags: added: in-stable-focal