check_crm doesn't get standby nodes anymore
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack HA Cluster Charm | Fix Committed | Undecided | Gabriel Cocenza |
Bug Description
The LP#1880576 (https:/
Deploying an OpenStack charm with hacluster, such as keystone, and then pausing a hacluster unit once again raises a critical alert because the node is stopped.
While debugging why this recurred after the earlier fix, I found that the regex used to find standby units no longer matches, probably because Pacemaker changed how crm_mon displays its output. The output now looks like this:
$ sudo crm_mon -1rf
Cluster Summary:
* Stack: corosync
* Current DC: juju-91747d-
* Last updated: Mon May 2 17:01:04 2022
* Last change: Mon May 2 16:25:14 2022 by root via crm_attribute on juju-91747d-
* 3 nodes configured
* 4 resource instances configured
Node List:
* Node juju-91747d-
* Node juju-91747d-
* Online: [ juju-91747d-
Full List of Resources:
* Clone Set: cl_ks_haproxy [res_ks_haproxy]:
* Started: [ juju-91747d-
* Stopped: [ juju-91747d-
* res_ks_8402e19_vip (ocf::heartbeat
Migration Summary:
The regex is like this right now: $line =~ m/^node\
This expects the line to start with "node", which is no longer the case, so the check raises false alerts again.
Changing the regex to $line =~ m/\s*node\
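The effect of the leading anchor can be reproduced with a small sketch. Everything in it is an assumption made for illustration: the node names are made up, the sample lines only mimic the layout of the crm_mon output above, and the two patterns are simplified stand-ins that share the visible prefixes of the old and new regexes rather than the check's actual expressions. Python is used here only as a convenient way to exercise the patterns.

```python
import re

# Hypothetical lines imitating the newer crm_mon layout, where node
# entries are indented bullet items (node names are invented).
NEW_FORMAT = [
    "Node List:",
    "  * Node node-a: standby",
    "  * Online: [ node-b node-c ]",
]

# Assumed older layout, where the node entry starts at column 0.
OLD_FORMAT = [
    "Node node-a: standby",
    "Online: [ node-b node-c ]",
]

# Stand-in patterns: only the "^node" and "\s*node" prefixes come from
# the report; the remainder of each pattern is an assumption.
ANCHORED = re.compile(r"^node\s+(\S+):\s*standby", re.IGNORECASE)
RELAXED = re.compile(r"\s*node\s+(\S+):\s*standby", re.IGNORECASE)


def standby_nodes(lines, pattern):
    """Return the names of nodes that the pattern reports as standby."""
    found = []
    for line in lines:
        match = pattern.search(line)
        if match:
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    for label, lines in (("old", OLD_FORMAT), ("new", NEW_FORMAT)):
        print(label, "anchored:", standby_nodes(lines, ANCHORED))
        print(label, "relaxed: ", standby_nodes(lines, RELAXED))
```

With these sample lines, the anchored pattern finds the standby node only in the old layout, while the relaxed pattern finds it in both. When the anchored pattern misses the standby entry, the check has no way to tell a deliberately paused node from a failed one.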
Changed in charm-hacluster:
status: New → In Progress
Changed in charm-hacluster:
assignee: nobody → Gabriel Angelo Sgarbi Cocenza (gabrielcocenza)
Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/840227
Committed: https://opendev.org/openstack/charm-hacluster/commit/a0b419519cd438affb24ff80c0221cc33d884c9a
Submitter: "Zuul (22348)"
Branch: master
commit a0b419519cd438affb24ff80c0221cc33d884c9a
Author: Gabriel Cocenza <email address hidden>
Date: Mon May 2 19:17:36 2022 -0300
Fix standby node regex for check_crm
Pacemaker has changed the output format of crm_mon and this broke
the regex to catch nodes that are on standby mode. This change
updates the regex for not alerting on paused units.
Change-Id: I137acad076bff58506fea6e1618a00765adacd9b
Closes-Bug: #1971182
Related-Bug: #1880576