There are cases when masakari-hostmonitor will recognize online nodes as offline and send (in)appropriate notifications to Masakari

Bug #1878548 reported by Daisuke Suzuki on 2020-05-14
This bug affects 2 people
Affects: masakari-monitors
Importance: High
Assigned to: Daisuke Suzuki

Bug Description

[Issue]
ComputeNodes are managed by pacemaker_remote in my environment.
When one ComputeNode is isolated in the network, the masakari-hostmonitors on the other ComputeNodes send a failure notification about the isolated ComputeNode to masakari-api.
At the same time, the isolated masakari-hostmonitor recognizes the other ComputeNodes as offline, so it sends failure notifications about online ComputeNodes.
As a result, masakari-engine runs the recovery procedure on online ComputeNodes.

[Cause]
The current masakari-hostmonitor can't determine whether or not it is isolated in the network if ComputeNodes are managed by pacemaker_remote.

masakari-hostmonitor with pacemaker (not remote) will wait until it is killed if it is isolated in the network. This is implemented in the following code.
<https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L398-L402>

But masakari-hostmonitor with pacemaker_remote does not determine whether it is isolated.
<https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L93-L95>

[Solution]
The ComputeNode managed by pacemaker_remote should recognize itself as offline when it is isolated.
The state monitoring process should be skipped in that case.
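
A minimal sketch of the idea above, not the actual masakari-monitors code. It assumes the monitor already holds a mapping of hostname to "online"/"offline" parsed from the cluster status; the function name `hosts_to_report` and the mapping shape are hypothetical.

```python
from typing import Dict, List


def hosts_to_report(local_host: str, status_by_host: Dict[str, str]) -> List[str]:
    """Return the hosts to report as failed, or [] if the local node looks isolated.

    Hypothetical helper: `status_by_host` maps hostname -> "online"/"offline".
    """
    # If the local node does not see itself as online, its view of the
    # other nodes cannot be trusted, so state monitoring is skipped.
    if status_by_host.get(local_host, "offline") != "online":
        return []
    # Otherwise report every other node that is not online.
    return sorted(
        host
        for host, status in status_by_host.items()
        if host != local_host and status != "online"
    )


if __name__ == "__main__":
    # Healthy local node: only the genuinely offline peer is reported.
    print(hosts_to_report("c1", {"c1": "online", "c2": "offline", "c3": "online"}))
    # Isolated local node: nothing is reported.
    print(hosts_to_report("c1", {"c1": "offline", "c2": "offline", "c3": "offline"}))
```

This only helps when the isolated node actually sees itself as offline.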

See comment #11 for how yoctozepto managed to reproduce something similar to what is described here.

Changed in masakari-monitors:
assignee: nobody → Daisuke Suzuki (suzuki-di)
status: New → In Progress
Radosław Piliszek (yoctozepto) wrote :

_check_host_status_by_crmadmin [1] is the proper safeguard.
Hostmonitor should be treated as a pacemaker proxy, so it should run on pacemaker nodes (not remotes).
I guess this needs documenting and disabling its functionality on non-pacemaker nodes altogether.
There is no benefit to running hostmonitors on remotes; it can only result in more resource waste and less stability.

[1] https://opendev.org/openstack/masakari-monitors/src/commit/b02c6b6931c0256f4ce6d7167c97ebb849ff3453/masakarimonitors/hostmonitor/host_handler/handle_host.py#L414-L418

Radosław Piliszek (yoctozepto) wrote :

I could not reproduce. An isolated remote node fails its crm_mon invocation and prevents hostmonitor from acting at all.

crm_mon logs:

Error: cluster is not available on this node

pacemaker_remoted logs:

warning: Cannot proxy request from uid 0 gid 0 because not connected to cluster
  error: Error in connection setup (/dev/shm/qb-1050-23133-15-M2JrMP/qb): Remote I/O error (121)

Changed in masakari-monitors:
status: In Progress → Incomplete
Daisuke Suzuki (suzuki-di) wrote :

> I could not reproduce. An isolated remote node fails its crm_mon invocation and prevents hostmonitor from acting at all.

You need to install crmsh on the remote node.
If crmsh is installed on the remote node, hostmonitor can execute the crm_mon command and monitor each remote node's status.
Then, I think you can reproduce this issue.
Please let me know your opinion on this.

Radosław Piliszek (yoctozepto) wrote :

Hah, crmsh is not available on CentOS 8.

Thanks for the info. I'll try PoCing on Ubuntu 20.04.

It's intriguing that crmsh would change the cluster behaviour (that's pretty dangerous).

Could you share the exact instructions on how you set up your cluster to achieve this?

Radosław Piliszek (yoctozepto) wrote :

Ping. Daisuke?

Daisuke Suzuki (suzuki-di) wrote :

In our environment, we set up the cluster by the following steps.

1. Prepare Controller Node (1 or more) and Compute Node (3 or more).
2. Install corosync, pacemaker, crmsh, masakari-api, masakari-engine on the Controller Node.
3. Install pacemaker_remote, crmsh, masakari-hostmonitor[1] on Compute Nodes.
4. Manage the pacemaker_remote cluster on the Compute Nodes with the pacemaker on the Controller Node.[2]

[1]
In our environment, we deployed a masakari-hostmonitor on Compute Nodes. But you can also deploy it on Controller Nodes.

[2]
In order to manage the pacemaker_remote cluster of Compute Nodes, define a remote RA for each Compute Node in crm.

These are the pacemaker-remote RA settings. You should set them for all Compute Nodes managed by the pacemaker-remote cluster.

-----
     primitive <defined name of remote node> ocf:pacemaker:remote \
      params reconnect_interval=10 server=<Host name or IP address of the Compute Node> \
      op migrate_from interval=0s timeout=60000 \
      op migrate_to interval=0s timeout=60000 \
      op monitor interval=20s timeout=180000 \
      op reload interval=0s timeout=60000 \
      op start interval=0s timeout=60000 \
      op stop interval=0s timeout=60000
------
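
Since the same RA definition has to be repeated for every Compute Node, it could be generated. The following is a sketch only: the helper name `remote_primitive` and the example node name and address are hypothetical, and the parameter values are copied from the settings above.

```python
from typing import List, Tuple

# (operation, interval, timeout) triples copied from the settings above.
OPS: List[Tuple[str, str, int]] = [
    ("migrate_from", "0s", 60000),
    ("migrate_to", "0s", 60000),
    ("monitor", "20s", 180000),
    ("reload", "0s", 60000),
    ("start", "0s", 60000),
    ("stop", "0s", 60000),
]


def remote_primitive(name: str, server: str) -> str:
    """Render one ocf:pacemaker:remote primitive as crm configuration text."""
    lines = [
        f"primitive {name} ocf:pacemaker:remote",
        f"params reconnect_interval=10 server={server}",
    ]
    lines += [
        f"op {op} interval={interval} timeout={timeout}"
        for op, interval, timeout in OPS
    ]
    # Join with a trailing backslash so the definition is one logical line.
    return " \\\n  ".join(lines)


if __name__ == "__main__":
    # Hypothetical node name and documentation-range address.
    print(remote_primitive("compute1", "192.0.2.11"))
```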

Radosław Piliszek (yoctozepto) wrote :

Thanks.

I would like the crmsh invocations as well (never used it in fact, only pcs and raw).

Radosław Piliszek (yoctozepto) wrote :

Hmm, it seems the link to the patch is not present in the thread so hereby I am posting it now:

https://review.opendev.org/c/openstack/masakari-monitors/+/729206

My primary concern with the patch is that it adds extra complexity and the deployment discussed here is simply not recommended (also due to performance and stability reasons) - hostmonitors should be placed on cluster nodes, not remotes.
Finally, I was unable (at the time) to spin up a local reproducer. I suppose this is strongly related to the usage of crmsh doing its extra magic.

Radosław Piliszek (yoctozepto) wrote :

I invite you to join our Masakari meeting to discuss this: https://wiki.openstack.org/wiki/Meetings/Masakari

I was unable to reproduce this (tried again with crmsh to be 100% fair). There must be something really peculiar about your Pacemaker+Corosync config. crmsh does not affect the outcome here. The isolated node is unable to provide the crm_mon output and thus unable to act with hostmonitor. The way pacemaker-remote is wired is that it cannot know cluster info if it is not online in that cluster. If it is not the case, I would suspect there is some specific config (or perhaps Pacemaker/Corosync version?) in place that misbehaves.

All in all, I will be *deprecating support for running hostmonitors on remotes*. The reasoning is simple - it brings no benefits, only complexities. The cluster has to be contacted. The hostmonitors act like controller services, proxying the Pacemaker info into Masakari.

Changed in masakari-monitors:
status: Incomplete → Invalid

OK, I managed to reproduce this (or close to this) issue... but in a different setup than what I understood.
Do note the scenario is very artificial so it is very unlikely to happen in real life (but something similar could still...).

Here is the setup in which I managed to make Masakari trash all the hosts:

3 controllers (all APIs, DBs, Masakari Engine and Masakari hostmonitor)
some computes, all running pacemaker_remote

hostmonitors configured to monitor only remotes (as all computes are remotes here)

Blocking corosync traffic to one of the controllers makes it become isolated and lose quorum and think all the other nodes are offline. Hostmonitor is happy to tell Masakari all remote nodes are offline...
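
One way to guard against this scenario could be to refuse to report anything when the local partition has no quorum. The sketch below is not the hostmonitor's code. It assumes the crm_mon XML output carries a `current_dc` element with a `with_quorum` attribute and `node` elements with `name`, `online` and `type` attributes; the sample XML and the function name are illustrative only.

```python
import xml.etree.ElementTree as ET
from typing import List

# Illustrative, heavily trimmed sample; real crm_mon XML has more elements.
SAMPLE = """
<crm_mon>
  <summary>
    <current_dc present="true" name="controller1" with_quorum="false"/>
  </summary>
  <nodes>
    <node name="controller1" online="true" type="member"/>
    <node name="compute1" online="false" type="remote"/>
  </nodes>
</crm_mon>
"""


def offline_remotes_if_quorate(crm_mon_xml: str) -> List[str]:
    """Return offline remote nodes, or [] when the partition lacks quorum."""
    root = ET.fromstring(crm_mon_xml)
    dc = root.find(".//current_dc")
    # Without quorum the local view of the cluster is not trustworthy.
    if dc is None or dc.get("with_quorum") != "true":
        return []
    return [
        node.get("name", "")
        for node in root.iterfind(".//nodes/node")
        if node.get("type") == "remote" and node.get("online") == "false"
    ]


if __name__ == "__main__":
    # The sample has with_quorum="false", so nothing is reported.
    print(offline_remotes_if_quorate(SAMPLE))
```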

Changed in masakari-monitors:
status: Invalid → Triaged
importance: Undecided → High

Also, the proposed fix does not help in the situation I described. The local node is always 'online' (not to mention non-remotes are filtered out when restricting to remotes). What's more, the "non-restricted" version is broken as well, as it does not react properly to the lack of quorum...

(And, finally, I have found a bunch of other, perhaps lesser, issues with the hostmonitor. All thanks to deep debugging and code analysis to review feature patches.)

summary: - There are cases when masakari-hostmonitor will recognize online
- ComputeNodes as offline if ComputeNodes are managed by pacemaker_remote
+ There are cases when masakari-hostmonitor will recognize online nodes as
+ offline and send (in)appropriate notifications to Masakari

I will make this a priority of mine for the next cycle.

description: updated