Comment 4 for bug 1911909

Billy Olsen (billy-olsen) wrote :

Looking through the logs for the glance units from one of the runs identified in comment #3, I strongly suspect that this is related to bug https://bugs.launchpad.net/charm-hacluster/+bug/1874719. A patch for that bug has been merged at https://review.opendev.org/c/openstack/charm-hacluster/+/834034 and is available in charm revision 93 in the latest/edge channel.

Can you please try using the latest/edge channel for focal+ deployments?

Supporting evidence below.

I see the following:

2022-04-03 18:04:03 ERROR unit.hacluster-glance/1.juju-log server.go:327 Pacemaker is down. Please manually start it. Pacemaker or Corosync are still not fully up after waiting for 12 retries. This looks like lp:1874719. Last output: node1(1): member
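To check whether other units or runs hit the same signature, something like the following could be used. This is only a rough sketch: the function name and the choice of which strings to match are my own assumptions, with the strings copied from the log excerpts in this comment; it is not part of the charm or of any existing tooling.

```python
import re

# Signature strings copied verbatim from the log excerpts in this comment.
JUJU_SIGNATURE = "Pacemaker or Corosync are still not fully up"
STONITH_SIGNATURE = (
    "Resource start-up disabled since no STONITH resources have been defined"
)

# Pulls the unit name out of a juju unit log line,
# e.g. "unit.hacluster-glance/1.juju-log" -> "hacluster-glance/1".
UNIT_RE = re.compile(r"\bunit\.(?P<unit>[\w-]+/\d+)\.juju-log\b")


def find_signatures(lines):
    """Return (sorted affected unit names, count of STONITH scheduler errors).

    `lines` is any iterable of log lines (juju unit log and/or syslog).
    The helper is hypothetical and only does plain substring matching.
    """
    units = set()
    stonith_errors = 0
    for line in lines:
        if JUJU_SIGNATURE in line:
            match = UNIT_RE.search(line)
            units.add(match.group("unit") if match else "unknown")
        if STONITH_SIGNATURE in line:
            stonith_errors += 1
    return sorted(units), stonith_errors
```

Feeding it the lines of a unit log and the matching syslog (for example via `open(path, errors="replace")`) gives a quick yes/no on whether a run shows both the retry timeout and the STONITH configuration error.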

Pacemaker shows the following errors in the syslog:

Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemakerd[33227]: notice: Caught 'Terminated' signal
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemakerd[33227]: notice: Shutting down Pacemaker
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemakerd[33227]: notice: Stopping pacemaker-controld
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Caught 'Terminated' signal
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Shutting down cluster resource manager
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-attrd[33245]: notice: Setting shutdown[juju-f975de-0-lxd-4]: (unset) -> 1649009637
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: error: Resource start-up disabled since no STONITH resources have been defined
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: error: Either configure some or disable STONITH with the stonith-enabled option
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: error: NOTE: Clusters with shared data need STONITH to ensure data integrity
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: notice: Delaying fencing operations until there are resources to manage
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: notice: Scheduling shutdown of node juju-f975de-0-lxd-4
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: warning: Node node1 is unclean!
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: notice: * Shutdown juju-f975de-0-lxd-4
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: warning: Calculated transition 2 (with warnings), saving inputs in /var/lib/pacemaker/pengine/pe-warn-2.bz2
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-schedulerd[33246]: notice: Configuration errors found during scheduler processing, please run "crm_verify -L" to identify issues
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Transition 2 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-warn-2.bz2): Complete
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Disconnected from the executor
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Disconnected from Corosync
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemaker-controld[33247]: notice: Disconnected from the CIB manager
Apr 3 18:13:57 juju-f975de-0-lxd-4 pacemakerd[33227]: notice: Stopping pacemaker-schedulerd