OpenStack HA Cluster Charm

reducing number of nodes related in hacluster does not remove the nodes from corosync/pacemaker

Bug #1821109 reported by Drew Freiberger on 2019-03-21

This bug report is a duplicate of: Bug #1400481: Removing unit from hacluster doesn't properly remove node from corosync.

This bug affects 3 people

Affects	Status	Importance	Assigned to	Milestone
OpenStack HA Cluster Charm	Triaged	High	Unassigned	OpenStack HA Cluster Charm 19.10

Bug Description

There seems to be an issue where removing units from a clustered application does not remove the leaving units from corosync/pacemaker.

I had three machines:
juju-32871d-12-lxd-4
juju-32871d-0-lxd-3
juju-32871d-1-lxd-4

I added 3 more (to migrate from lxd to kvm):

glance-purestorage-1
glance-purestorage-2
glance-purestorage-3

The cluster scaled up properly and saw all 6 nodes.

I then removed 2 nodes and they have left juju cleanly, but they still remain in crm config show and /etc/corosync/corosync.conf.
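One way to confirm the stale entries is to compare the node addresses listed in corosync.conf with the addresses of the units that should still be present. The following is a minimal sketch, assuming the file contains a nodelist whose entries carry "ring0_addr:" lines. The function names corosync_nodes and stale_nodes are hypothetical, and the list of expected addresses has to be supplied by the operator.

```shell
#!/bin/sh
# Print the ring0_addr of every node entry in a corosync.conf file.
# $1: path to the file (defaults to /etc/corosync/corosync.conf).
# Assumption: each node entry has a line of the form "ring0_addr: <address>".
corosync_nodes() {
    awk '$1 == "ring0_addr:" { print $2 }' "${1:-/etc/corosync/corosync.conf}"
}

# Print configured addresses that are NOT in the expected list.
# $1: file with one expected address per line
# $2: path to corosync.conf (optional)
stale_nodes() {
    corosync_nodes "$2" | grep -vxF -f "$1"
}
```

For example, after writing the addresses of the remaining units to a file named expected.txt, running 'stale_nodes expected.txt' on a cluster node would print any address that is still configured but no longer expected.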

To reproduce:
juju deploy <some ha-consuming service> --num-units 3
juju deploy hacluster
juju config hacluster cluster_units=3
juju add-relation <haservice>:hacluster hacluster
juju-wait
check and dump the output of: juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-quorumtool'

juju add-unit <ha-consuming service>
juju add-unit <ha-consuming service>
juju add-unit <ha-consuming service>

check and dump the output of: juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-quorumtool'

juju remove-unit <ha-consuming service>/0
juju remove-unit <ha-consuming service>/1

check and dump the output of: juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-quorumtool'

Charms on this site use openstack-origin=cloud:xenial-pike, and the deployed charm revisions are cs:glance-275 and cs:hacluster-49.

Revision history for this message

Drew Freiberger (afreiberger) wrote on 2019-03-21:

Part of the workaround is to run the following on the remaining nodes:

sudo crm_node -l | grep lost

Then, for each lost node:

sudo crm_node -R <dead node>

This doesn't clean up the corosync.conf file, though.
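The two commands above can be combined into a loop. This is a minimal sketch, assuming 'crm_node -l' prints one node per line in the form "<id> <name> <state>". The function names lost_nodes and remove_lost_nodes are hypothetical and not part of any tool.

```shell
#!/bin/sh
# Print the names of nodes whose state is "lost".
# Reads 'crm_node -l' style output on stdin.
# Assumption: each line looks like "<id> <name> <state>".
lost_nodes() {
    awk '$3 == "lost" { print $2 }'
}

# Remove every lost node from pacemaker.
# Run this on one of the remaining cluster nodes.
remove_lost_nodes() {
    sudo crm_node -l | lost_nodes | while read -r node; do
        echo "removing ${node}"
        # Redirect stdin so the command cannot consume the node list.
        sudo crm_node -R "${node}" </dev/null
    done
}
```

Calling 'sudo crm_node -l | lost_nodes' first shows which nodes would be removed before remove_lost_nodes is run.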

It seems the charm doesn't handle hacluster-relation-departed hooks at all.

Ryan Beisner (1chb1n) on 2019-03-21

tags:

added: scaleback

Ryan Beisner (1chb1n) on 2019-03-21

Changed in charm-hacluster:
importance:	Undecided → High

Drew Freiberger (afreiberger) wrote on 2019-03-21:

After running crm_node -R, you can run 'juju run --application <hacluster> hooks/config-changed' to trigger an update of the corosync.conf file and of the running corosync state.

To see the status of cluster quorum, check the corosync-quorumtool output on all hosts for the "Total votes:" and "Quorum:" counts before removing additional units, so that the cluster does not drop below quorum.
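The quorum check can be scripted. This is a minimal sketch, assuming corosync-quorumtool prints lines of the form "Total votes: <n>" and "Quorum: <n>". The function names quorum_field and safe_to_remove are hypothetical. The check is deliberately conservative: it compares against the quorum value currently reported and does not try to predict how that value changes later.

```shell
#!/bin/sh
# Extract a numeric field such as "Total votes" or "Quorum" from
# corosync-quorumtool output read on stdin.
quorum_field() {
    awk -F: -v key="$1" '$1 == key { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Succeed only if removing $1 more units keeps the total votes at or
# above the currently reported quorum.
# Reads corosync-quorumtool output on stdin.
safe_to_remove() {
    remove=$1
    qt_output=$(cat)
    total=$(printf '%s\n' "$qt_output" | quorum_field "Total votes")
    quorum=$(printf '%s\n' "$qt_output" | quorum_field "Quorum")
    [ -n "$total" ] && [ -n "$quorum" ] && [ $((total - remove)) -ge "$quorum" ]
}
```

For example, 'sudo corosync-quorumtool | safe_to_remove 2' returns success only when two more units can be removed without the vote count falling below the reported quorum.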

Ryan Beisner (1chb1n) on 2019-05-04

Changed in charm-hacluster:
milestone:	none → 19.07

Chris MacNaughton (chris.macnaughton) on 2019-05-13

Changed in charm-hacluster:
status:	New → Triaged

David Ames (thedac) on 2019-08-12

Changed in charm-hacluster:
milestone:	19.07 → 19.10

Trent Lloyd (lathiat) wrote on 2019-10-18:

This seems to be a duplicate of the following bug. Can we confirm?
https://bugs.launchpad.net/charm-hacluster/+bug/1400481

Drew Freiberger (afreiberger) wrote on 2019-10-18:

Confirmed, this is a duplicate of bug 1400481. This bug does list workaround information missing from that bug that may be useful for future travelers.
