Reducing the number of units related to hacluster does not remove the corresponding nodes from corosync/pacemaker
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack HA Cluster Charm | Triaged | High | Unassigned | 19.10
Bug Description
There seems to be an issue where removing units from a clustered application does not remove the leaving units from corosync/pacemaker.
I had three machines:
juju-32871d-
juju-32871d-0-lxd-3
juju-32871d-1-lxd-4
I added 3 more (to migrate from lxd to kvm):
glance-
glance-
glance-
The cluster scaled up properly and saw all 6 nodes.
I then removed 2 nodes. They left Juju cleanly, but they still appear in the output of crm config show and in /etc/corosync/
To reproduce:
juju deploy <some ha-consuming service> --num-units 3
juju deploy hacluster
juju config hacluster cluster_count=3
juju add-relation <haservice> hacluster
juju-wait
check and dump juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-
juju add-unit <ha-consuming service>
juju add-unit <ha-consuming service>
juju add-unit <ha-consuming service>
check and dump juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-
juju remove-unit <ha-consuming service>/0
juju remove-unit <ha-consuming service>/1
check and dump juju ssh hacluster 'sudo crm status; sudo crm config show; sudo corosync-
Charms on this site are openstack-
tags: added: scaleback
Changed in charm-hacluster:
  importance: Undecided → High
  milestone: none → 19.07
  status: New → Triaged
  milestone: 19.07 → 19.10
Part of the workaround: on each remaining node, list the lost members:
sudo crm_node -l | grep lost
Then, for each lost node:
sudo crm_node -R <dead node>
This doesn't clean up the corosync.conf file, though.
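Assuming crm_node -l prints one "<id> <name> <state>" line per node and lost members carry the state "lost" (which is what the grep above relies on), the two manual steps can be sketched as a single loop. The lost_nodes helper name is purely illustrative, not part of any tooling:

```shell
#!/bin/sh
# Sketch of the workaround, assuming `crm_node -l` emits lines of the
# form "<id> <name> <state>" and dead members show the state "lost".
lost_nodes() {
    awk '$3 == "lost" { print $2 }'
}

# On each remaining node (requires root), evict every lost peer:
#   sudo crm_node -l | lost_nodes | while read -r n; do sudo crm_node -R "$n"; done
```

As noted above, this only cleans pacemaker's membership; corosync.conf still has to be edited separately on every node.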
It seems the charm doesn't handle hacluster-relation-departed hooks at all.
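A minimal sketch of what such a departed hook could do, assuming the departing unit's hostname can be derived from the hook context (here simply passed in as $1). The evict_node name and the CRM_NODE_CMD override are illustrative only, not the charm's actual code:

```shell
#!/bin/sh
# Hypothetical relation-departed handler body; NOT the charm's real code.
# $1: hostname of the departing unit, assumed resolvable from hook context.
evict_node() {
    # Drop the departed node from pacemaker's membership. A complete fix
    # would also remove its nodelist entry from /etc/corosync/corosync.conf
    # and reload corosync, which this sketch omits.
    ${CRM_NODE_CMD:-crm_node} -R "$1"
}
```

The CRM_NODE_CMD indirection just makes the sketch dry-runnable; a real hook would call crm_node directly and then rewrite corosync.conf.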