Validate corosync cluster member IPs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack HA Cluster Charm | Won't Fix | Wishlist | Aurelien Lourot |
Bug Description
A prerequisite for the cluster to function correctly is, obviously, that all cluster members can reach each other and obtain quorum. If the hanode endpoint is bound correctly, this is generally not an issue, as all the IPs received over the peer relation will lie in the same subnet. If, however, the operator does not set an explicit binding for the hanode endpoint, the hacluster charm can end up writing a corosync configuration in which peers have IPs in different subnets. This is particularly problematic when docker or lxd bridges are present on the cluster member machines, as the bridge IPs are not reachable from other machines at all.
To provide a more robust experience, the hacluster charm should enter a blocked state if the cluster member IPs are not all in the same subnet.
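A minimal sketch of the proposed validation, assuming a hypothetical helper that receives the peer IPs collected from the hanode relation together with a prefix length; a real charm would derive the prefix from the network space bound to the hanode endpoint rather than hard-coding it:

```python
import ipaddress

def members_share_subnet(ips, prefix=24):
    """Return True if all member IPs fall within one network of the given prefix.

    `ips` and `prefix` are hypothetical inputs for illustration; the charm
    would obtain the addresses from the peer relation at runtime.
    """
    networks = {
        ipaddress.ip_network(f"{ip}/{prefix}", strict=False) for ip in ips
    }
    return len(networks) == 1

# Healthy case: both peers on the same /24.
print(members_share_subnet(["100.100.185.70", "100.100.185.123"]))  # True

# Broken case: a docker bridge IP leaked into the peer relation.
print(members_share_subnet(["172.17.0.1", "100.100.185.70"]))  # False
```

If the check fails, the charm would set a blocked workload status instead of writing the corosync configuration.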
Changed in charm-hacluster:
status: New → Triaged
importance: Undecided → Wishlist
Changed in charm-hacluster:
status: In Progress → Won't Fix
For reference, this is the kind of situation that should be avoided:
$ juju run -u hacluster-neutron/3 -- relation-get -r24 - hacluster-neutron/5
egress-subnets: 172.17.0.1/32
ingress-address: 172.17.0.1
member_ready: "True"
private-address: 172.17.0.1
ready: "True"
$ juju run -u hacluster-neutron/5 -- relation-get -r24 - hacluster-neutron/3
egress-subnets: 100.100.185.70/32
ingress-address: 100.100.185.70
member_ready: "True"
private-address: 100.100.185.70
ready: "True"
$ juju run -u hacluster-neutron/5 -- relation-get -r24 - hacluster-neutron/4
egress-subnets: 100.100.185.123/32
ingress-address: 100.100.185.123
member_ready: "True"
private-address: 100.100.185.123
ready: "True"
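The mismatch in the transcript above can be detected mechanically. A sketch, assuming a /24 prefix for illustration, that extracts the `ingress-address` values (copied from the relation data above) and flags the inconsistency; 172.17.0.1 is the conventional address of the default docker0 bridge:

```python
import ipaddress

# ingress-address lines copied from the relation-get output above.
peer_data = [
    "ingress-address: 172.17.0.1",
    "ingress-address: 100.100.185.70",
    "ingress-address: 100.100.185.123",
]

addresses = [line.split(": ", 1)[1] for line in peer_data]
networks = {ipaddress.ip_network(f"{a}/24", strict=False) for a in addresses}

# More than one distinct /24 means at least one peer advertised an
# address the other members cannot reach.
print("blocked" if len(networks) > 1 else "ready")  # blocked
```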