Using different bindings for cluster communication and VIP breaks the cluster

Bug #2020669 reported by Tiago Pasqualini da Silva
Affects                     Status         Importance  Assigned to                Milestone
OpenStack HA Cluster Charm  Fix Committed  High        Tiago Pasqualini da Silva
Focal                       Fix Released   Undecided   Unassigned
Jammy                       Fix Released   Undecided   Unassigned

Bug Description

If we configure an application to use one space for its default binding and a different one for its HA binding, corosync.conf ends up with IP addresses from both spaces. Here's how to reproduce:

1) my spaces (we'll use defaultspace for the applications and livemigration for HA):
$ juju spaces
Name           Space ID  Subnets
alpha          0
livemigration  1         192.168.101.0/24
osfloating     2         192.168.102.0/24
defaultspace   3         192.168.122.0/24

2) deploy any application binding it to the defaultspace and binding its HA relation to livemigration (we're using ceph-radosgw as an example, but this happens with any application):

series="focal"
ceph_channel="octopus/stable"
hacluster_channel="latest/edge"
ceph_vip="192.168.122.99"

juju model-config default-space="defaultspace"

juju deploy ceph-radosgw --series ${series} --channel ${ceph_channel} -n 3 --config vip=${ceph_vip} --to lxd:0,lxd:1,lxd:2 --bind="ha=livemigration"

juju deploy --config cluster_count=3 hacluster ceph-radosgw-hacluster --channel ${hacluster_channel} --bind="ha=livemigration hanode=livemigration"

juju add-relation ceph-radosgw-hacluster:ha ceph-radosgw:ha

3) check corosync.conf on any node. You'll see that the remote nodes get the correct IP, but the local node is configured with the wrong one.
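
The check in step 3 can be scripted. The following is a minimal sketch, not part of the charm: it pulls every ring0_addr value out of a corosync.conf and reports the ones that fall outside the subnet of the HA space. The sample text is illustrative only; the remote addresses 192.168.101.170 and 192.168.101.171 and the nodeid values are made up, and `addresses_outside` is a hypothetical helper name.

```python
import ipaddress
import re

# Illustrative nodelist only. The first address is the wrong local address
# reported in this bug; the other two addresses and all nodeid values are
# invented for the example.
SAMPLE_CONF = """\
nodelist {
    node {
        ring0_addr: 192.168.122.95
        nodeid: 1000
    }
    node {
        ring0_addr: 192.168.101.170
        nodeid: 1001
    }
    node {
        ring0_addr: 192.168.101.171
        nodeid: 1002
    }
}
"""


def addresses_outside(conf_text, cidr):
    """Return the ring0_addr values that are not inside the given subnet."""
    network = ipaddress.ip_network(cidr)
    addrs = re.findall(r"^\s*ring0_addr:\s*(\S+)", conf_text, flags=re.MULTILINE)
    return [a for a in addrs if ipaddress.ip_address(a) not in network]


# livemigration is 192.168.101.0/24 in this reproduction.
print(addresses_outside(SAMPLE_CONF, "192.168.101.0/24"))
```

On an affected unit, the text would come from the node's own corosync.conf instead of the sample string, and a non-empty result would indicate the mixed addressing described here.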

This happens because, for the local node, the charm reads the 'private-address' attribute from Juju without specifying a binding: https://github.com/openstack/charm-hacluster/blob/master/hooks/utils.py#L463

If we run the command manually, we can see that it returns an address from defaultspace (192.168.122.0/24) instead of livemigration (192.168.101.0/24):

$ juju run -u ceph-radosgw-hacluster/0 -- unit-get private-address
192.168.122.95
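
To make the mismatch explicit, here is a small sketch that maps an address to the space containing it, using the subnets from the `juju spaces` output in step 1. `SPACES` and `space_for` are hypothetical names introduced for this example only.

```python
import ipaddress

# Subnets copied from the `juju spaces` output in step 1
# (alpha has no subnets, so it is left out).
SPACES = {
    "livemigration": "192.168.101.0/24",
    "osfloating": "192.168.102.0/24",
    "defaultspace": "192.168.122.0/24",
}


def space_for(address):
    """Return the name of the space whose subnet contains address, or None."""
    ip = ipaddress.ip_address(address)
    for name, cidr in SPACES.items():
        if ip in ipaddress.ip_network(cidr):
            return name
    return None


# Address returned by `unit-get private-address` on the hacluster unit.
print(space_for("192.168.122.95"))
```

The address resolves to defaultspace, although hanode is bound to livemigration.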

A quick way of fixing this would be to read the unit's private address from the hanode relation instead:

$ juju run -u ceph-radosgw-hacluster/0 -- relation-get -r hanode:1 - ceph-radosgw-hacluster/0
egress-subnets: 192.168.101.168/32
hostname: juju-32927a-2-lxd-0
ingress-address: 192.168.101.168
member_ready: "True"
private-address: 192.168.101.168
ready: "True"

This returns the correct address, 192.168.101.168, which is in the livemigration space.
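
The idea can be sketched as follows. This is not the merged patch, only an illustration of preferring relation data over the unit-level address. `local_cluster_address` is a hypothetical helper, the dictionary mirrors the `relation-get` output above, and falling back to ingress-address is an assumption made for the example.

```python
def local_cluster_address(relation_data):
    """Pick the local unit's cluster address from its hanode relation data.

    Relation data follows the relation's binding, so the address comes from
    the space that hanode is bound to. Trying 'ingress-address' second is an
    assumption of this sketch, not something the bug report specifies.
    """
    for key in ("private-address", "ingress-address"):
        value = relation_data.get(key)
        if value:
            return value
    return None


# Values copied from the relation-get output above.
hanode_data = {
    "egress-subnets": "192.168.101.168/32",
    "hostname": "juju-32927a-2-lxd-0",
    "ingress-address": "192.168.101.168",
    "member_ready": "True",
    "private-address": "192.168.101.168",
    "ready": "True",
}

print(local_cluster_address(hanode_data))
```

In a real hook the dictionary would come from the unit's own hanode relation settings rather than a literal.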

Tags: sts
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-hacluster (master)
Changed in charm-hacluster:
status: New → In Progress
tags: added: sts
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (master)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/884197
Committed: https://opendev.org/openstack/charm-hacluster/commit/71100249ee236302ff09d90d49ec66c2fb846829
Submitter: "Zuul (22348)"
Branch: master

commit 71100249ee236302ff09d90d49ec66c2fb846829
Author: Tiago Pasqualini <email address hidden>
Date: Tue May 23 12:14:25 2023 -0300

    Get private-address for local unit from relation

    Currently, the private-address for the local unit is queried using
    unit_get, which can cause it to return an address from a different
    binding. This patch changes it to always query from the relation.

    Closes-bug: #2020669
    Change-Id: I128420c572d5491b9af4cf34614f4534c787d02c

Changed in charm-hacluster:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-hacluster (stable/jammy)

Fix proposed to branch: stable/jammy
Review: https://review.opendev.org/c/openstack/charm-hacluster/+/899181

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-hacluster (stable/focal)

Fix proposed to branch: stable/focal
Review: https://review.opendev.org/c/openstack/charm-hacluster/+/899183

Changed in charm-hacluster:
assignee: nobody → Tiago Pasqualini da Silva (tiago.pasqualini)
importance: Undecided → Medium
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (stable/jammy)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/899181
Committed: https://opendev.org/openstack/charm-hacluster/commit/7ff0d9b1c0d2114c580fb9bac370d4bb9f2fc514
Submitter: "Zuul (22348)"
Branch: stable/jammy

commit 7ff0d9b1c0d2114c580fb9bac370d4bb9f2fc514
Author: Tiago Pasqualini <email address hidden>
Date: Tue May 23 12:14:25 2023 -0300

    Get private-address for local unit from relation

    Currently, the private-address for the local unit is queried using
    unit_get, which can cause it to return an address from a different
    binding. This patch changes it to always query from the relation.

    Closes-bug: #2020669
    Change-Id: I128420c572d5491b9af4cf34614f4534c787d02c
    (cherry picked from commit 71100249ee236302ff09d90d49ec66c2fb846829)

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-hacluster (stable/focal)

Reviewed: https://review.opendev.org/c/openstack/charm-hacluster/+/899183
Committed: https://opendev.org/openstack/charm-hacluster/commit/cc36dea297b709d05408dceef733e486f6f82f82
Submitter: "Zuul (22348)"
Branch: stable/focal

commit cc36dea297b709d05408dceef733e486f6f82f82
Author: Tiago Pasqualini <email address hidden>
Date: Tue May 23 12:14:25 2023 -0300

    Get private-address for local unit from relation

    Currently, the private-address for the local unit is queried using
    unit_get, which can cause it to return an address from a different
    binding. This patch changes it to always query from the relation.

    Depends-on: I9a9efdb5c7e5d3db6dbad11413782b6e07a335c4
    Closes-bug: #2020669
    Change-Id: I128420c572d5491b9af4cf34614f4534c787d02c
    (cherry picked from commit 71100249ee236302ff09d90d49ec66c2fb846829)
    (cherry picked from commit 7ff0d9b1c0d2114c580fb9bac370d4bb9f2fc514)
