Clear configured and clustered flags for removed instances

Bug #1922394 reported by David Ames
This bug affects 1 person
Affects: MySQL InnoDB Cluster Charm
Status: Fix Released
Importance: High
Assigned to: David Ames
Milestone: 21.04

Bug Description

When an instance is removed from the cluster, either by departing gracefully after a juju remove-unit or by running the remove-instance action, the cluster-instance-configured and cluster-instance-clustered flags need to be cleared.

juju run --unit mysql-innodb-cluster/4 leader-get
cluster-created: 75e3655d-1632-4bbb-bbe8-7606b67fd932
cluster-instance-clustered-10.5.0.23: "True"
cluster-instance-clustered-10.5.0.25: "True"
cluster-instance-clustered-10.5.0.35: "True"
cluster-instance-clustered-10.5.0.39: "True"
cluster-instance-clustered-10.5.0.48: "True"
cluster-instance-configured-10.5.0.23: "True"
cluster-instance-configured-10.5.0.25: "True"
cluster-instance-configured-10.5.0.35: "True"
cluster-instance-configured-10.5.0.39: "True"
cluster-instance-configured-10.5.0.48: "True"

In the above example 10.5.0.48 had been removed but the flags remained.

The problem arises when a new instance is created and happens to get the same IP address (10.5.0.48 in the example). The new instance then never properly joins the cluster and remains stuck in the following state. See test artifacts from [0].

blocked Cluster is inaccessible from this instance. Please check logs for details.
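
The failure mode can be sketched as follows. This is a minimal illustration, assuming the charm decides whether an instance still needs to be added by looking up a per-address flag in leader settings. The dict and the `needs_clustering` helper are hypothetical stand-ins, not the charm's actual code.

```python
# Hypothetical stand-in for leader settings; the real values live in Juju
# leadership data, not in a local dict.
leader_settings = {
    "cluster-instance-configured-10.5.0.48": "True",
    "cluster-instance-clustered-10.5.0.48": "True",
}


def needs_clustering(settings, address):
    """Return True if no 'clustered' flag is recorded for this address."""
    key = "cluster-instance-clustered-{}".format(address)
    return settings.get(key) != "True"


# 10.5.0.48 was removed but its flags were never cleared, so a new unit
# that reuses the address looks as if it were already clustered.
stale_result = needs_clustering(leader_settings, "10.5.0.48")
# An address with no recorded flag is handled normally.
fresh_result = needs_clustering(leader_settings, "10.5.0.50")
```

Under this assumption the reused address is skipped while any other new address would be added, which matches the stuck unit described above.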

The workaround is to clear the flags (in debug-hooks or from juju run):
charms.reactive clear_flag cluster-instance-configured-10.5.0.48
charms.reactive clear_flag cluster-instance-clustered-10.5.0.48

[0] https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-mysql-innodb-cluster/779427/13/8329/index.html

David Ames (thedac)
Changed in charm-mysql-innodb-cluster:
assignee: nobody → David Ames (thedac)
importance: Undecided → High
status: New → Triaged
David Ames (thedac) wrote :

Secondary issue:

Clearing the flag turned out to be more difficult than anticipated. After much trial and error, I figured out that the issue is leader-set with a key that has dots in it.

You should be able to do the following:

# leader-set key=value
# leader-get
key=value

# leader-set key=
# leader-get
{}

However when a dot is in the key name this does not work:

# leader-set cluster-instance-clustered-10.5.0.10=
# leader-set key.with.dots="True"
# leader-get
cluster-instance-clustered-10.5.0.10: "True"
key.with.dots: "True"

# leader-set key.with.dots=
# leader-get
cluster-instance-clustered-10.5.0.10: "True"
key.with.dots: "True"

TRIAGE:

Change the flags from dotted IP addresses to IP addresses separated by dashes:

    leadership.leader_set({
        "cluster-instance-clustered-{}"
        .format(self.cluster_address.replace(".", "-")): True})

The next step is to translate this into the leader layer settings.

The remove-instance action should clear the flags. Bonus points for having the leader remove the instance during a departing hook.
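
A minimal sketch of the triage plan, assuming dashed keys and a plain dict standing in for leader settings. The helper names `flag_key` and `clear_instance_flags` are hypothetical; in the charm the removal would go through leader settings rather than a local dict.

```python
def flag_key(kind, address):
    """Build a key such as cluster-instance-clustered-10-5-0-48."""
    # Dots are replaced with dashes so the key can later be unset.
    return "cluster-instance-{}-{}".format(kind, address.replace(".", "-"))


def clear_instance_flags(settings, address):
    """Drop the configured and clustered flags for one address, if present."""
    for kind in ("configured", "clustered"):
        settings.pop(flag_key(kind, address), None)
    return settings


# Hypothetical stand-in for leader settings holding two instances.
settings = {
    flag_key("configured", "10.5.0.48"): True,
    flag_key("clustered", "10.5.0.48"): True,
    flag_key("clustered", "10.5.0.23"): True,
}

# Removing 10.5.0.48 clears only that instance's flags.
clear_instance_flags(settings, "10.5.0.48")
```

Keeping key construction in one helper means the code that sets a flag and the code that clears it cannot disagree on the dash substitution.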

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-innodb-cluster (master)
Changed in charm-mysql-innodb-cluster:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-innodb-cluster (master)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-innodb-cluster/+/786514
Committed: https://opendev.org/openstack/charm-mysql-innodb-cluster/commit/f22ca3b5b4dde7f92edb3a9b1e17835555590d1a
Submitter: "Zuul (22348)"
Branch: master

commit f22ca3b5b4dde7f92edb3a9b1e17835555590d1a
Author: David Ames <email address hidden>
Date: Wed Apr 14 11:42:18 2021 -0700

    Remove instance flags when instance removed

    Previously when an instance was removed the leadership settings and
    charms.reactive flags remained for that instance's IP address. If a new
    instance was subsequently added and happened to have the same IP address
    the charm would never add the new instance to the cluster because it
    believed the instance was already configured and clustered based on
    leader settings.

    Clear leader settings flags for instance cluster configured and
    clustered.

    Due to a bug in Juju the previous use of IP addresses with '.' were
    unable to be unset. Transform dotted flags to use '-' instead.

    func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/565

    Change-Id: If3ffa9e9191c057ac7e3d96bfcf84d8a3a2ad45a
    Closes-Bug: #1922394
    Related-Bug: #1889792

Changed in charm-mysql-innodb-cluster:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-innodb-cluster (stable/21.04)
Changed in charm-mysql-innodb-cluster:
milestone: none → 21.04
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-innodb-cluster (stable/21.04)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-innodb-cluster/+/799953
Committed: https://opendev.org/openstack/charm-mysql-innodb-cluster/commit/6d9a50774c4a1c7875bd2d90fe66b5cb5dafe63a
Submitter: "Zuul (22348)"
Branch: stable/21.04

commit 6d9a50774c4a1c7875bd2d90fe66b5cb5dafe63a
Author: David Ames <email address hidden>
Date: Wed Apr 14 11:42:18 2021 -0700

    Remove instance flags when instance removed

    Previously when an instance was removed the leadership settings and
    charms.reactive flags remained for that instance's IP address. If a new
    instance was subsequently added and happened to have the same IP address
    the charm would never add the new instance to the cluster because it
    believed the instance was already configured and clustered based on
    leader settings.

    Clear leader settings flags for instance cluster configured and
    clustered.

    Due to a bug in Juju the previous use of IP addresses with '.' were
    unable to be unset. Transform dotted flags to use '-' instead.

    Change-Id: If3ffa9e9191c057ac7e3d96bfcf84d8a3a2ad45a
    Closes-Bug: #1922394
    Related-Bug: #1889792
    (cherry picked from commit f22ca3b5b4dde7f92edb3a9b1e17835555590d1a)
