Comment 12 for bug 1818260

David Ames (thedac) wrote :

The fix [0] for a 2017 bug [1] is causing a race condition during deployment. The leader node runs its update-status hook at the same time as a non-leader node is joining the cluster, so check_cluster_memberships() concludes that the non-leader is a dead node and attempts to remove it.

# Leader juju-828573-15-lxd-10 Running update-status hook
2020-05-29 16:44:27 DEBUG juju-log check_cluster_memberships(): 'rabbit@juju-828573-11-lxd-19' in nodes but not in charm relations or running_nodes, telling RabbitMQ to forget about it.

# Non-leader juju-828573-11-lxd-19 joining cluster
2020-05-29 16:44:15 DEBUG juju-log cluster:40: Running ['/usr/sbin/rabbitmqctl', 'stop_app']
2020-05-29 16:44:18 DEBUG cluster-relation-changed Stopping rabbit application on node 'rabbit@juju-828573-11-lxd-19'
2020-05-29 16:44:23 DEBUG juju-log cluster:40: Running ['/usr/sbin/rabbitmqctl', 'start_app']
2020-05-29 16:44:28 DEBUG cluster-relation-changed Starting node 'rabbit@juju-828573-11-lxd-19'
2020-05-29 16:44:30 DEBUG juju-log cluster:40: Waiting for rabbitmq app to start: /<email address hidden>
2020-05-29 16:44:30 DEBUG juju-log cluster:40: Running ['timeout', '180', '/usr/sbin/rabbitmqctl', 'wait', '/<email address hidden>']
2020-05-29 16:44:35 DEBUG cluster-relation-changed Waiting for 'rabbit@juju-828573-11-lxd-19'
2020-05-29 16:44:35 DEBUG cluster-relation-changed pid is 29035
2020-05-29 16:44:35 DEBUG juju-log cluster:40: Confirmed rabbitmq app is running
2020-05-29 16:44:35 INFO juju-log cluster:40: Host clustered with rabbit@juju-828573-15-lxd-10.
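
To make the race in the logs above concrete, here is a minimal sketch of the membership test as the log message describes it ("in nodes but not in charm relations or running_nodes"). This is not the charm's actual code: nodes_to_forget() and the set contents are hypothetical, and the snapshot of what the leader saw at 16:44:27 is an assumption inferred from the timestamps (the non-leader ran stop_app at 16:44:15 and was only confirmed running at 16:44:35).

```python
def nodes_to_forget(cluster_nodes, relation_nodes, running_nodes):
    """Hypothetical stand-in for the check: nodes RabbitMQ knows about
    that are in neither the charm relation data nor the running set."""
    return set(cluster_nodes) - set(relation_nodes) - set(running_nodes)


leader = "rabbit@juju-828573-15-lxd-10"
joiner = "rabbit@juju-828573-11-lxd-19"

# Assumed view from the leader at 16:44:27, while the joiner sits between
# stop_app and a completed start_app: RabbitMQ already lists the joiner,
# but it is not running and (assumption) not yet present in relation data.
flagged_mid_join = nodes_to_forget(
    cluster_nodes={leader, joiner},
    relation_nodes={leader},
    running_nodes={leader},
)

# Assumed view after 16:44:35, once the joiner's app is confirmed running.
flagged_after_join = nodes_to_forget(
    cluster_nodes={leader, joiner},
    relation_nodes={leader},
    running_nodes={leader, joiner},
)

print(sorted(flagged_mid_join))
print(sorted(flagged_after_join))
```

Under these assumptions the joining node is flagged only during the window in which its rabbit app is stopped, which matches the timing of the leader's log line.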

[0] https://github.com/openstack/charm-rabbitmq-server/commit/08b10513c5725fb740382668c47fc769a6f2936c#diff-21870f9ed3cd89ae3ca8d1f237afc315R467
[1] https://bugs.launchpad.net/charm-rabbitmq-server/+bug/1679449