After rebooting an entire fcb cluster (shutdown -r on all nodes), my RabbitMQ cluster failed to come back up.
rabbitmqctl cluster_status:
http://paste.ubuntu.com/p/hh4GV2BJ8R/
juju status for rabbitmq-server:
http://paste.ubuntu.com/p/ptrJSrHGkG/
bundle:
http://paste.ubuntu.com/p/k35TTVp3Ps/
Reproducer 1 (tested on charm rev 102):
Results in:
Unit                Workload  Agent  Machine  Public address  Ports     Message
rabbitmq-server/2   waiting   idle   2        10.5.0.13       5672/tcp  Unit has peers, but RabbitMQ not clustered
rabbitmq-server/3   error     idle   3        10.5.0.4        5672/tcp  hook failed: "cluster-relation-changed"
rabbitmq-server/4*  error     idle   4        10.5.0.20       5672/tcp  hook failed: "update-status"
Howto:
juju deploy -n 3 --config min-cluster-size=3 rabbitmq-server
juju wait (may need snap install juju-wait first)
openstack server stop juju-98eb54-default-4 juju-98eb54-default-3 juju-98eb54-default-2
openstack server start juju-98eb54-default-4; sleep 150; openstack server start juju-98eb54-default-3; sleep 150; openstack server start juju-98eb54-default-2
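The stop and staggered-start steps above could be wrapped in a small script so the delay is easy to vary when hunting for the failing timing. This is only a sketch: the names `run`, `staggered_restart`, `DRY_RUN` and `DELAY` are hypothetical helpers, not part of juju or the openstack client, and with the default `DRY_RUN=1` it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: helper names and variables here are illustrative assumptions.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
# Seconds to wait between starting consecutive instances (150 in the report).
DELAY="${DELAY:-150}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

staggered_restart() {
    # Stop every instance in a single call, as in the steps above.
    run openstack server stop "$@"
    first=1
    for server in "$@"; do
        # Wait between starts, but not before the first one.
        if [ "$first" = "0" ]; then
            run sleep "$DELAY"
        fi
        first=0
        run openstack server start "$server"
    done
}
```

For example, `staggered_restart juju-98eb54-default-4 juju-98eb54-default-3 juju-98eb54-default-2` prints the same sequence as the manual steps; setting `DRY_RUN=0` would actually execute it, and changing `DELAY` allows trying other timings.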
As mentioned in a comment, there may be multiple timings that can cause this failure.
I cannot reproduce this. I have documented the steps I took while trying to reproduce it in the attached log.