Charm update-status fails after scaling in operation

Bug #1839683 reported by Eduardo Sousa
This bug affects 1 person
Affects: OpenStack RabbitMQ Server Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

I installed OpenStack using openstack-base with openstack-base-virt-overlay.yaml (https://jujucharms.com/openstack-base/). While testing unit scaling, I found a bug in rabbitmq-server.

I scaled the rabbitmq-server charm out and back in twice.
At the end of the second scale-in operation, rabbitmq-server/0 was left in the error state hook failed: "update-status".

Steps
-----
# Scale out and wait for it to finish
juju add-unit -n3 rabbitmq-server
watch -n1 -c juju status rabbitmq-server --color

# Scale in and wait for it to finish
juju remove-unit rabbitmq-server/1 rabbitmq-server/2 rabbitmq-server/3
watch -n1 -c juju status rabbitmq-server --color

# Scale out and wait for it to finish
juju add-unit -n3 rabbitmq-server
watch -n1 -c juju status rabbitmq-server --color

# Scale in and wait for it to finish
juju remove-unit rabbitmq-server/4 rabbitmq-server/5 rabbitmq-server/6
watch -n1 -c juju status rabbitmq-server --color
-----

Note: I waited for all units to reach a stable state between executing commands.

Once it finishes, juju status shows the following:
    rabbitmq-server/0* error idle 18 10.5.0.29 5672/tcp hook failed: "update-status"

From logs
---------
INFO juju-log Updating status.
DEBUG juju-log check_cluster_memberships(): 'rabbit@juju-8d2ee4-edsousa-25' in nodes but not in charm relations or running_nodes, telling RabbitMQ to forget about it.
DEBUG juju-log Running ['/usr/sbin/rabbitmqctl', 'forget_cluster_node', 'rabbit@juju-8d2ee4-edsousa-25']
DEBUG update-status Removing node 'rabbit@juju-8d2ee4-edsousa-25' from cluster
DEBUG update-status Error: {not_a_cluster_node,"The node selected is not in the cluster."}
DEBUG update-status Traceback (most recent call last):
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/update-status", line 972, in <module>
DEBUG update-status hooks.execute(sys.argv)
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/charmhelpers/core/hookenv.py", line 914, in execute
DEBUG update-status self._hooks[hook_name]()
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/charmhelpers/contrib/hardening/harden.py", line 93, in _harden_inner2
DEBUG update-status return f(*args, **kwargs)
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/update-status", line 968, in update_status
DEBUG update-status rabbit.check_cluster_memberships()
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 554, in check_cluster_memberships
DEBUG update-status forget_cluster_node(node)
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 564, in forget_cluster_node
DEBUG update-status rabbitmqctl('forget_cluster_node', node)
DEBUG update-status File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 376, in rabbitmqctl
DEBUG update-status subprocess.check_call(cmd)
DEBUG update-status File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
DEBUG update-status raise CalledProcessError(retcode, cmd)
DEBUG update-status subprocess.CalledProcessError: Command '['/usr/sbin/rabbitmqctl', 'forget_cluster_node', 'rabbit@juju-8d2ee4-edsousa-25']' returned non-zero exit status 70.
ERROR juju.worker.uniter.operation runhook.go:132 hook "update-status" failed: exit status 1
---------
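
For readers following the log, the selection rule stated in the check_cluster_memberships() message ("in nodes but not in charm relations or running_nodes") can be sketched roughly as below. This is only an illustrative reading of that log line, not the charm's actual code; the function name stale_nodes and its parameters are hypothetical.

```python
def stale_nodes(nodes, related_nodes, running_nodes):
    """Return the nodes that are neither in a charm relation nor running.

    All three arguments are assumed to be iterables of node names such as
    'rabbit@some-host'. Input order of `nodes` is preserved.
    """
    known = set(related_nodes) | set(running_nodes)
    return [node for node in nodes if node not in known]
```

In the failing run, the node selected this way was then rejected by rabbitmqctl with not_a_cluster_node, which suggests the node list the hook worked from no longer matched what RabbitMQ itself considered cluster members.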

Alex Kavanagh (ajkavanagh) wrote :

This may be an issue with documentation or something else; it's not clear whether the configuration options of the charm were changed so that rabbitmq-server expected the cluster size to be reduced below the minimum (min-cluster-size?). RabbitMQ also doesn't really 'like' even cluster sizes; it's not clear from the report, but the cluster size seemed to be 6. A reasonable cluster size for RabbitMQ is 3.

Having said that, there was a hook-error, and charms should not error out; they should report the failure in the log and status line as appropriate.
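
As a minimal sketch of the "log it instead of erroring out" behaviour described above, and assuming the failure is the non-zero exit from rabbitmqctl forget_cluster_node seen in the traceback: the wrapper below returns False and logs a warning rather than raising. The function name forget_cluster_node_tolerant and the injectable run parameter are hypothetical and not part of the charm; the charm would also need to reflect the failure in its status line, which is not shown here.

```python
import logging
import subprocess

RABBITMQCTL = "/usr/sbin/rabbitmqctl"  # path taken from the traceback
log = logging.getLogger(__name__)


def forget_cluster_node_tolerant(node, run=subprocess.run):
    """Ask RabbitMQ to forget `node`; return True on success, False otherwise.

    `run` defaults to subprocess.run and exists only so the command runner
    can be replaced in tests (an assumption made for this sketch).
    """
    cmd = [RABBITMQCTL, "forget_cluster_node", node]
    try:
        result = run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        # rabbitmqctl is missing or not executable.
        log.warning("Could not run %s: %s", cmd, exc)
        return False
    if result.returncode != 0:
        # For example exit status 70 with not_a_cluster_node, as in the log.
        log.warning(
            "%s exited with status %d: %s",
            cmd, result.returncode, (result.stdout or "").strip(),
        )
        return False
    return True
```

A caller in the update-status path could then skip the node on a False return and continue, leaving the hook to exit cleanly.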

Changed in charm-rabbitmq-server:
importance: Undecided → Medium
status: New → Triaged