Comment 0 for bug 1710247

Peter Sabaini (peter-sabaini) wrote :

When upgrading rabbitmq-server from 16.07 to 16.10, I'm getting an error in wait_app().

Due to the changes introduced in Change-Id: I105eb2684e61a553a52c5a944e8c562945e2a6eb (cf. Bug #1584902) the nodename of a rabbitmq node is expected to equal socket.gethostname().

However, the unit's private address reverse-resolves to a different name, and in the cluster the units are known by that name.

$ juju run --unit rabbitmq-server/3 'hostname ; unit-get private-address ; dig +short -x $( unit-get private-address )'
...
juju-machine-1-lxc-14
10.76.12.252
10-76-12-252.maas.
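
The same mismatch can be checked from Python on the unit. This is only a sketch: the helper names short_name() and nodename_matches_hostname() are made up for illustration and are not part of the charm, and it assumes the cluster uses the first label of the reverse DNS name, as the relation data below suggests.

```python
import socket


def short_name(fqdn):
    """Return the first label of a DNS name, ignoring a trailing dot."""
    return fqdn.rstrip('.').split('.')[0]


def nodename_matches_hostname(hostname, reverse_fqdn):
    """True if the hostname equals the short reverse DNS name.

    Hypothetical helper: compares what socket.gethostname() returns
    with the name the unit's address reverse-resolves to.
    """
    return hostname == short_name(reverse_fqdn)


def check_unit(address):
    """Look up both names for `address` and report whether they agree."""
    hostname = socket.gethostname()
    try:
        reverse_fqdn = socket.gethostbyaddr(address)[0]
    except (socket.herror, socket.gaierror):
        # No PTR record, or the lookup failed; nothing to compare against.
        return hostname, None, False
    return hostname, reverse_fqdn, nodename_matches_hostname(hostname, reverse_fqdn)


if __name__ == '__main__':
    # Address taken from the unit-get output above.
    print(check_unit('10.76.12.252'))
```

With the values from this unit, nodename_matches_hostname('juju-machine-1-lxc-14', '10-76-12-252.maas.') is False, which is the condition that triggers the failure.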

$ u=rabbitmq-server/3;r=cluster; juju run --unit $u "relation-ids $r| xargs -I_@ sh -c 'relation-list -r _@|xargs -I_U sh -c \"relation-get -r _@ - _U |sed s,^,_U:, 2>&1\"'" | grep clustered
rabbitmq-server/4:clustered: 10-76-12-236
rabbitmq-server/5:clustered: 10-76-12-245

Because of this, when running upgrade-charm the wait_app function looks for the pid file in the wrong place:

Reading package lists...
Waiting for 'rabbit@10-76-12-252' ...
pid is 13134 ...
Error: process_not_running
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/upgrade-charm", line 709, in <module>
    rabbit.assess_status(rabbit.ConfigRenderer(rabbit.CONFIG_FILES))
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/rabbit_utils.py", line 809, in assess_status
    assess_status_func(configs)()
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/rabbit_utils.py", line 833, in _assess_status_func
    services=services(), ports=None)
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1178, in _determine_os_workload_status
    state, message, lambda: charm_func(configs))
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1306, in _ows_check_charm_func
    charm_state, charm_message = charm_func_with_configs()
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1178, in <lambda>
    state, message, lambda: charm_func(configs))
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/rabbit_utils.py", line 744, in assess_cluster_status
    ret = wait_app()
  File "/var/lib/juju/agents/unit-rabbitmq-server-3/charm/hooks/rabbit_utils.py", line 361, in wait_app
    raise ex
subprocess.CalledProcessError: Command '['timeout', '180', '/usr/sbin/rabbitmqctl', 'wait', '/<email address hidden>']' returned non-zero exit status 2
2017-08-11 10:54:39 ERROR juju.worker.uniter.operation runhook.go:107 hook "upgrade-charm" failed: exit status 1
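
To illustrate why the two names lead to different pid files, here is a rough sketch. It assumes the pid file path has the form <mnesia dir>/<nodename>.pid with /var/lib/rabbitmq/mnesia as the directory; that layout and the pid_file_for() helper are assumptions for illustration, not code copied from rabbit_utils.py.

```python
import os

# Assumed location of the RabbitMQ mnesia directory; adjust if the
# deployment uses a different one.
MNESIA_DIR = '/var/lib/rabbitmq/mnesia'


def pid_file_for(nodename, base_dir=MNESIA_DIR):
    """Build the pid file path for a node name (hypothetical helper)."""
    return os.path.join(base_dir, nodename + '.pid')


# Path derived from socket.gethostname() on this unit.
from_hostname = pid_file_for('rabbit@juju-machine-1-lxc-14')

# Path derived from the name the node actually runs under, as shown in
# the "Waiting for 'rabbit@10-76-12-252'" line of the log.
from_nodename = pid_file_for('rabbit@10-76-12-252')

print(from_hostname)
print(from_nodename)
print(from_hostname == from_nodename)
```

If the assumption holds, a check built from the hostname would read a file other than the one the running node writes, which would be consistent with the process_not_running error above.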

Other functions that depend on the node name being equal to socket.gethostname() will likely fail too, e.g. is_leader().

Juju: 1.25.10