unable to connect to node during config-changed hook

Bug #1671017 reported by Francis Ginther
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
rabbitmq-server (Juju Charms Collection)
New
Undecided
Unassigned

Bug Description

Found a case of rabbitmqctl failing during a config-change hook with messages indicating that it was unable to connect to the rabbitmq node. This happened shortly after the rabbitmq-server package was installed:

[from landscape-0-inner-logs/rabbitmq-server-0/var/log/juju/unit-rabbitmq-server-0.log]
2017-03-08 06:30:49 INFO config-changed Setting up lockfile-progs (0.1.17) ...
2017-03-08 06:30:49 INFO config-changed Setting up python-amqplib (1.0.2-1) ...
2017-03-08 06:30:49 INFO config-changed Setting up rabbitmq-server (3.5.7-1) ...
2017-03-08 06:30:49 INFO config-changed
2017-03-08 06:30:49 INFO config-changed Configuration file '/etc/default/rabbitmq-server'
2017-03-08 06:30:49 INFO config-changed ==> File on system created by you or by a script.
2017-03-08 06:30:49 INFO config-changed ==> File also in package provided by package maintainer.
2017-03-08 06:30:49 INFO config-changed ==> Using current old file as you requested.
2017-03-08 06:30:49 INFO config-changed Adding group `rabbitmq' (GID 117) ...
2017-03-08 06:30:49 INFO config-changed Done.
2017-03-08 06:30:49 INFO config-changed Adding system user `rabbitmq' (UID 113) ...
2017-03-08 06:30:49 INFO config-changed Adding new user `rabbitmq' (UID 113) with group `rabbitmq' ...
2017-03-08 06:30:49 INFO config-changed Not creating home directory `/var/lib/rabbitmq'.
2017-03-08 06:30:58 INFO config-changed Processing triggers for libc-bin (2.23-0ubuntu5) ...
2017-03-08 06:30:58 INFO config-changed Processing triggers for systemd (229-4ubuntu16) ...
2017-03-08 06:30:58 INFO config-changed Processing triggers for ureadahead (0.100.0-19) ...
2017-03-08 06:31:01 INFO juju-log Changing ownership of path /var/lib/rabbitmq to rabbitmq:rabbitmq
2017-03-08 06:31:01 INFO juju-log Changing perms of path /var/lib/rabbitmq
2017-03-08 06:31:03 INFO config-changed Plugin configuration unchanged.
2017-03-08 06:31:03 INFO config-changed
2017-03-08 06:31:03 INFO config-changed Applying plugin configuration to rabbit@juju-f3ee7c-2-lxd-2... failed.
2017-03-08 06:31:03 INFO config-changed * Could not contact node rabbit@juju-f3ee7c-2-lxd-2.
2017-03-08 06:31:03 INFO config-changed Changes will take effect at broker restart.
2017-03-08 06:31:03 INFO config-changed * Options: --online - fail if broker cannot be contacted.
2017-03-08 06:31:03 INFO config-changed --offline - do not try to contact broker.
2017-03-08 06:31:06 DEBUG juju-log writing config file: /etc/rabbitmq/rabbitmq.config , data: {'ssl_mode': 'off', 'cluster_partition_handling': u'ignore'}
2017-03-08 06:31:06 INFO juju-log Writing file /etc/rabbitmq/rabbitmq.config root:root 644
2017-03-08 06:31:10 INFO config-changed Error: unable to connect to node 'rabbit@juju-f3ee7c-2-lxd-2': nodedown
2017-03-08 06:31:10 INFO config-changed
2017-03-08 06:31:10 INFO config-changed DIAGNOSTICS
2017-03-08 06:31:10 INFO config-changed ===========
2017-03-08 06:31:10 INFO config-changed
2017-03-08 06:31:10 INFO config-changed attempted to contact: ['rabbit@juju-f3ee7c-2-lxd-2']
2017-03-08 06:31:10 INFO config-changed
2017-03-08 06:31:10 INFO config-changed rabbit@juju-f3ee7c-2-lxd-2:
2017-03-08 06:31:10 INFO config-changed * connected to epmd (port 4369) on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:10 INFO config-changed * epmd reports: node 'rabbit' not running at all
2017-03-08 06:31:10 INFO config-changed no other nodes on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:10 INFO config-changed * suggestion: start the node
2017-03-08 06:31:10 INFO config-changed
2017-03-08 06:31:10 INFO config-changed current node details:
2017-03-08 06:31:10 INFO config-changed - node name: 'rabbitmq-cli-11918@juju-f3ee7c-2-lxd-2'
2017-03-08 06:31:10 INFO config-changed - home dir: /var/lib/rabbitmq
2017-03-08 06:31:10 INFO config-changed - cookie hash: kT5pUTetTi3ERYUzRaK7WQ==
2017-03-08 06:31:10 INFO config-changed
2017-03-08 06:31:11 DEBUG juju-log Command '['/usr/sbin/rabbitmqctl', 'list_vhosts']' returned non-zero exit status 2
2017-03-08 06:31:11 INFO juju-log /usr/bin/rsync -r --delete --executability /var/lib/juju/agents/unit-rabbitmq-server-0/charm/scripts/collect_rabbitmq_stats.sh /usr/local/bin/collect_rabbitmq_stats.sh
2017-03-08 06:31:11 INFO juju-log Writing file /etc/cron.d/rabbitmq-stats root:root 444
2017-03-08 06:31:11 INFO juju-log Making dir /var/lib/charm/rabbitmq-server rabbitmq:rabbitmq 775
2017-03-08 06:31:11 INFO juju-log Writing file /var/lib/charm/rabbitmq-server/nagios-rabbitmq-server-0.local_passwd rabbitmq:rabbitmq 660
2017-03-08 06:31:12 INFO config-changed Error: unable to connect to node 'rabbit@juju-f3ee7c-2-lxd-2': nodedown
2017-03-08 06:31:12 INFO config-changed
2017-03-08 06:31:12 INFO config-changed DIAGNOSTICS
2017-03-08 06:31:12 INFO config-changed ===========
2017-03-08 06:31:12 INFO config-changed
2017-03-08 06:31:12 INFO config-changed attempted to contact: ['rabbit@juju-f3ee7c-2-lxd-2']
2017-03-08 06:31:12 INFO config-changed
2017-03-08 06:31:12 INFO config-changed rabbit@juju-f3ee7c-2-lxd-2:
2017-03-08 06:31:12 INFO config-changed * connected to epmd (port 4369) on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:12 INFO config-changed * epmd reports: node 'rabbit' not running at all
2017-03-08 06:31:12 INFO config-changed no other nodes on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:12 INFO config-changed * suggestion: start the node
2017-03-08 06:31:12 INFO config-changed
2017-03-08 06:31:12 INFO config-changed current node details:
2017-03-08 06:31:12 INFO config-changed - node name: 'rabbitmq-cli-12067@juju-f3ee7c-2-lxd-2'
2017-03-08 06:31:12 INFO config-changed - home dir: /var/lib/rabbitmq
2017-03-08 06:31:12 INFO config-changed - cookie hash: kT5pUTetTi3ERYUzRaK7WQ==
2017-03-08 06:31:12 INFO config-changed
2017-03-08 06:31:12 DEBUG juju-log Command '['/usr/sbin/rabbitmqctl', 'list_vhosts']' returned non-zero exit status 2
2017-03-08 06:31:13 DEBUG juju-log Running ['/usr/sbin/rabbitmqctl', 'add_vhost', 'nagios-rabbitmq-server-0']
2017-03-08 06:31:14 INFO config-changed Error: unable to connect to node 'rabbit@juju-f3ee7c-2-lxd-2': nodedown
2017-03-08 06:31:14 INFO config-changed
2017-03-08 06:31:14 INFO config-changed DIAGNOSTICS
2017-03-08 06:31:14 INFO config-changed ===========
2017-03-08 06:31:14 INFO config-changed
2017-03-08 06:31:14 INFO config-changed attempted to contact: ['rabbit@juju-f3ee7c-2-lxd-2']
2017-03-08 06:31:14 INFO config-changed
2017-03-08 06:31:14 INFO config-changed rabbit@juju-f3ee7c-2-lxd-2:
2017-03-08 06:31:14 INFO config-changed * connected to epmd (port 4369) on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:14 INFO config-changed * epmd reports: node 'rabbit' not running at all
2017-03-08 06:31:14 INFO config-changed no other nodes on juju-f3ee7c-2-lxd-2
2017-03-08 06:31:14 INFO config-changed * suggestion: start the node
2017-03-08 06:31:14 INFO config-changed
2017-03-08 06:31:14 INFO config-changed current node details:
2017-03-08 06:31:14 INFO config-changed - node name: 'rabbitmq-cli-12179@juju-f3ee7c-2-lxd-2'
2017-03-08 06:31:14 INFO config-changed - home dir: /var/lib/rabbitmq
2017-03-08 06:31:14 INFO config-changed - cookie hash: kT5pUTetTi3ERYUzRaK7WQ==
2017-03-08 06:31:14 INFO config-changed
2017-03-08 06:31:14 INFO config-changed Traceback (most recent call last):
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/config-changed", line 672, in <module>
2017-03-08 06:31:14 INFO config-changed hooks.execute(sys.argv)
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/charmhelpers/core/hookenv.py", line 731, in execute
2017-03-08 06:31:14 INFO config-changed self._hooks[hook_name]()
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 788, in wrapped_f
2017-03-08 06:31:14 INFO config-changed f(*args, **kwargs)
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 79, in _harden_inner2
2017-03-08 06:31:14 INFO config-changed return f(*args, **kwargs)
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/config-changed", line 627, in config_changed
2017-03-08 06:31:14 INFO config-changed update_nrpe_checks()
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/config-changed", line 511, in update_nrpe_checks
2017-03-08 06:31:14 INFO config-changed rabbit.create_vhost(vhost)
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 198, in create_vhost
2017-03-08 06:31:14 INFO config-changed rabbitmqctl('add_vhost', vhost)
2017-03-08 06:31:14 INFO config-changed File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/rabbit_utils.py", line 346, in rabbitmqctl
2017-03-08 06:31:14 INFO config-changed subprocess.check_call(cmd)
2017-03-08 06:31:14 INFO config-changed File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
2017-03-08 06:31:14 INFO config-changed raise CalledProcessError(retcode, cmd)
2017-03-08 06:31:14 INFO config-changed subprocess.CalledProcessError: Command '['/usr/sbin/rabbitmqctl', 'add_vhost', 'nagios-rabbitmq-server-0']' returned non-zero exit status 2
2017-03-08 06:31:14 ERROR juju.worker.uniter.operation runhook.go:107 hook "config-changed" failed: exit status 1

The dump collected 'ps fauxww' output under "landscape-0-inner-logs/rabbitmq-server-0/var/log/ps-fauxww.txt". It does show a beam.smp process running since 06:30:

rabbitmq 11533 0.0 0.3 2186612 52312 ? Sl 06:30 0:01 \_ /usr/lib/erlang/erts-7.3/bin/beam.smp -W w -A 64 -P 1048576 -K true -B i -- -root /usr/lib/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.5.7/sbin/../ebin -noshell -noinput -s rabbit boot -sname rabbit@juju-f3ee7c-2-lxd-2 -boot start_sasl -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/<email address hidden>"} -rabbit sasl_error_logger {file,"/<email address hidden>"} -rabbit enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/lib/rabbitmq/lib/rabbitmq_server-3.5.7/sbin/../plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/rabbit@juju-f3ee7c-2-lxd-2-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/rabbit@juju-f3ee7c-2-lxd-2" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672

This was found during a Landscape Autopilot openstack deployment in our CI [1], using the following versions and charm config. All logs collected from the run are attached.

Landscape: 17.01~bzr10962+jenkins3541-2
Juju: 1:2.1.1-0ubuntu1~16.04.1~juju1
Maas: 2.1.3+bzr5573-0ubuntu1
rabbitmq-server charm: cs:xenial/rabbitmq-server-61
Charm configuration (non defaults):
rabbitmq-server
 - min-cluster-size: 3

[1] - https://ci.lscape.net/job/landscape-system-tests/5461/

Tags: landscape
Revision history for this message
Francis Ginther (fginther) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.