rabbitmq start_app failed during cluster-relation-changed

Bug #1587631 reported by Francis Ginther
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack RabbitMQ Server Charm
Expired
Low
Unassigned
rabbitmq-server (Juju Charms Collection)
Invalid
Undecided
Unassigned

Bug Description

cs:trusty/rabbitmq-server-44

This was found during an automated system-test of Landscape Openstack Autopilot, https://ci.lscape.net/job/landscape-system-tests/2459/ (OBJECT=ceph, BLOCK=ceph, LDS=16.05~bzr10071~release43+jenkins131-1).

The Openstack deployment failed due to an error in the rabbitmq cluster-relation-changed hook.

[From /var/log/juju/unit-rabbitmq-server-1.log (see http://paste.ubuntu.com/16874671/ for better formatting)]
2016-05-19 06:01:47 DEBUG juju-log cluster:20: Running ['/usr/sbin/rabbitmqctl', 'start_app']
2016-05-19 06:01:47 INFO cluster-relation-changed Starting node 'rabbit@10-96-9-0' ...
2016-05-19 06:01:58 INFO cluster-relation-changed
2016-05-19 06:01:58 INFO cluster-relation-changed
2016-05-19 06:01:58 INFO cluster-relation-changed BOOT FAILED
2016-05-19 06:01:58 INFO cluster-relation-changed ===========
2016-05-19 06:01:58 INFO cluster-relation-changed
2016-05-19 06:01:58 INFO cluster-relation-changed Error description:
2016-05-19 06:01:58 INFO cluster-relation-changed {could_not_start,mnesia,
2016-05-19 06:01:58 INFO cluster-relation-changed {{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}
},
2016-05-19 06:01:58 INFO cluster-relation-changed {mnesia_sup,start,[normal,[]]}}}
2016-05-19 06:01:58 INFO cluster-relation-changed
2016-05-19 06:01:58 INFO cluster-relation-changed Log files (may contain more information):
2016-05-19 06:01:58 INFO cluster-relation-changed /var/log/rabbitmq/rabbit@10-96-9-0.log
2016-05-19 06:01:58 INFO cluster-relation-changed /var/log/rabbitmq/rabbit@10-96-9-0-sasl.log
2016-05-19 06:01:58 INFO cluster-relation-changed
2016-05-19 06:01:59 INFO cluster-relation-changed Error: {rabbit,failure_during_boot,
2016-05-19 06:01:59 INFO cluster-relation-changed {could_not_start,mnesia,
2016-05-19 06:01:59 INFO cluster-relation-changed {{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}},
2016-05-19 06:01:59 INFO cluster-relation-changed {mnesia_sup,start,[normal,[]]}}}}
2016-05-19 06:01:59 INFO cluster-relation-changed Traceback (most recent call last):
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/cluster-relation-changed", line 751, in <module>
2016-05-19 06:01:59 INFO cluster-relation-changed hooks.execute(sys.argv)
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/charmhelpers/core/hookenv.py", line 717, in execute
2016-05-19 06:01:59 INFO cluster-relation-changed self._hooks[hook_name]()
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/cluster-relation-changed", line 383, in cluster_changed
2016-05-19 06:01:59 INFO cluster-relation-changed rabbit.cluster_with()
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/rabbit_utils.py", line 389, in cluster_with
2016-05-19 06:01:59 INFO cluster-relation-changed start_app()
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/rabbit_utils.py", line 338, in start_app
2016-05-19 06:01:59 INFO cluster-relation-changed rabbitmqctl('start_app')
2016-05-19 06:01:59 INFO cluster-relation-changed File "/var/lib/juju/agents/unit-rabbitmq-server-1/charm/hooks/rabbit_utils.py", line 308, in rabbitmqctl
2016-05-19 06:01:59 INFO cluster-relation-changed subprocess.check_call(cmd)
2016-05-19 06:01:59 INFO cluster-relation-changed File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
2016-05-19 06:01:59 INFO cluster-relation-changed raise CalledProcessError(retcode, cmd)
2016-05-19 06:01:59 INFO cluster-relation-changed subprocess.CalledProcessError: Command '['/usr/sbin/rabbitmqctl', 'start_app']' returned non-zero exit status 2
2016-05-19 06:01:59 ERROR juju.worker.uniter.operation runhook.go:107 hook "cluster-relation-changed" failed: exit status 1

Tags: landscape
Revision history for this message
Francis Ginther (fginther) wrote :

Logs from all rabbitmq-server units.

tags: added: landscape
tags: added: kanban-cross-team
tags: removed: kanban-cross-team
description: updated
James Page (james-page)
Changed in rabbitmq-server (Juju Charms Collection):
status: New → Invalid
Revision history for this message
James Page (james-page) wrote :

The message in the logs is generated when the new rabbitmq-server node was unable to merge its internal cluster state with that of the existing clustered units - it looks odd in that one unit appears twice:

 ** FATAL ** Failed to merge schema: {aborted,
                                      {combine_error,schema,
                                       ['rabbit@10-96-9-197',
                                        'rabbit@10-96-9-197',
                                        'rabbit@10-96-9-0',
                                        'rabbit@10-96-8-114']}}

I've not seen this specific issue in testing or production and there are no other similar bug reports so I'm going to mark this as low; we still need to reproduce it.

Changed in charm-rabbitmq-server:
status: New → Incomplete
importance: Undecided → Low
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack rabbitmq-server charm because there has been no activity for 60 days.]

Changed in charm-rabbitmq-server:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.