rabbitmq-server fails to start if hostname is unresolvable or has changed since first starting

Bug #653405 reported by Takey McTaker on 2010-10-02
This bug affects 16 people
Affects                   Importance  Assigned to
rabbitmq-server (Ubuntu)  High        Marc Cluet
Lucid                     Undecided   Unassigned
Maverick                  Undecided   Unassigned
Natty                     High        Marc Cluet
Oneiric                   High        Marc Cluet

Bug Description

Binary package hint: rabbitmq-server

This error comes up with every update now.

ProblemType: Package
DistroRelease: Ubuntu 10.04
Package: rabbitmq-server 1.7.2-1ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-24.43-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic x86_64
NonfreeKernelModules: fglrx wl
Architecture: amd64
Date: Fri Oct 1 11:56:28 2010
ErrorMessage: subprocess installed post-installation script returned error exit status 1
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
PackageArchitecture: all
SourcePackage: rabbitmq-server
Title: package rabbitmq-server 1.7.2-1ubuntu1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 1

Clint Byrum (clint-fewbar) wrote :

Hi Takey, thanks for taking the time to file this bug report and help us make Ubuntu better!

The DpkgTerminalLog.txt contains the error message below, which suggests that the details of what caused rabbitmq to fail will be in /var/log/rabbitmq/startup_log:
Setting up rabbitmq-server (1.7.2-1ubuntu1) ...
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_log, _err
rabbitmq-server.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
dpkg: error processing rabbitmq-server (--configure):
 subprocess installed post-installation script returned error exit status 1

Can you please upload that file, and any others in that directory that might help us understand what caused this failure?

Marking Incomplete pending Takey's response.

Changed in rabbitmq-server (Ubuntu):
status: New → Incomplete
Takey McTaker (takeymctaker) wrote :

The startup_log file is attached. Note that no configuration or setup of RabbitMQ has been edited. These errors started happening immediately after the package was installed via Synaptic Package Manager.

Chuck Short (zulcss) wrote :

Can you check to see if there is an erl_crash.dump and attach it to this bug report?

thanks
chuck

Changed in rabbitmq-server (Ubuntu):
importance: Undecided → Low
Clint Byrum (clint-fewbar) wrote :

It seems there's already an erl_crash.dump attached.

I really don't know how to read an erl_crash.dump file, but I do see this in it:

only_loaded
'configuration must be a list ended by <dot><whitespace>'

Takey, is it possible your configuration is not formatted correctly?

Takey McTaker (takeymctaker) wrote :

I repeat: "Note that no configuration or setup of RabbitMQ has been edited." If there is a configuration error, it is the fault of Synaptic Package Manager, or a bad package configuration.

Brian Merritt (btmerr) wrote :

On a vanilla EC2 instance running the Alestic AMI ami-880c5ccd, with nothing changed except enabling multiverse and running 'apt-get update', installation of rabbitmq-server fails in the same fashion. Seems to me the package is broken.

root@host:~# apt-get install rabbitmq-server
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  rabbitmq-server
0 upgraded, 1 newly installed, 0 to remove and 7 not upgraded.
Need to get 0B/560kB of archives.
After this operation, 1,069kB of additional disk space will be used.
Preconfiguring packages ...
Selecting previously deselected package rabbitmq-server.
(Reading database ... 34794 files and directories currently installed.)
Unpacking rabbitmq-server (from .../rabbitmq-server_1.7.2-1ubuntu1_all.deb) ...
Processing triggers for man-db ...
Processing triggers for ureadahead ...
Setting up rabbitmq-server (1.7.2-1ubuntu1) ...
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_log, _err
rabbitmq-server.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
dpkg: error processing rabbitmq-server (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 rabbitmq-server
E: Sub-process /usr/bin/dpkg returned an error code (1)
root@host:~#

Clint Byrum (clint-fewbar) wrote :

Marking confirmed since there are 2 reports now.

Changed in rabbitmq-server (Ubuntu):
status: Incomplete → Confirmed
Takey McTaker (takeymctaker) wrote :

I have removed this package in order to avoid the error messages coming up after every other update or package installation. It's annoying. Package installations shouldn't require any form of user intervention until the point local customization is needed. I'll be installing RabbitMQ from source instead of apt-get in the future.

Brian Merritt (btmerr) wrote :

I figured out that this failure mode happens when the hostname of the server/node/instance that you're installing on is not resolvable. Simply adding the hostname to /etc/hosts (or DNS, etc.) will resolve this issue.
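
For anyone who wants to check for this before installing, here is a minimal sketch. It assumes lookups go through /etc/hosts and only inspects that file, so it will not see names that resolve via DNS; running getent hosts "$(hostname)" is the broader check. The helper name hosts_file_has_name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: hosts_file_has_name is a hypothetical helper, not part of any package.
# Succeeds if NAME appears as a hostname or alias on a non-comment line of FILE.
hosts_file_has_name() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }                  # strip comments; awk re-splits the fields
        NF >= 2 {
            for (i = 2; i <= NF; i++)       # field 1 is the address, the rest are names
                if ($i == name) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example use (commented out so the sketch has no side effects):
#   hosts_file_has_name "$(hostname)" || echo "hostname not in /etc/hosts" >&2
```

The comparison is case-sensitive and exact, which is stricter than real name resolution, so treat a failure as a hint to look at the file rather than as proof.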

Clint Byrum (clint-fewbar) wrote :

Brian, thanks for looking into that more closely. I've raised the Importance to Medium as the cross section of users is probably bigger than originally assessed.

Changed in rabbitmq-server (Ubuntu):
importance: Low → Medium
Cerin (chrisspen) wrote :

This error is interfering with the installation of other packages.

Trying to install or upgrade any package causes rabbitmq-server's buggy post-installation script to be re-run, which always fails, and this failure seems to block the installation of some unrelated packages.

For example, I tried installing the MySQL Workbench .deb (available from http://dev.mysql.com/downloads/workbench/), and it failed until I first purged the rabbitmq-server package.

Clint Byrum (clint-fewbar) wrote :

I've confirmed that rabbitmq in Lucid (1.7.2) has serious problems starting when the hostname is unresolvable.

I set my hostname to several nonexistent hostnames, and each time I got this on installation of rabbitmq-server:

Setting up rabbitmq-server (1.7.2-1ubuntu1) ...
Adding group `rabbitmq' (GID 131) ...
Done.
Adding system user `rabbitmq' (UID 121) ...
Adding new user `rabbitmq' (UID 121) with group `rabbitmq' ...
Not creating home directory `/var/lib/rabbitmq'.
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_log, _err
rabbitmq-server.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
dpkg: error processing rabbitmq-server (--configure):
 subprocess installed post-installation script returned error exit status 1
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
Errors were encountered while processing:
 rabbitmq-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

This also seems to break rabbitmq for subsequent restarts if you change the hostname.

Marking Confirmed.

Clint Byrum (clint-fewbar) wrote :

Oops, I meant to say raising importance to High, as this seems to make the package rather unstable.

Changed in rabbitmq-server (Ubuntu):
importance: Medium → High
summary: - package rabbitmq-server 1.7.2-1ubuntu1 failed to install/upgrade:
- subprocess installed post-installation script returned error exit status
- 1
+ rabbitmq-server fails to start if hostname is unresolvable

In a discussion in #rabbitmq on Freenode, the users there informed me that there are actually two problems at play here.

Comment #3's log indicates that something was already listening on rabbitmq's port, presumably rabbitmq. It does not indicate that there was a failure looking up hostname.

What really seems to be the big issue is that if hostname changes, rabbitmq cannot be restarted, nor can it be effectively queried for status. This is because somewhere in /var/lib/rabbitmq, the hostname is stored.

This actually means that any time hostname changes, rabbitmq cannot be restarted without clearing all of its persistent storage.

Ouch.

It's not clear that we can fix that, but we can at least make the init.d script provide some useful warnings when:

* hostname has changed from the original (rabbitmqctl status will predictably fail in this instance)
* hostname is unresolvable (a simple host lookup will do)

Instructing users to change the hostname back, and/or change it to something resolvable, would probably be the best way to go.

For now, a workaround is to run

rabbitmqctl status

If this fails with a message similar to this:
# rabbitmqctl status
Status of node rabbit@foo ...
Error: unable to connect to node rabbit@foo: nodedown
diagnostics:
- unable to connect to epmd on foo: nxdomain
- current node: rabbitmqctl612@foo
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==

then the hostname is probably unresolvable. However, if there is at least a listing of 'nodes and their ports on XXX', like this:

# rabbitmqctl status
Status of node rabbit@localhost ...
Error: unable to connect to node rabbit@localhost: nodedown
diagnostics:
- nodes and their ports on localhost: [{rabbitmqctl640,53615}]
- current node: rabbitmqctl640@localhost
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==

Then the hostname is resolvable, but rabbitmq is not running.

If, however, you have changed your hostname, you will get this error:

# rabbitmqctl status
Status of node 'rabbit@clint-MacBookPro' ...
Error: unable to connect to node 'rabbit@clint-MacBookPro': nodedown
diagnostics:

=ERROR REPORT==== 15-Dec-2010::11:06:16 ===
Error in process <0.36.0> on node 'rabbitmqctl781@clint-MacBookPro' with exit value: {badarg,[{erlang,list_to_existing_atom,["rabbit@localhost"]},{dist_util,recv_challenge,1},{dist_util,handshake_we_started,1}]}

- nodes and their ports on clint-MacBookPro: [{rabbit,53426},
                                              {rabbitmqctl781,56544}]
- current node: 'rabbitmqctl781@clint-MacBookPro'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: kaX/5T0iLfPB+YcqbsHCJA==

Note that it mentions list_to_existing_atom, ["rabbit@localhost"]. That is the old node name; the part after the @ is the old hostname, which you must set your hostname back to in order to start rabbitmq again. I will set about trying to hack that into the init.d script to at least give people a fighting chance to correct their hostname.

Also, if you try to start rabbitmq and you get TIMEOUT, that is caused by having changed the hostname as well.
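
The three cases above can be told apart mechanically by the strings they contain. A rough sketch follows, assuming the output format matches the samples quoted in this comment (other RabbitMQ or Erlang versions may word things differently); classify_status is a hypothetical helper name, not something shipped with rabbitmq.

```shell
#!/bin/sh
# Sketch only: classify_status is a hypothetical helper. It reads the text that
# `rabbitmqctl status` printed (on stdin) and matches the strings quoted above.
classify_status() {
    out=$(cat)
    case $out in
        *nxdomain*)
            echo "hostname-unresolvable" ;;
        *list_to_existing_atom*)
            # Checked before the "nodes and their ports" case, because the
            # changed-hostname output contains that listing as well.
            # Pull the host part out of ["rabbit@oldhost"].
            old=$(printf '%s\n' "$out" |
                  sed -n 's/.*list_to_existing_atom,\["[^@"]*@\([^"]*\)"].*/\1/p' |
                  head -n 1)
            echo "hostname-changed old=$old" ;;
        *"nodes and their ports on"*)
            echo "resolvable-but-not-running" ;;
        *)
            echo "unknown" ;;
    esac
}
```

It would be fed with something like: rabbitmqctl status 2>&1 | classify_status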

summary: - rabbitmq-server fails to start if hostname is unresolvable
+ rabbitmq-server fails to start if hostname is unresolvable or has
+ changed since first starting
Cerin (chrisspen) wrote :

I also encounter the same problem using the most recent rabbitmq-server_2.2.0-1_all.deb, so the problem's not isolated to the older 1.7.2 version.

Clint Byrum (clint-fewbar) wrote :

Cerin, is your hostname resolvable (ping `hostname`) ? Also did you try the rabbitmqctl tests I mentioned above?

Cerin (chrisspen) wrote :

Clint, it was resolving, but I had two entries for localhost in my hosts file, so "ping currenthostname" was coming back as "oldhostname", and I guess this was throwing off rabbitmq. After I removed the first entry, I was able to install the latest deb without error.

Clint Byrum (clint-fewbar) wrote :

I forwarded the link to this bug to <email address hidden>. I will set the status accordingly upon response.

Clint Byrum (clint-fewbar) wrote :

I received this very detailed response from Simon MacMullen of RabbitMQ, reposting here w/ his permission.

It seems that given the response below, neither issue will be easily fixable in versions of RabbitMQ prior to 2.2.0 or older versions of erlang.

I think this one will probably have to be a Won't Fix for versions prior to those mentioned below.

Marking Triaged.

Note that there is a merge request for the version of erlang that should fix the problem in bug #690068

=== Paste from email ===

On 25/01/11 14:31, Clint Byrum wrote:
> This bug has been reported in Ubuntu. Hopefully you guys already have it
> in the internal bug tracker:
>
> https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/653405

Hi Clint. There are two independent bugs being reported there:

1) RabbitMQ 1.7.2 fails to start when the hostname has changed.
2) RabbitMQ 1.7.2 fails to start when the hostname cannot be resolved.

The first issue is caused by the fact that we use Erlang's database,
Mnesia, internally, and that stores the hostname everywhere (since it's
designed for distributed use).

RabbitMQ 2.2.0 contains a workaround for this - the path to Mnesia now
contains the hostname, so if the hostname changes the old database will
not be found and a new one will be created. I'm not absolutely sure
where that was introduced between 1.7.2 and 2.2.0.

The second issue is an issue with Erlang's port mapper, epmd. We depend
on this (any Erlang app will), and it won't start if the hostname cannot
be resolved. This bug appears to exist in Erlang R13B3 and be fixed in
R14B. Again, I'm not sure exactly where the fix was introduced.
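
To illustrate the first point: with the hostname in the database path, each hostname the machine has had can leave its own directory behind. The sketch below lists directories that do not belong to a given hostname. The BASE/rabbit@HOST layout and the helper name stale_mnesia_dirs are assumptions for illustration, not something stated in the email; check where your own installation keeps its Mnesia data.

```shell
#!/bin/sh
# Sketch only: stale_mnesia_dirs is a hypothetical helper. The directory layout
# BASE/rabbit@HOST is an assumption; verify it on your own installation.
stale_mnesia_dirs() {
    base=$1
    host=$2
    for dir in "$base"/rabbit@*; do
        [ -d "$dir" ] || continue           # glob matched nothing, or not a directory
        case ${dir##*/} in
            "rabbit@$host") ;;              # database for the current hostname
            *) printf '%s\n' "$dir" ;;      # left over from some other hostname
        esac
    done
}

# Example use (commented out so the sketch has no side effects):
#   stale_mnesia_dirs /var/lib/rabbitmq/mnesia "$(hostname -s)"
```

The function only prints candidates; it deliberately does not delete anything, since those directories hold the broker's persistent state.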

Changed in rabbitmq-server (Ubuntu):
status: Confirmed → Triaged
Curtis Hovey (sinzui) wrote :

This bug also affects package upgrade scripts.

Marc Cluet (lynxman) on 2011-04-29
Changed in rabbitmq-server (Ubuntu):
assignee: nobody → Marc Cluet (lynxman)
Changed in rabbitmq-server (Ubuntu Natty):
importance: Undecided → High
status: New → Triaged
assignee: nobody → Marc Cluet (lynxman)
milestone: none → natty-updates
Marc Cluet (lynxman) wrote :

Small patch that adds a check to the rabbitmq-server init script for whether the hostname is resolvable

tags: added: patch
Marc Cluet (lynxman) wrote :

Here's the debdiff based on the same patch

Martin Pitt (pitti) wrote :

Eww, automatically changing configuration files in maintainer scripts? That looks like a no-go to me. It does not even check if the IP already exists in /etc/hosts, or whether the file is used at all (in nsswitch.conf).

Can we instead just make the package generate an error debconf note if /etc/hosts is misconfigured and fail gracefully?

Martin Pitt (pitti) wrote :

I also suggest forwarding this bug to Debian and discussing it with the maintainer there, to avoid a permanently different solution in Ubuntu.

I reject the current upload for natty-proposed.

Thank you!

Dustin Kirkland (kirkland) wrote :

It seems to me that the real problem is that the machine was installed with a partial /etc/hosts file.

I did an install yesterday of an 11.04 desktop, where I have exactly the 127.0.1.1 x201 entry I expected.

If we're missing that in server or cloud installations, we need to get to the bottom of that. What package or process creates the original /etc/hosts? I looked at base-files, rootskel, and isc-dhcp, but haven't found it yet...

Clint Byrum (clint-fewbar) wrote :

Spoke in person with a user who showed me how this is confirmed on maverick and lucid, marking as such.

Changed in rabbitmq-server (Ubuntu Lucid):
status: New → Confirmed
Changed in rabbitmq-server (Ubuntu Maverick):
status: New → Confirmed
tags: added: rls-mgr-o-tracking
Dave Walker (davewalker) on 2011-09-16
tags: added: server-o-rs
Clint Byrum (clint-fewbar) wrote :

Can anyone confirm that this still happens on oneiric? It seems we have R14B of erlang now, and version 2.5.0 of RabbitMQ, so according to upstream, there should be no further issue.

Dave Walker (davewalker) on 2011-09-23
Changed in rabbitmq-server (Ubuntu):
milestone: none → ubuntu-11.10
Dave Walker (davewalker) wrote :

Marking incomplete, until it is confirmed that this is still a bug.

Thanks.

Changed in rabbitmq-server (Ubuntu Oneiric):
status: Triaged → Incomplete
Clint Byrum (clint-fewbar) wrote :

I just confirmed that rabbitmq-server in oneiric will start and stop with an unresolvable hostname, and handles hostname changes more gracefully (though not perfectly; I'm still working up a bug report on the quirk that the new mnesia changes have created).

Changed in rabbitmq-server (Ubuntu Oneiric):
status: Incomplete → Fix Released
Emanuel Vianna (emanuelvianna) wrote :

Reading this bug report, I found an error in my /etc/hosts; fixing it resolved the problem here.
Thanks.

Using:
Ubuntu 10.04
Erlang 5.7.4
RabbitMQ Server 2.5.0-1

Oscar E. Ganteaume (oegbizz) wrote :

I have Ubuntu 11.10 and have been trying to install rabbitmq 2.7.1 without any luck. I get the following error message:

The following NEW packages will be installed:
  rabbitmq-server
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/2,711 kB of archives.
After this operation, 3,736 kB of additional disk space will be used.
Selecting previously deselected package rabbitmq-server.
(Reading database ... 409938 files and directories currently installed.)
Unpacking rabbitmq-server (from .../rabbitmq-server_2.7.1-1_all.deb) ...
Processing triggers for man-db ...
Processing triggers for ureadahead ...
Setting up rabbitmq-server (2.7.1-1) ...
Adding group `rabbitmq' (GID 129) ...
Done.
Adding system user `rabbitmq' (UID 118) ...
Adding new user `rabbitmq' (UID 118) with group `rabbitmq' ...
Not creating home directory `/var/lib/rabbitmq'.
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_{log, _err}
rabbitmq-server.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
dpkg: error processing rabbitmq-server (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 rabbitmq-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

Based on previous posts I have attached the erl_crash.dump

The /var/log/rabbitmq/startup_err is empty, but the /var/log/rabbitmq/startup_log contains the following:
Activating RabbitMQ plugins ...
0 plugins activated:

ERROR: epmd error for host "192": badarg (unknown POSIX error)

In case this helps, I am running AMD 64 bit architecture. I really have tried different options, but none of them have worked. Any help will be greatly appreciated. Thanks in advance

Oscar

I had the same problem; I resolved it by putting the local IP with the hostname in /etc/hosts.
rafa

Les Dunston (ldunston) wrote :

Just to clarify a little, I had to place the short hostname on localhost and not the FQDN in /etc/hosts to get rabbitmq to start up.

My entry looks something like this:

127.0.0.1 localhost servername servername.example.com

Installing on Ubuntu 10.04 fails if the hostname does not resolve.

For the benefit of people arriving here via search engines, here's what I had to do:
   apt-get remove rabbitmq-server
   vi /etc/hostname # fix the hostname so it resolves
   rm -rf /var/lib/rabbitmq # the old broken hostname is still somewhere here
   reboot
   apt-get install rabbitmq-server

shahar (shahar7000) wrote :

In case it's relevant here:

I installed rabbitmq today (2013-02-20).
 - My machine's hostname resolved before installation too, and it hasn't changed since the server was first installed a few days ago.
 - In any case, I reviewed my /etc/hosts and it seems to be 100% OK.
 - I'm on Ubuntu 12.04, 64-bit.
 - It's hosted inside VirtualBox.
 - The log files mentioned in the error message are empty (0 bytes).
 - erl_crash.dump is also empty.
 - I repeated the installation process (apt-get remove, autoremove, clean, autoclean, and then install), to no avail.

After installation I tried again to start the server manually:
# service rabbitmq-server start
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_{log, _err}
rabbitmq-server.
#

Following a suggestion above, I ran "rabbitmqctl status" and got these odd messages:
# rabbitmqctl status
{error_logger,{{2013,2,20},{13,3,37}},"Too short cookie string",[]}
{error_logger,{{2013,2,20},{13,3,37}},crash_report,[[{initial_call,{auth,init,['Argument__1']}},{pid,<0.19.0>},{registered_name,[]},{error_info,{exit,{"Too short cookie string",[{auth,init_cookie,0},{auth,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]},[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[<0.17.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,24},{reductions,827}],[]]}
{error_logger,{{2013,2,20},{13,3,37}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{"Too short cookie string",[{auth,init_cookie,0},{auth,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{offender,[{pid,undefined},{name,auth},{mfargs,{auth,start_link,[]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2013,2,20},{13,3,37}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2013,2,20},{13,3,37}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})

still unresolved, and no workaround found.
it's probably another misreported bug, and i'm still looking for solutions.
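
The "Too short cookie string" message comes from Erlang's cookie handling rather than from hostname lookup, which fits the suspicion that this is a different bug. One thing worth ruling out is an empty cookie file. The sketch below assumes the cookie lives at /var/lib/rabbitmq/.erlang.cookie (the "current node home dir" shown in earlier output, plus Erlang's usual cookie file name); cookie_looks_ok is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: cookie_looks_ok is a hypothetical helper. The default path is an
# assumption based on the home dir reported by rabbitmqctl earlier in this thread.
cookie_looks_ok() {
    file=${1:-/var/lib/rabbitmq/.erlang.cookie}
    [ -s "$file" ]      # true only if the file exists and is not empty
}

# Example use (commented out so the sketch has no side effects):
#   cookie_looks_ok || echo "Erlang cookie missing or empty" >&2
```

This says nothing about whether the cookie is otherwise valid; it only catches the empty-file case.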

dino99 (9d9) wrote :
Changed in rabbitmq-server (Ubuntu Natty):
status: Triaged → Invalid
Changed in rabbitmq-server (Ubuntu Maverick):
status: Confirmed → Invalid
hexes (ivan-s-borisov) wrote :

cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"

uname -a
Linux hexes-book 3.13.0-46-generic #77-Ubuntu SMP Mon Mar 2 18:23:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

sudo dpkg --configure -a
Setting up rabbitmq-server (3.2.4-1) ...
 * Starting message broker rabbitmq-server * FAILED - check /var/log/rabbitmq/startup_\{log, _err\}
                                                                         [fail]
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
dpkg: error processing package rabbitmq-server (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 rabbitmq-server

erl -sname foo
{error_logger,{{2015,3,3},{21,43,40}},"Protocol: ~tp: register/listen error: ~tp~n",["inet_tcp",econnrefused]}

cat /etc/hosts
127.0.0.1 hexes-book localhost

hostname
hexes-book

dino99 (9d9) on 2015-03-03
Changed in rabbitmq-server (Ubuntu Lucid):
status: Confirmed → Fix Released
Rupesh Chowdhary (rupeshrams) wrote :

Log in to the Fuel master node and check the hostname; it should be 'fuel'. During Fuel node installation, the fuelmenu settings default the hostname to 'fuel'. Don't customize it, as that will affect the rabbitmq-server service.

Check the fuel host entries in the /etc/hosts file.

If the hostname was customized, revert it to 'fuel' and restart the rabbitmq service:

# service rabbitmq-server restart

Caberset (caberset) wrote :

Just to be sure, take a look at your local network interfaces:

# ip addr

If there's no "lo" interface, then you should enable it:

# ifconfig lo up

Then start the server again and see whether it works now:

# systemctl start rabbitmq-server
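
To check the loopback state from a script rather than by eye, something like the following sketch could be used. It assumes the one-line output format of ip -o link show lo, where the interface flags appear as <LOOPBACK,UP,LOWER_UP>; loopback_is_up is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: loopback_is_up is a hypothetical helper. It reads the output of
# `ip -o link show lo` on stdin and looks for a standalone UP flag inside <...>.
loopback_is_up() {
    grep -Eq '<([A-Z0-9_-]+,)*UP(,[A-Z0-9_-]+)*>'
}

# Example use (commented out so the sketch has no side effects):
#   ip -o link show lo | loopback_is_up || echo "lo is down" >&2
```

The pattern requires UP as a whole flag, so LOWER_UP on its own does not count.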
