fails to start

Bug #1808766 reported by Michael Hudson-Doyle on 2018-12-17
This bug affects 3 people
Affects / Status / Importance / Assigned to / Milestone
erlang (Debian): status Fix Released, importance Unknown
erlang (Ubuntu): importance Undecided, Unassigned
rabbitmq-server (Ubuntu): importance Undecided, Unassigned

Bug Description

The manila test results are not at all happy on s390x:

http://autopkgtest.ubuntu.com/packages/m/manila/disco/s390x

This turns out to be because rabbitmq-server is failing to start:

Job for rabbitmq-server.service failed because the control process exited with error code.
See "systemctl status rabbitmq-server.service" and "journalctl -xe" for details.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
● rabbitmq-server.service - RabbitMQ Messaging Server
   Loaded: loaded (/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Sun 2018-12-16 23:04:30 UTC; 18ms ago
  Process: 6376 ExecStart=/usr/sbin/rabbitmq-server (code=exited, status=1/FAILURE)
 Main PID: 6376 (code=exited, status=1/FAILURE)

Dec 16 23:04:30 autopkgtest systemd[1]: rabbitmq-server.service: Main process exited, code=exited, status=1/FAILURE
Dec 16 23:04:30 autopkgtest systemd[1]: rabbitmq-server.service: Failed with result 'exit-code'.
Dec 16 23:04:30 autopkgtest systemd[1]: Failed to start RabbitMQ Messaging Server.
dpkg: error processing package rabbitmq-server (--configure):
 installed rabbitmq-server package post-installation script subprocess returned error exit status 1

I got cpaelzer to try on a non-autopkgtest machine with the same result.

summary: - fails to start on s390x
+ fails to start

It fails to start on install on both, although it fails differently.

The logs (journald) didn't have too much info other than failing, but when I ignore the service and run the binary I get:

x86:
/usr/sbin/rabbitmq-server
ERROR: epmd error for host d: address (cannot connect to host/port)

s390x:
/usr/sbin/rabbitmq-server
2018-12-17 09:17:28.163245
    args: []
    format: "Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only"
    label: {error_logger,error_msg}
2018-12-17 09:17:28.163616 crash_report #{label=>{proc_lib,crash},report=>[[{initial_call,{auth,init,['Argument__1']}},{pid,<0.58.0>},{registered_name,[]},{error_info,{error,"Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}},{ancestors,[net_sup,kernel_sup,<0.46.0>]},{message_queue_len,0},{messages,[]},{links,[<0.56.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,27},{reductions,559}],[]]}
2018-12-17 09:17:28.163996 supervisor_report #{label=>{supervisor,start_error},report=>[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{"Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}},{offender,[{pid,undefined},{id,auth},{mfargs,{auth,start_link,[]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
2018-12-17 09:17:28.166119 supervisor_report #{label=>{supervisor,start_error},report=>[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,auth,{"Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}},{offender,[{pid,undefined},{id,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
2018-12-17 09:17:28.168806 crash_report #{label=>{proc_lib,crash},report=>[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.45.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"ge...
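
The s390x error message says the cookie file must be accessible by owner only. A minimal sketch for checking that condition on a given machine follows; it assumes the check amounts to "no group or other permission bits set", and the helper name `is_owner_only` is hypothetical. It only inspects the file mode, so it cannot tell whether something else is behind the s390x failure.

```python
import os
import stat


def is_owner_only(path):
    """Return True if no group/other permission bits are set on path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Path taken from the error message in the log above.
    cookie = "/var/lib/rabbitmq/.erlang.cookie"
    if os.path.exists(cookie):
        perms = oct(stat.S_IMODE(os.stat(cookie).st_mode))
        print(cookie, perms, "owner-only:", is_owner_only(cookie))
    else:
        print(cookie, "not found (or not visible to this user)")
```

If this reports the file as owner-only while the server still logs the same error, the permissions on disk are probably not the real problem.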


I can't attach apport data to this bug (it is restricted to the original reporter), so I filed a duplicate to hold the crash data. See bug 1808774 for the crash files.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in rabbitmq-server (Ubuntu):
status: New → Confirmed
James Page (james-page) wrote :

I think this is due to the erlang-base package now enabling a shared epmd daemon by default; rabbitmq-server is unable to connect to it over the hostname of the server as it only listens on localhost by default.
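
One way to test this theory is to probe the epmd port over loopback and over the machine's hostname. The sketch below is only a rough diagnostic: `can_connect` is a hypothetical helper, and 4369 is epmd's default port, which a given installation may have changed.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    EPMD_PORT = 4369  # epmd's default port; an assumption for this host
    for host in ("127.0.0.1", socket.gethostname()):
        print(host, "reachable:", can_connect(host, EPMD_PORT))
```

If loopback is reachable and the hostname is not, that would be consistent with epmd listening on localhost only.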

James Page (james-page) wrote :

RabbitMQ includes:

After=network.target epmd@0.0.0.0.socket
Wants=network.target epmd@0.0.0.0.socket

in its systemd unit file; however the @ notation on the epmd socket is not compatible with the erlang packaging in Ubuntu and Debian (which only ships @'less versions).

James Page (james-page) wrote :

Raising erlang task as well because we might want to fix things there rather than rmq.

James Page (james-page) wrote :

Proposal to resolve this issue (as I quite like the way RabbitMQ uses @'ed units).

Ship @'less and @'ed epmd.{socket|service} units in the erlang packages, all installed but disabled by default.

So epmd does not get started by default; only when another erlang application expresses an interest in it with an After/Wants clause.

This also avoids races on reboot with apps that optionally start their own version of epmd if one is not already running.

James Page (james-page) wrote :

List of potentially impacted packages:

* averell
* ejabberd
* ejabberd-contrib
* ejabberd-mod-cron
* ejabberd-mod-log-chat
* ejabberd-mod-logsession
* ejabberd-mod-logxml
* ejabberd-mod-message-log
* ejabberd-mod-muc-log-http
* ejabberd-mod-post-log
* ejabberd-mod-pottymouth
* ejabberd-mod-rest
* ejabberd-mod-s2s-log
* ejabberd-mod-shcommands
* ejabberd-mod-statsdx
* ejabberd-mod-webpresence
* elixir
* manderlbot
* rabbitmq-server
* rebar
* tsung
* vim-vimerl
* wings3d
* yaws-chat
* yaws-mail
* yaws-wiki

James Page (james-page) wrote :

Alternatively we could drop the

BindToDevice=lo

stanza in the @'less socket unit.

James Page (james-page) wrote :

OR

rabbitmq-server installs without enabling and starting its daemon; requiring the user to reconfigure epmd (or disable it completely).

After validating the different options (sorry for my interim confusion) I think the least invasive solution for now is the second suggested change. The other options are not as good IMHO:
- the first adds a lot of delta to many packages
- the third would make the package require user interaction to work after install

Changes:
1. remove BindToDevice=lo in /lib/systemd/system/epmd.socket of erlang
2. change epmd@0.0.0.0.socket to epmd.socket in /lib/systemd/system/rabbitmq-server.service to ensure proper ordering

I checked that in a container and it works for me as well.
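
The two changes above can be sketched as plain text transformations on the unit file contents. This is illustrative only and not how the packages were actually patched; the function names are hypothetical, and the code works on strings rather than touching files under /lib/systemd/system.

```python
def use_plain_epmd_socket(unit_text):
    """Point Wants=/After= at epmd.socket instead of epmd@0.0.0.0.socket."""
    return unit_text.replace("epmd@0.0.0.0.socket", "epmd.socket")


def drop_bind_to_device(unit_text):
    """Remove any BindToDevice=lo line from a socket unit's text."""
    kept = [line for line in unit_text.splitlines()
            if line.strip() != "BindToDevice=lo"]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # The two lines quoted earlier from the rabbitmq-server unit file.
    service = (
        "After=network.target epmd@0.0.0.0.socket\n"
        "Wants=network.target epmd@0.0.0.0.socket\n"
    )
    print(use_plain_epmd_socket(service), end="")
```

On a real system the same effect is usually better achieved through the package itself or a systemd drop-in than by editing shipped unit files by hand.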

That also means that any other package "broken by recent erlang" would be fixed for now.
Mid term we can most likely follow Debian on resolving this.

Since this was initially found as an s390x issue, I retried there as well.
It is still crashing the same way on s390x, so we might have two independent issues here.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package erlang - 1:21.2+dfsg-1ubuntu1

---------------
erlang (1:21.2+dfsg-1ubuntu1) disco; urgency=medium

  * d/epmd.socket: Enable socket for all interfaces, resolving issues
    with startup of RabbitMQ and other clusters erlang services
    (LP: #1808766).

 -- James Page <email address hidden> Tue, 18 Dec 2018 09:17:25 +0000

Changed in erlang (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package rabbitmq-server - 3.7.8-4ubuntu2

---------------
rabbitmq-server (3.7.8-4ubuntu2) disco; urgency=medium

  * Resolve issues with startup of RabbitMQ with erlang provided
    epmd daemon (LP: #1808766):
    - d/rabbitmq-server.service: Wants/After epmd.socket, aligning
      with services actually provided by erlang in Ubuntu.
    - d/control: Bump minimum erlang-* package versions to ensure
      compatibility with epmd.socket configuration.

 -- James Page <email address hidden> Tue, 18 Dec 2018 09:18:27 +0000

Changed in rabbitmq-server (Ubuntu):
status: Confirmed → Fix Released
Changed in erlang (Debian):
status: Unknown → Fix Released