kolla deploy fails due to continuous restart of rabbitmq

Bug #1840369 reported by Fabio Della Giustina
This bug affects 3 people
Affects: kolla-ansible
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

ansible version: 2.8.3
kolla-ansible version: 8.0.0
kolla_base_distro: "centos"
kolla_install_type: "source"
openstack_release: "stein"
all-in-one deploy

I followed the steps in https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html.

During kolla deploy, the rabbitmq container keeps restarting and the deploy fails.

This is the error:
RUNNING HANDLER [rabbitmq : Waiting for rabbitmq to start on first node]
fatal: [localhost]: FAILED! => {
  "changed": true,
  "cmd": "docker exec rabbitmq rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbitmq.pid",
  "delta": "0:00:00.468144",
  "end": "2019-08-15 22:36:06.266080",
  "msg": "non-zero return code",
  "rc": 137,
  "start": "2019-08-15 22:36:05.797936",
  "stderr": "",
  "stderr_lines": [],
  "stdout": "",
  "stdout_lines": []
}
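
For reference, a return code above 128 conventionally means the process was terminated by a signal (return code minus 128), so rc 137 corresponds to signal 9 (SIGKILL). That would be consistent with the container being restarted while the 'docker exec' was still running, though the output alone does not prove it. A minimal Python sketch of the decoding, assuming a Linux host; the helper name describe_rc is hypothetical:

```python
import signal


def describe_rc(rc: int) -> str:
    """Describe a shell-style return code (hypothetical helper)."""
    if rc > 128:
        # By shell convention, 128 + N means "terminated by signal N".
        signum = rc - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"exited with status {rc}"


print(describe_rc(137))
```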

These are the rabbitmq container logs:
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq-env.conf to /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq.conf to /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Deleting /etc/rabbitmq/definitions.json
INFO:__main__:Copying /var/lib/kolla/config_files/definitions.json to /etc/rabbitmq/definitions.json
INFO:__main__:Setting permission for /etc/rabbitmq/definitions.json
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/lib/rabbitmq
INFO:__main__:Setting permission for /var/lib/rabbitmq/config
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia
INFO:__main__:Setting permission for /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema/rabbit.schema
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia/rabbitmq.pid
INFO:__main__:Setting permission for /var/log/kolla/rabbitmq
++ cat /run_command
+ CMD=/usr/sbin/rabbitmq-server
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ : /var/log/kolla/rabbitmq
++ [[ -n '' ]]
++ [[ ! -d /var/log/kolla/rabbitmq ]]
+++ stat -c %a /var/log/kolla/rabbitmq
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/rabbitmq
+ echo 'Running command: '\''/usr/sbin/rabbitmq-server'\'''
Running command: '/usr/sbin/rabbitmq-server'
+ exec /usr/sbin/rabbitmq-server
ERROR: epmd error for host Fabio-Blade: address (cannot connect to host/port)

I don't know what is causing this error (the prechecks pass).

Mark Goddard (mgoddard) wrote :

Do you have a firewall running on the machine?

Fabio Della Giustina (fabiodellagiustina) wrote :

I'm running Ubuntu Desktop 18.04.3 without any firewall active.

fabio@Fabio-Blade:~$ sudo ufw status
Status: inactive

Ondrej Melichar (ondrejm) wrote :

Any chance you forgot to add host Fabio-Blade to /etc/hosts?

I would try running 'epmd -d' to get more information in debug mode, and also check rabbitmq-env.conf for possible misconfigurations. This can also happen if rabbitmq is unable to write its logs.

Fabio Della Giustina (fabiodellagiustina) wrote :

My /etc/hosts file looks like this:
127.0.0.1 localhost
127.0.1.1 Fabio-Blade
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
# BEGIN ANSIBLE GENERATED HOSTS
192.168.1.101 Fabio-Blade
# END ANSIBLE GENERATED HOSTS

This is 'epmd -d' output:
epmd: Tue Aug 20 17:48:21 2019: epmd running - daemon = 0
epmd: Tue Aug 20 17:48:43 2019: ** got ALIVE2_REQ
epmd: Tue Aug 20 17:48:43 2019: registering 'rabbitmqprelaunch6:2', port 39529
epmd: Tue Aug 20 17:48:43 2019: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Tue Aug 20 17:48:43 2019: ** sent ALIVE2_RESP for "rabbitmqprelaunch6"
epmd: Tue Aug 20 17:48:43 2019: ** got NAMES_REQ
epmd: Tue Aug 20 17:48:43 2019: ** sent NAMES_RESP
epmd: Tue Aug 20 17:48:43 2019: unregistering 'rabbitmqprelaunch6:2', port 39529
epmd: Tue Aug 20 17:48:43 2019: ** got ALIVE2_REQ
epmd: Tue Aug 20 17:48:43 2019: registering 'rabbit:2', port 25672
epmd: Tue Aug 20 17:48:43 2019: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Tue Aug 20 17:48:43 2019: ** sent ALIVE2_RESP for "rabbit"
epmd: Tue Aug 20 17:48:48 2019: ** got PORT2_REQ
epmd: Tue Aug 20 17:48:48 2019: ** sent PORT2_RESP (ok) for "rabbit"
epmd: Tue Aug 20 17:48:48 2019: ** got PORT2_REQ
epmd: Tue Aug 20 17:48:48 2019: ** sent PORT2_RESP (ok) for "rabbit"
epmd: Tue Aug 20 17:49:48 2019: ** got PORT2_REQ
epmd: Tue Aug 20 17:49:48 2019: ** sent PORT2_RESP (ok) for "rabbit"
epmd: Tue Aug 20 17:49:48 2019: ** got ALIVE2_REQ
epmd: Tue Aug 20 17:49:48 2019: registering 'epmd-starter-125644922:1', port 38649
epmd: Tue Aug 20 17:49:48 2019: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Tue Aug 20 17:49:48 2019: ** sent ALIVE2_RESP for "epmd-starter-125644922"
epmd: Tue Aug 20 17:49:48 2019: unregistering 'epmd-starter-125644922:1', port 38649
epmd: Tue Aug 20 17:50:48 2019: ** got PORT2_REQ
epmd: Tue Aug 20 17:50:48 2019: ** sent PORT2_RESP (ok) for "rabbit"
epmd: Tue Aug 20 17:50:48 2019: ** got ALIVE2_REQ
epmd: Tue Aug 20 17:50:48 2019: registering 'epmd-starter-603360435:1', port 44575
epmd: Tue Aug 20 17:50:48 2019: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Tue Aug 20 17:50:48 2019: ** sent ALIVE2_RESP for "epmd-starter-603360435"
epmd: Tue Aug 20 17:50:48 2019: unregistering 'epmd-starter-603360435:1', port 44575
epmd: Tue Aug 20 17:51:48 2019: ** got PORT2_REQ
epmd: Tue Aug 20 17:51:48 2019: ** sent PORT2_RESP (ok) for "rabbit"
epmd: Tue Aug 20 17:51:49 2019: ** got ALIVE2_REQ
epmd: Tue Aug 20 17:51:49 2019: registering 'epmd-starter-947328378:2', port 37119
epmd: Tue Aug 20 17:51:49 2019: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Tue Aug 20 17:51:49 2019: ** sent ALIVE2_RESP for "epmd-starter-947328378"
epmd: Tue Aug 20 17:51:49 2019: unregistering 'epmd-starter-947328378:2', port 37119

/etc/kolla/rabbitmq/rabbitmq-env.conf file:
RABBITMQ_NODENAME=rabbit@Fabio-Blade
RABBITMQ_LOG_BASE=/var/log/kolla/rabbitmq
RABBITMQ_DIST_PORT=25672
RABBITMQ_PID_FILE=/var/lib/rabbitmq/mnesia/rabbitmq.pid
export ERL_EPMD_ADDRESS=192.168.1.101
export ERL_EPMD_PORT=4369

Do I have to configure anything here?
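
One way to narrow down an epmd connection error is to test whether anything accepts TCP connections on the address and port set by ERL_EPMD_ADDRESS and ERL_EPMD_PORT, and then again via the hostname used in RABBITMQ_NODENAME. The Python sketch below assumes the simple KEY=value / export KEY=value layout shown above; parse_env_conf and can_connect are hypothetical helper names, not part of RabbitMQ or kolla-ansible.

```python
import socket


def parse_env_conf(text: str) -> dict:
    """Parse KEY=value lines, tolerating a leading 'export '."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


conf = parse_env_conf("""\
RABBITMQ_NODENAME=rabbit@Fabio-Blade
export ERL_EPMD_ADDRESS=192.168.1.101
export ERL_EPMD_PORT=4369
""")
print(conf["ERL_EPMD_ADDRESS"], conf["ERL_EPMD_PORT"])
# On the affected host one could then compare, for example:
#   can_connect(conf["ERL_EPMD_ADDRESS"], int(conf["ERL_EPMD_PORT"]))
#   can_connect("Fabio-Blade", int(conf["ERL_EPMD_PORT"]))
```

If the first check succeeds and the second fails, the hostname is probably resolving to an address that epmd is not listening on.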

Fabio Della Giustina (fabiodellagiustina) wrote :

Apparently, if I run 'epmd -d', the rabbitmq container starts up.
A second deploy, run without destroying everything first, then works fine.
I tested this more than once.

So a fresh deploy still doesn't work.

Mark Goddard (mgoddard)
affects: kolla → kolla-ansible

Mark Goddard (mgoddard) wrote :

This is a duplicate of bug 1837699. The presence of a hosts entry that points the hostname to a loopback address breaks RabbitMQ:

/etc/hosts
127.0.1.1 Fabio-Blade
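
A check along these lines could catch the problem before deploying. This is only a sketch in Python: the function name loopback_entries is hypothetical, and it inspects /etc/hosts-style text only, not the full system resolver configuration.

```python
import ipaddress


def loopback_entries(hosts_text: str, hostname: str) -> list:
    """Return the loopback addresses that hosts_text maps hostname to."""
    wanted = hostname.lower()  # hostnames are case-insensitive
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], [n.lower() for n in fields[1:]]
        if wanted not in names:
            continue
        try:
            if ipaddress.ip_address(address).is_loopback:
                found.append(address)
        except ValueError:
            continue  # not a plain IP address, skip it
    return found


sample = """\
127.0.0.1 localhost
127.0.1.1 Fabio-Blade
# BEGIN ANSIBLE GENERATED HOSTS
192.168.1.101 Fabio-Blade
# END ANSIBLE GENERATED HOSTS
"""
print(loopback_entries(sample, "Fabio-Blade"))
```

With the hosts file from this report, the function returns the 127.0.1.1 entry, which is the line identified above as the cause.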

Fabio Della Giustina (fabiodellagiustina) wrote :

Thanks!
