rabbitMQ deploy always fails 'non-zero return code'

Bug #1877107 reported by Paul Chattaway
This bug affects 1 person
Affects        Status   Importance  Assigned to  Milestone
kolla-ansible  Invalid  Undecided   Unassigned

Bug Description

Hi,

The RabbitMQ container fails to start and is stuck restarting on a minimal CentOS Linux release 8.1.1911 build with kolla-ansible version 9.1.0.

Client: Docker Engine - Community
 Version: 19.03.8
 API version: 1.40
 Go version: go1.12.17
 Git commit: afacb8b
 Built: Wed Mar 11 01:27:04 2020
 OS/Arch: linux/amd64
 Experimental: false

Server: Docker Engine - Community
 Engine:
  Version: 19.03.8
  API version: 1.40 (minimum version 1.12)

The RabbitMQ container deploy fails with the following error:

TASK [rabbitmq : Check rabbitmq containers] ***********************************************************************************************************************************************
changed: [localhost] => (item={'key': 'rabbitmq', 'value': {'container_name': 'rabbitmq', 'group': 'rabbitmq', 'enabled': True, 'image': 'kolla/centos-source-rabbitmq:train-centos8', 'bootstrap_environment': {'KOLLA_BOOTSTRAP': None, 'KOLLA_CONFIG_STRATEGY': 'COPY_ALWAYS', 'RABBITMQ_CLUSTER_COOKIE': 'VPaoAmxzmKXbpuqb1uGq6YY5tNvZ7CFenYWyovxd', 'RABBITMQ_LOG_DIR': '/var/log/kolla/rabbitmq'}, 'environment': {'KOLLA_CONFIG_STRATEGY': 'COPY_ALWAYS', 'RABBITMQ_CLUSTER_COOKIE': 'VPaoAmxzmKXbpuqb1uGq6YY5tNvZ7CFenYWyovxd', 'RABBITMQ_LOG_DIR': '/var/log/kolla/rabbitmq'}, 'volumes': ['/etc/kolla/rabbitmq/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '', 'rabbitmq:/var/lib/rabbitmq/', 'kolla_logs:/var/log/kolla/'], 'dimensions': {}, 'haproxy': {'rabbitmq_management': {'enabled': 'yes', 'mode': 'http', 'port': '15672', 'host_group': 'rabbitmq'}, 'rabbitmq_outward_management': {'enabled': False, 'mode': 'http', 'port': '15674', 'host_group': 'outward-rabbitmq'}, 'rabbitmq_outward_external': {'enabled': False, 'mode': 'tcp', 'external': True, 'port': '5674', 'host_group': 'outward-rabbitmq', 'frontend_tcp_extra': ['timeout client 1h'], 'backend_tcp_extra': ['timeout server 1h']}}}})

TASK [rabbitmq : include_tasks] ***********************************************************************************************************************************************************
included: /usr/local/share/kolla-ansible/ansible/roles/rabbitmq/tasks/bootstrap.yml for localhost

TASK [rabbitmq : Creating rabbitmq volume] ************************************************************************************************************************************************
ok: [localhost]

TASK [rabbitmq : Running RabbitMQ bootstrap container] ************************************************************************************************************************************
skipping: [localhost]

RUNNING HANDLER [rabbitmq : Restart rabbitmq container (first node)] **********************************************************************************************************************
changed: [localhost]

RUNNING HANDLER [rabbitmq : Waiting for rabbitmq to start on first node] ******************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "docker exec rabbitmq rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbitmq.pid", "delta": "0:00:01.477935", "end": "2020-05-06 13:28:04.299515", "msg": "non-zero return code", "rc": 137, "start": "2020-05-06 13:28:02.821580", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
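
A note on the "rc": 137 in the failure above: by the usual shell convention, return codes above 128 mean the process was terminated by signal (rc - 128), so 137 points to signal 9 (SIGKILL on Linux). That would fit the docker exec being cut off while the container restarts, though this is an interpretation and not something the output states. A minimal Python sketch of the decoding (the helper name is hypothetical):

```python
import signal


def describe_return_code(rc: int) -> str:
    """Describe a return code using the common convention that
    values above 128 mean 'terminated by signal (rc - 128)'."""
    if rc == 0:
        return "success"
    if rc > 128:
        signum = rc - 128
        try:
            # Look up the platform's name for this signal number.
            name = signal.Signals(signum).name
        except ValueError:
            # Unknown signal number on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {rc}"


print(describe_return_code(137))
```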

I've tried to destroy and recreate, and cleaned up Docker and containerd from /var/lib and /etc. Nothing seems to help.

Docker logs from the container show econnrefused; neither SELinux nor iptables is running.

++ cat /run_command
+ CMD=/usr/sbin/rabbitmq-server
+ ARGS=
+ sudo kolla_copy_cacerts
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ : /var/log/kolla/rabbitmq
++ [[ -n '' ]]
++ [[ ! -d /var/log/kolla/rabbitmq ]]
+++ stat -c %a /var/log/kolla/rabbitmq
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/rabbitmq
+ echo 'Running command: '\''/usr/sbin/rabbitmq-server'\'''
+ exec /usr/sbin/rabbitmq-server
Running command: '/usr/sbin/rabbitmq-server'
econnrefused
Protocol 'inet_tcp': register/listen error: + sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq-env.conf to /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq.conf to /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Deleting /etc/rabbitmq/erl_inetrc
INFO:__main__:Copying /var/lib/kolla/config_files/erl_inetrc to /etc/rabbitmq/erl_inetrc
INFO:__main__:Setting permission for /etc/rabbitmq/erl_inetrc
INFO:__main__:Deleting /etc/rabbitmq/definitions.json
INFO:__main__:Copying /var/lib/kolla/config_files/definitions.json to /etc/rabbitmq/definitions.json
INFO:__main__:Setting permission for /etc/rabbitmq/definitions.json
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/lib/rabbitmq
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema
INFO:__main__:Setting permission for /var/lib/rabbitmq/config
INFO:__main__:Setting permission for /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia/rabbitmq.pid
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema/rabbit.schema
INFO:__main__:Setting permission for /var/log/kolla/rabbitmq
++ cat /run_command
+ CMD=/usr/sbin/rabbitmq-server
+ ARGS=
+ sudo kolla_copy_cacerts
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ : /var/log/kolla/rabbitmq
++ [[ -n '' ]]
++ [[ ! -d /var/log/kolla/rabbitmq ]]
+++ stat -c %a /var/log/kolla/rabbitmq
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/rabbitmq
+ echo 'Running command: '\''/usr/sbin/rabbitmq-server'\'''
+ exec /usr/sbin/rabbitmq-server
Running command: '/usr/sbin/rabbitmq-server'
econnrefused
Protocol 'inet_tcp': register/listen error: + sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq-env.conf to /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq.conf to /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq.conf
INFO:__main__:Deleting /etc/rabbitmq/erl_inetrc
INFO:__main__:Copying /var/lib/kolla/config_files/erl_inetrc to /etc/rabbitmq/erl_inetrc
INFO:__main__:Setting permission for /etc/rabbitmq/erl_inetrc
INFO:__main__:Deleting /etc/rabbitmq/definitions.json
INFO:__main__:Copying /var/lib/kolla/config_files/definitions.json to /etc/rabbitmq/definitions.json
INFO:__main__:Setting permission for /etc/rabbitmq/definitions.json
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/lib/rabbitmq
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema
INFO:__main__:Setting permission for /var/lib/rabbitmq/config
INFO:__main__:Setting permission for /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Setting permission for /var/lib/rabbitmq/mnesia/rabbitmq.pid
INFO:__main__:Setting permission for /var/lib/rabbitmq/schema/rabbit.schema
INFO:__main__:Setting permission for /var/log/kolla/rabbitmq
++ cat /run_command
+ CMD=/usr/sbin/rabbitmq-server
+ ARGS=
+ sudo kolla_copy_cacerts
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ : /var/log/kolla/rabbitmq
++ [[ -n '' ]]
++ [[ ! -d /var/log/kolla/rabbitmq ]]
+++ stat -c %a /var/log/kolla/rabbitmq
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/rabbitmq
Running command: '/usr/sbin/rabbitmq-server'
+ echo 'Running command: '\''/usr/sbin/rabbitmq-server'\'''
+ exec /usr/sbin/rabbitmq-server
econnrefused

globals.yml is as follows; I've set a lot of things to "no" to keep it minimal.

##########
# Telegraf
##########
# Configure telegraf to use the docker daemon itself as an input for
# telemetry data.
#telegraf_enable_docker_input: "no"
docker_namespace: "kolla"
enable_glance: "no"
enable_haproxy: "no"
enable_keystone: "no"
enable_mariadb: "no"
enable_memcached: "no"
enable_neutron: "no"
enable_nova: "no"
enable_barbican: "no"
enable_mistral: "no"
enable_tacker: "no"
enable_heat: "no"
enable_openvswitch: "no"
enable_horizon: "no"
enable_horizon_tacker: "{{ enable_tacker | bool }}"
enable_magnum: "no"
enable_manila: "no"
#cinder_backend_ceph: "{{ enable_ceph }}"
#cinder_volume_group: "cinder-volumes"
#cinder_backup_driver: "ceph"
#enable_ceph_dashboard: "{{ enable_ceph | bool }}"
enable_chrony: "no"
enable_cinder: "yes"
enable_rabbitmq: "yes"
enable_rabbitmq: "{{ 'yes' if om_rpc_transport == 'rabbit' or om_notify_transport == 'rabbit' else 'no' }}"
enable_ceilometer: "no"
network_interface: "enp3s0f1"
openstack_logging_debug: "True"
enable_openstack_core: "yes"
enable_aodh: "no"
enable_barbican: "no"
enable_ceilometer: "no"
enable_collectd: "no"
enable_elasticsearch: "{{ 'yes' if enable_central_logging | bool or enable_osprofiler | bool or enable_skydive | bool or enable_monasca | bool else 'no' }}"
#enable_etcd: "no"
enable_fluentd: "no"
enable_freezer: "no"
enable_gnocchi: "no"
enable_grafana: "no"
enable_heat: "{{ enable_openstack_core | bool }}"
enable_horizon: "{{ enable_openstack_core | bool }}"
enable_horizon_blazar: "{{ enable_blazar | bool }}"
enable_horizon_freezer: "{{ enable_freezer | bool }}"
enable_horizon_heat: "{{ enable_heat | bool }}"
enable_horizon_magnum: "{{ enable_magnum | bool }}"
enable_horizon_mistral: "{{ enable_mistral | bool }}"
enable_horizon_octavia: "{{ enable_octavia | bool }}"
enable_horizon_sahara: "{{ enable_sahara | bool }}"
enable_horizon_zun: "{{ enable_zun | bool }}"
enable_kafka: "{{ enable_monasca | bool }}"
enable_kibana: "{{ 'yes' if enable_central_logging | bool or enable_monasca | bool else 'no' }}"
enable_mistral: "no"
enable_octavia: "no"
enable_placement: "{{ enable_nova | bool or enable_zun | bool }}"
enable_sahara: "no"
enable_swift: "no"
enable_horizon_zun: "{{ enable_zun | bool }}"
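
Several keys in the globals.yml above are defined more than once (for example enable_rabbitmq, enable_barbican, enable_ceilometer, enable_heat, enable_horizon and enable_mistral). YAML loaders typically keep only the last value for a repeated key, so an earlier "no" may be silently overridden; treat that as an assumption to verify against the tooling in use. A minimal Python sketch for spotting such duplicates, using a plain regular expression rather than a YAML parser (the function name is hypothetical, and only uncommented top-level keys are considered):

```python
import re
from collections import defaultdict

# Matches an uncommented, unindented "key:" at the start of a line.
KEY_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:")


def find_duplicate_keys(text: str) -> dict[str, list[int]]:
    """Return {key: [line numbers]} for top-level keys defined more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if match:
            seen[match.group(1)].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


# A few lines copied from the globals.yml in the report.
sample = """\
enable_cinder: "yes"
enable_rabbitmq: "yes"
enable_rabbitmq: "{{ 'yes' if om_rpc_transport == 'rabbit' or om_notify_transport == 'rabbit' else 'no' }}"
#enable_etcd: "no"
enable_heat: "no"
enable_heat: "{{ enable_openstack_core | bool }}"
"""

print(find_duplicate_keys(sample))
```

To check a real file, pass its contents instead of the sample, e.g. find_duplicate_keys(open("/etc/kolla/globals.yml").read()), assuming that is where globals.yml lives on the host.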

Paul Chattaway (mppace) wrote :

I might have fixed it by deleting /var/lib/docker/volumes/rabbitmq/_data/.erlang.cookie before deploy.

I will report back if it does, and leave this here as a reference for others.
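
The workaround above can be sketched in Python as follows. This is only an illustration: the path is the one quoted in the comment and assumes Docker's default data root, the function name is hypothetical, and on a real host it would need root privileges and should presumably be run while the rabbitmq container is stopped.

```python
from pathlib import Path

# Path taken from the comment above; assumes Docker's default data root.
COOKIE = Path("/var/lib/docker/volumes/rabbitmq/_data/.erlang.cookie")


def remove_stale_cookie(cookie: Path = COOKIE) -> bool:
    """Delete the cookie file if present; return True if a file was removed."""
    try:
        cookie.unlink()
    except FileNotFoundError:
        # Nothing to clean up.
        return False
    return True
```

Calling remove_stale_cookie() before the deploy returns True if a leftover cookie was deleted and False if there was none.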

Paul Chattaway (mppace) wrote :

A non-issue, I think; my guess is a dodgy OS rather than kolla.
The destroy command wasn't clearing out the Docker containers and volumes properly; destroy should remove the .erlang.cookie as part of the removal. Removing it allowed RabbitMQ to progress, but there were further issues down the line with MariaDB and Swift. Helpful commands were docker container ls and docker volume ls: these should be empty after 'destroy' and weren't. Long story short, I wiped the OS and started fresh, and things are running OK, with deploy and destroy both working. I have cleaned up the globals.yml a bit, so my current working globals.yml is below. Two things of note: Swift is disabled because it errors, and so is Monasca because there is no Elasticsearch package available for CentOS 8. I haven't looked into why yet, but the config below works on a one-box build with 2 NICs; just change the IPs and NICs. Bear in mind I've only tested up to install completion and logging in through the dashboard, not functionality.

kolla_install_type: "source"
kolla_internal_vip_address: "10.1.1.11"
kolla_external_vip_address: "10.2.1.1"
kolla_external_vip_interface: "enp2s0f0"
network_interface: "enp3s0f1"
docker_namespace: "kolla"
enable_haproxy: "no"
enable_keystone: "yes"
enable_mariadb: "yes"
enable_memcached: "yes"
enable_neutron: "yes"
enable_nova: "yes"
enable_barbican: "yes"
enable_mistral: "yes"
enable_tacker: "yes"
enable_heat: "yes"
enable_openvswitch: "no"
enable_horizon: "yes"
enable_magnum: "yes"
enable_manila: "yes"
enable_chrony: "yes"
enable_cinder: "yes"
enable_rabbitmq: "yes"
enable_ceilometer: "yes"
openstack_logging_debug: "True"
enable_openstack_core: "yes"
enable_aodh: "yes"
#enable_elasticsearch: "{{ 'yes' if enable_central_logging | bool or enable_osprofiler | bool or enable_skydive | bool or enable_monasca | bool else 'no' }}"
#enable_etcd: "no"
enable_fluentd: "no"
enable_freezer: "yes"
enable_gnocchi: "yes"
enable_grafana: "no"
enable_mistral: "yes"
enable_octavia: "yes"
enable_sahara: "yes"
enable_monasca: "no"
enable_cells: "yes"
enable_barbican: "yes"
enable_glance: "yes"
enable_swift: "no"
enable_ceilometer: "yes"
enable_collectd: "yes"
enable_zun: "yes"
enable_blazar: "yes"
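
The post-destroy check described in the comment above (docker container ls and docker volume ls should both come back empty) could be automated along these lines. This is a hedged Python sketch: the function names are hypothetical, it adds --all so that stopped containers are listed too, and it assumes the host runs nothing but the kolla deployment, since any other container or volume would also be reported.

```python
import subprocess
from typing import Callable, Sequence


def run_docker(args: Sequence[str]) -> str:
    """Run a docker CLI command and return its stdout."""
    result = subprocess.run(
        ["docker", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def leftovers(run: Callable[[Sequence[str]], str] = run_docker) -> dict[str, list[str]]:
    """List the IDs/names of containers and volumes that still exist."""
    return {
        # --all includes stopped containers, --quiet prints only IDs.
        "containers": run(["container", "ls", "--all", "--quiet"]).split(),
        # --quiet prints only volume names.
        "volumes": run(["volume", "ls", "--quiet"]).split(),
    }


def is_clean(found: dict[str, list[str]]) -> bool:
    """True when no containers and no volumes are left."""
    return not any(found.values())
```

After a destroy, is_clean(leftovers()) should be True; if it is not, the returned dictionary shows what was left behind. The runner is passed in as a parameter so the logic can be exercised without a Docker daemon.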

Mark Goddard (mgoddard)
Changed in kolla:
status: New → Invalid
affects: kolla → kolla-ansible