MOS 9.0 installation - RabbitMQ hostname issue

Bug #1603782 reported by Matija Draganovic
This bug affects 4 people
Affects             Status         Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Committed  High        Oleksiy Molchanov
Mitaka              Fix Released   High        Oleksiy Molchanov

Bug Description

While testing the latest Mirantis release (9.0, Mitaka) I ran into some issues while doing a test environment deployment. After some investigation, I noticed something suspicious about RabbitMQ behaviour.

After a couple of unsuccessful deployments I noticed that rabbitmq-server keeps failing after the initial setup. I believe the reason is a misconfigured NODENAME parameter, which doesn't use the system hostname correctly. The following steps show what's happening:

# Boot system from MOS9 ISO. At the Fuel menu set the following options:
DNS & Hostname >> Hostname >> "fuelsrv"
DNS & Hostname >> External DNS >> leave blank
DNS & Hostname >> Hostname to test DNS >> leave blank
Bootstrap Image >> Select "Skip building bootstrap image"
Set Root Password >> Quit Setup >> Save and Continue + press "F8" afterwards

# Fuel should install everything correctly. When the installation finishes, log in to the system and verify the hostname and RabbitMQ configuration:
[root@fuel ~]# hostname
fuel.domain.tld

[root@fuel ~]# hostnamectl
   Static hostname: fuelsrv
Transient hostname: fuel.domain.tld
         Icon name: computer-vm
           Chassis: vm
        Machine ID: 5e522aabe4bf4367a8ed779dcd1c21b6
           Boot ID: 187546a671094d7ea6433a608f0da8f5
    Virtualization: kvm
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-327.18.2.el7.x86_64
      Architecture: x86-64

[root@fuel ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.20.0.2 fuelsrv.domain.tld fuelsrv

[root@fuel ~]# cat /etc/rabbitmq/rabbitmq-env.conf
ERL_EPMD_ADDRESS=10.20.0.2
NODENAME=rabbit@fuel
NODE_IP_ADDRESS=10.20.0.2
NODE_PORT=5672
RABBITMQ_SERVER_ERL_ARGS=+K true +A30 +P 1048576

[root@fuel ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/rabbitmq-server.service.d
           └─limits.conf
   Active: active (running) since Sun 2016-07-17 11:30:34 UTC; 9min ago
 Main PID: 16598 (beam)
   Status: "Initialized"
   CGroup: /system.slice/rabbitmq-server.service
           ├─16598 /usr/lib64/erlang/erts-7.1/bin/beam -W w -A 64 -P 1048576 -K true -- -r...
           ├─16778 inet_gethost 4
           └─16779 inet_gethost 4

Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: RabbitMQ 3.6.1. Copyright (C) 200....
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: ## ## Licensed under the MP.../
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: ## ##
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: ########## Logs: /var/log/rabbit...g
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: ###### ## /var/log/rabbit...g
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: ##########
Jul 17 11:30:31 fuel.domain.tld rabbitmq-server[16598]: Starting broker...
Jul 17 11:30:34 fuel.domain.tld rabbitmq-server[16598]: systemd unit for activation check..."
Jul 17 11:30:34 fuel.domain.tld systemd[1]: Started RabbitMQ broker.
Jul 17 11:30:34 fuel.domain.tld rabbitmq-server[16598]: completed with 7 plugins.
Hint: Some lines were ellipsized, use -l to show in full.

# So far everything looks OK. Notice that the system hostname is currently at the default value ("fuel"). Now reboot the system, log back into it and check the aforementioned configuration again:
[root@fuelsrv ~]# hostname
fuelsrv

[root@fuelsrv ~]# hostnamectl
   Static hostname: fuelsrv
         Icon name: computer-vm
           Chassis: vm
        Machine ID: 5e522aabe4bf4367a8ed779dcd1c21b6
           Boot ID: f32fc83db9b24d05bd4070da8f8cc423
    Virtualization: kvm
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-327.18.2.el7.x86_64
      Architecture: x86-64

[root@fuelsrv ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.20.0.2 fuelsrv.domain.tld fuelsrv

[root@fuelsrv ~]# cat /etc/rabbitmq/rabbitmq-env.conf
ERL_EPMD_ADDRESS=10.20.0.2
NODENAME=rabbit@fuel
NODE_IP_ADDRESS=10.20.0.2
NODE_PORT=5672
RABBITMQ_SERVER_ERL_ARGS=+K true +A30 +P 1048576

[root@fuelsrv ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/rabbitmq-server.service.d
           └─limits.conf
   Active: failed (Result: start-limit) since Sun 2016-07-17 11:42:12 UTC; 31s ago
  Process: 2933 ExecStop=/usr/lib/rabbitmq/bin/rabbitmqctl stop (code=exited, status=69)
  Process: 2825 ExecStart=/usr/lib/rabbitmq/bin/rabbitmq-server (code=exited, status=1/FAILURE)
 Main PID: 2825 (code=exited, status=1/FAILURE)
   Status: "Exited."

Jul 17 11:42:12 fuelsrv systemd[1]: rabbitmq-server.service scheduled restart job.
Jul 17 11:42:12 fuelsrv systemd[1]: rabbitmq-server.service changed auto-restart -> dead
Jul 17 11:42:12 fuelsrv systemd[1]: Job rabbitmq-server.service/restart finished, result=done
Jul 17 11:42:12 fuelsrv systemd[1]: Converting job rabbitmq-server.service/restart -> r...art
Jul 17 11:42:12 fuelsrv systemd[1]: start request repeated too quickly for rabbitmq-ser...ice
Jul 17 11:42:12 fuelsrv systemd[1]: rabbitmq-server.service changed dead -> failed
Jul 17 11:42:12 fuelsrv systemd[1]: Job rabbitmq-server.service/start finished, result=failed
Jul 17 11:42:12 fuelsrv systemd[1]: Failed to start RabbitMQ broker.
Jul 17 11:42:12 fuelsrv systemd[1]: Unit rabbitmq-server.service entered failed state.
Jul 17 11:42:12 fuelsrv systemd[1]: rabbitmq-server.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

# Notice that rabbitmq-server is no longer running. Restarting the service via systemctl doesn't help, but starting the server directly indicates what the problem could be:
[root@fuelsrv ~]# systemctl restart rabbitmq-server
Job for rabbitmq-server.service failed because start of the service was attempted too often. See "systemctl status rabbitmq-server.service" and "journalctl -xe" for details.
To force a start use "systemctl reset-failed rabbitmq-server.service" followed by "systemctl start rabbitmq-server.service" again.

[root@fuelsrv ~]# runuser -u rabbitmq /usr/lib/rabbitmq/bin/rabbitmq-server
ERROR: epmd error for host fuel: nxdomain (non-existing domain)
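
The nxdomain error is consistent with the /etc/hosts output above: the file only maps 10.20.0.2 to fuelsrv.domain.tld and fuelsrv, and no external DNS was configured, so nothing resolves the name "fuel". A minimal sketch for checking a hosts file for a given name follows. The function name host_in_file is hypothetical, and it only inspects the file; it does not exercise the Erlang resolver that epmd actually uses.

```shell
# Return 0 if NAME appears as a hostname or alias in a hosts(5)-format file.
# Text after "#" on a line is ignored; field 1 (the IP address) is skipped.
host_in_file() {
    name=$1
    file=$2
    awk -v n="$name" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file"
}

# Example on the Fuel master (left commented so the file can be sourced safely):
# host_in_file fuel /etc/hosts || echo "fuel is not in /etc/hosts"
```

With the /etc/hosts shown in this report, the check succeeds for "fuelsrv" and fails for "fuel".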

# If we take a look at the RabbitMQ configuration we can see that it's actually using the wrong hostname for its NODENAME parameter (notice that the system hostname is now set to "fuelsrv", as set during the Fuel installation):
[root@fuelsrv ~]# cat /etc/rabbitmq/rabbitmq-env.conf
ERL_EPMD_ADDRESS=10.20.0.2
NODENAME=rabbit@fuel
NODE_IP_ADDRESS=10.20.0.2
NODE_PORT=5672
RABBITMQ_SERVER_ERL_ARGS=+K true +A30 +P 1048576
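
The mismatch can be checked mechanically. The sketch below pulls the host part out of NODENAME and compares it with a short hostname passed in by the caller; nodename_host and check_nodename are hypothetical helper names, not part of Fuel or RabbitMQ.

```shell
# Print the host part of NODENAME (the text after "@") from an env file.
nodename_host() {
    sed -n 's/^NODENAME=[^@]*@//p' "$1"
}

# Compare that host part with an expected short hostname.
# Returns non-zero and prints a message on mismatch.
check_nodename() {
    conf=$1
    expected=$2
    actual=$(nodename_host "$conf")
    if [ "$actual" = "$expected" ]; then
        echo "OK: NODENAME host '$actual' matches"
    else
        echo "MISMATCH: NODENAME host '$actual', system hostname '$expected'"
        return 1
    fi
}

# Example on the Fuel master (left commented so the file can be sourced safely):
# check_nodename /etc/rabbitmq/rabbitmq-env.conf "$(hostname -s)"
```

On the rebooted system described here, this would report a mismatch between "fuel" and "fuelsrv".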

# After a simple change to rabbitmq-env.conf the service should start successfully and work fine afterwards (and "survive" future reboots).
[root@fuelsrv ~]# cat /etc/rabbitmq/rabbitmq-env.conf
ERL_EPMD_ADDRESS=10.20.0.2
-NODENAME=rabbit@fuel
+NODENAME=rabbit@fuelsrv
NODE_IP_ADDRESS=10.20.0.2
NODE_PORT=5672
RABBITMQ_SERVER_ERL_ARGS=+K true +A30 +P 1048576

[root@fuelsrv ~]# runuser -u rabbitmq /usr/lib/rabbitmq/bin/rabbitmq-server

              RabbitMQ 3.6.1. Copyright (C) 2007-2016 Pivotal Software, Inc.
  ## ## Licensed under the MPL. See http://www.rabbitmq.com/
  ## ##
  ########## Logs: /<email address hidden>
  ###### ## /<email address hidden>
  ##########
              Starting broker...
 completed with 7 plugins.
^CStopping and halting node rabbit@fuelsrv ...
Gracefully halting Erlang VM

[root@fuelsrv ~]# grep -i fuelsrv /var/log/rabbitmq/rabbit@fuel*
/<email address hidden>:node : rabbit@fuelsrv
/<email address hidden>:log : /<email address hidden>
/<email address hidden>:sasl log : /<email address hidden>
/<email address hidden>:database dir : /var/lib/rabbitmq/mnesia/rabbit@fuelsrv
/<email address hidden>:Database directory at /var/lib/rabbitmq/mnesia/rabbit@fuelsrv is empty. Initialising from scratch...

[root@fuelsrv ~]# systemctl start rabbitmq-server
[root@fuelsrv ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/rabbitmq-server.service.d
           └─limits.conf
   Active: active (running) since Sun 2016-07-17 11:43:52 UTC; 5s ago
  Process: 2933 ExecStop=/usr/lib/rabbitmq/bin/rabbitmqctl stop (code=exited, status=69)
 Main PID: 3260 (beam)
   Status: "Initialized"
   CGroup: /system.slice/rabbitmq-server.service
           ├─3260 /usr/lib64/erlang/erts-7.1/bin/beam -W w -A 64 -P 1048576 -K true -- -ro...
           ├─3446 inet_gethost 4
           └─3447 inet_gethost 4

Jul 17 11:43:52 fuelsrv systemd[1]: rabbitmq-server.service: Got notification message f...60)
Jul 17 11:43:52 fuelsrv systemd[1]: rabbitmq-server.service: got MAINPID=3260
Jul 17 11:43:52 fuelsrv systemd[1]: rabbitmq-server.service: got READY=1
Jul 17 11:43:52 fuelsrv systemd[1]: rabbitmq-server.service changed start -> running
Jul 17 11:43:52 fuelsrv systemd[1]: Job rabbitmq-server.service/start finished, result=done
Jul 17 11:43:52 fuelsrv systemd[1]: Started RabbitMQ broker.
Jul 17 11:43:52 fuelsrv systemd[1]: rabbitmq-server.service: got STATUS=Initialized
Jul 17 11:43:52 fuelsrv rabbitmq-server[3260]: Unexpected status from systemd "Showing o...et
Jul 17 11:43:52 fuelsrv rabbitmq-server[3260]: systemd READY notification failed, beware...ts
Jul 17 11:43:52 fuelsrv rabbitmq-server[3260]: completed with 7 plugins.
Hint: Some lines were ellipsized, use -l to show in full.

Of course, this doesn't happen if the default hostname is used for the installation. Could somebody please try to reproduce and (possibly) confirm the described behaviour?

Changed in mos:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
Dmitry Pyzhov (dpyzhov)
Changed in mos:
milestone: none → 10.0
no longer affects: fuel
Changed in fuel:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
milestone: none → 10.0
no longer affects: mos/9.x
no longer affects: mos
Changed in fuel:
importance: Undecided → High
status: New → Confirmed
Changed in fuel:
status: Confirmed → In Progress
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Oleksiy Molchanov (omolchanov)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/344774

Dmitry Pyzhov (dpyzhov)
tags: added: 9.1-proposed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/346694

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/344774
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=3171ff6bfc40db3daee31f790f2a48a74d7cdbb0
Submitter: Jenkins
Branch: master

commit 3171ff6bfc40db3daee31f790f2a48a74d7cdbb0
Author: Oleksiy Molchanov <email address hidden>
Date: Wed Jul 20 12:37:04 2016 +0300

    Use hostname from astute for rabbitmq

    Closes-Bug: 1603782
    Change-Id: I3263beee759378db664d12531ec5bb45c0ef254b

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/346694
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=b8b901879153d7d68247085960d554a88bcd946f
Submitter: Jenkins
Branch: stable/mitaka

commit b8b901879153d7d68247085960d554a88bcd946f
Author: Oleksiy Molchanov <email address hidden>
Date: Wed Jul 20 12:37:04 2016 +0300

    Use hostname from astute for rabbitmq

    Closes-Bug: 1603782
    Change-Id: I3263beee759378db664d12531ec5bb45c0ef254b

tags: added: on-verification
Andrey Lavrentyev (alavrentyev) wrote :

Wasn't able to reproduce it on 9.1 snapshot #76

[root@nailgun ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 495
cat /etc/fuel_build_number:
 495
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-library9.0-9.0.0-1.mos8495.noarch
 rubygem-astute-9.0.0-1.mos753.noarch
 fuel-release-9.0.0-1.mos6349.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8748.noarch
 shotgun-9.0.0-1.mos90.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-openstack-metadata-9.0.0-1.mos8748.noarch
 python-packetary-9.0.0-1.mos142.noarch
 nailgun-mcagents-9.0.0-1.mos753.noarch
 fuel-utils-9.0.0-1.mos8495.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-misc-9.0.0-1.mos8495.noarch
 fuel-ostf-9.0.0-1.mos938.noarch
 fuel-notify-9.0.0-1.mos8495.noarch
 fuel-nailgun-9.0.0-1.mos8748.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-mirror-9.0.0-1.mos142.noarch
 fuel-migrate-9.0.0-1.mos8495.noarch

MOS_CENTOS_OS_MIRROR_ID: os-2016-06-23-135731
MOS_CENTOS_PROPOSED_MIRROR_ID: proposed-2016-07-29-200321
MOS_CENTOS_UPDATES_MIRROR_ID: updates-2016-06-23-135916
MOS_CENTOS_SECURITY_MIRROR_ID: security-2016-06-23-140002
MOS_CENTOS_HOLDBACK_MIRROR_ID: holdback-2016-06-23-140047
MOS_CENTOS_HOTFIX_MIRROR_ID: hotfix-2016-07-18-162958
MOS_UBUNTU_MIRROR_ID: 9.0-2016-07-29-200321
UBUNTU_MIRROR_ID: ubuntu-2016-07-30-170657
CENTOS_MIRROR_ID: centos-7.2.1511-2016-05-31-083834

tags: removed: on-verification
Mike Nguyen (moozoo) wrote :

Considering...

# fuel --version
9.0.0

The proposed fix in the bug description works to get rabbitmq up and running on startup properly.

But it seems to introduce another issue: nailgun can no longer log in to rabbitmq...

As seen in /<email address hidden>:

=ERROR REPORT==== 6-Sep-2016::21:19:13 ===
Error on AMQP connection <0.5995.0> (10.100.204.5:54614 -> 10.100.204.5:5672, state: starting):
PLAIN login refused: user 'naily' - invalid credentials
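
A plausible explanation, going by the grep output in the bug description: after the rename the broker reported an empty database directory for rabbit@fuelsrv and initialised from scratch, so accounts created under rabbit@fuel (such as naily) would not exist on the new node. The sketch below lists which node names have a database directory; mnesia_nodes is a hypothetical helper, and skipping *-plugins-expand directories is an assumption about the layout under /var/lib/rabbitmq/mnesia.

```shell
# List node names that have a database directory under a Mnesia base dir.
# Directories ending in "-plugins-expand" are skipped (assumed to hold
# unpacked plugins rather than node data).
mnesia_nodes() {
    base=$1
    for d in "$base"/rabbit@*; do
        [ -d "$d" ] || continue
        case $d in
            *-plugins-expand) continue ;;
        esac
        basename "$d"
    done
}

# Example on the Fuel master (left commented so the file can be sourced safely):
# mnesia_nodes /var/lib/rabbitmq/mnesia
# rabbitmqctl list_users    # shows which accounts the running node knows about
```

If both rabbit@fuel and rabbit@fuelsrv show up, the old accounts most likely still live in the rabbit@fuel database only.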

The only way I found to get this fixed rapidly was to change the NODENAME in /etc/rabbitmq/rabbitmq-env.conf back to rabbit@fuel.

And then add "fuel" as one of the hostnames in /etc/hosts for the IP of the host:

10.100.204.5 hostname.sv.stw hostname fuel

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-library 10.0.0rc1

This issue was fixed in the openstack/fuel-library 10.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-library 10.0.0

This issue was fixed in the openstack/fuel-library 10.0.0 release.

Felipe Alfaro Solana (felipe-alfaro-gmail) wrote :

What about MOS9? We are hitting this bug and it is quite annoying. In the end, we are not sure what to do. Shall we leave "fuel" as the hostname or add an entry to /etc/hosts?
