comment out 127.0.1.1 hostname for rabbitmq

Bug #1960587 reported by Dwane Pottratz
This bug affects 1 person
Affects: OpenStack-Ansible
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

*** edit: this happens in rabbitmq not galera ***

When doing an install of galera, it will fail if /etc/hosts contains 127.0.1.1 <hostname>

It would be nice if the galera playbook would check for and remove/comment out the line.

This is an issue with bare metal installs. Ubuntu 20.04 install from ISO adds the line to the /etc/hosts file.
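
A minimal sketch of the kind of check requested above, assuming the hosts file content is already available as a string. The function name `find_loopback_hostname_lines` is hypothetical and is not part of any OpenStack-Ansible playbook; it only reports active (uncommented) lines that map the given hostname to 127.0.1.1.

```python
from typing import List


def find_loopback_hostname_lines(hosts_text: str, hostname: str) -> List[str]:
    """Return hosts-file lines that map `hostname` to 127.0.1.1.

    Hypothetical helper for illustration only. Commented-out lines
    are ignored because everything after '#' is dropped first.
    """
    matches = []
    for line in hosts_text.splitlines():
        # Strip trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "127.0.1.1" and hostname in fields[1:]:
            matches.append(line)
    return matches
```

On a real host the input would typically be read from /etc/hosts, for example with `open("/etc/hosts").read()`.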

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Hi, Dwane

Could you please attach the error you are seeing? I have never experienced any issues with this, and all my current deployments have that record set.
So I'm not sure I understand which error you hit.

Additionally, the Python socket module, along with some other software, relies on that record, so I'm not sure about possible side effects of removing it for the rest of the software deployed on metal.
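
To see what a name resolves to from Python's point of view, something like the following could be used. This is only a sketch: `resolves_to_loopback` is a hypothetical helper name, and it relies solely on the standard-library `socket.getaddrinfo` and `ipaddress` modules.

```python
import ipaddress
import socket


def resolves_to_loopback(name: str) -> bool:
    """Return True if any address `name` resolves to is a loopback address.

    Hypothetical helper for illustration. Resolution follows the
    system resolver, so /etc/hosts entries are normally taken into account.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # The name does not resolve at all.
        return False
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 scope id.
        address = info[4][0].split("%", 1)[0]
        if ipaddress.ip_address(address).is_loopback:
            return True
    return False
```

Calling `resolves_to_loopback(socket.gethostname())` on a host with a `127.0.1.1 <hostname>` record would be expected to return True, although the exact result depends on the resolver configuration.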

Changed in openstack-ansible:
status: New → Incomplete
Revision history for this message
Dwane Pottratz (dpcsar) wrote :

Hi Dmitriy,

The bug is not in galera. It is in rabbitmq. Not sure if you want me to open a new bug or modify this one.

I am working with stable/xena branch.

openstack_user_config.yml contains 'no_containers: true'
the bug doesn't reproduce in containers.

infra1 /etc/hosts file:
127.0.0.1 localhost
127.0.1.1 infra1

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
### BEGIN OPENSTACK-ANSIBLE MANAGED BLOCK ###
192.168.2.21 infra1.home infra1
192.168.2.22 infra2.home infra2
192.168.2.23 infra3.home infra3
192.168.2.20 haproxy0.home haproxy0
### END OPENSTACK-ANSIBLE MANAGED BLOCK ###
192.168.2.21 infra1 infra1
192.168.2.22 infra2 infra2
192.168.2.23 infra3 infra3

Here is the output from running the playbook rabbitmq-install.yml
TASK [rabbitmq_server : Ensure RabbitMQ node [0] is started] *******************
fatal: [infra1]: FAILED! => {"changed": false, "msg": "Unable to start service rabbitmq-server: Job for rabbitmq-server.service failed because the control process exited with error code.\nSee \"systemctl status rabbitmq-server.service\" and \"journalctl -xe\" for details.\n"}

Here is the log file from the server:
-- The job identifier is 17034 and the job result is done.
Feb 11 07:42:43 infra1 systemd[1]: Starting RabbitMQ broker...
-- Subject: A start job for unit rabbitmq-server.service has begun execution
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A start job for unit rabbitmq-server.service has begun execution.
--
-- The job identifier is 17034.
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:43.688676-08:00 [erro] <0.130.0>
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:43.688676-08:00 [erro] <0.130.0> BOOT FAILED
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:43.688676-08:00 [erro] <0.130.0> ===========
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:43.688676-08:00 [erro] <0.130.0> ERROR: epmd error for host infra1: address (cannot connect to host/port)
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:43.688676-08:00 [erro] <0.130.0>
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: BOOT FAILED
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: ===========
Feb 11 07:42:43 infra1 rabbitmq-server[63163]: ERROR: epmd error for host infra1: address (cannot connect to host/port)
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:00 [erro] <0.130.0> supervisor: {local,rabbit_prelaunch_sup}
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:00 [erro] <0.130.0> errorContext: start_error
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:00 [erro] <0.130.0> reason: {epmd_error,"infra1",address}
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:00 [erro] <0.130.0> offender: [{pid,undefined},
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:00 [erro] <0.130.0> {id,prelaunch},
Feb 11 07:42:44 infra1 rabbitmq-server[63163]: 2022-02-11 07:42:44.693325-08:...


Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

This bug can be edited.

Oh, rabbitmq, yes, this one can break if the hosts file is wrongly configured.

Having hosts entries outside of the managed block is indeed not expected, and I can confirm seeing the same picture in my sandbox.
In my LXC sandbox I see these extra lines only for the rabbitmq container though (and no record for 127.0.1.1). So I suppose it's somehow related, or some fix/workaround present in the role is misbehaving.

Anyway, I can confirm that it's a valid issue.

Changed in openstack-ansible:
status: Incomplete → Triaged
importance: Undecided → Medium
Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :
Dwane Pottratz (dpcsar)
summary: - comment out 127.0.1.1 hostanme for galera
+ comment out 127.0.1.1 hostanme for rabbitmq
description: updated
Revision history for this message
Dwane Pottratz (dpcsar) wrote :

Yeah, that does look like the culprit for the hosts issue.

I don't think that you will see 127.0.1.1 lines in the containers. I didn't. You will not see them in the upstream Ubuntu cloud images either. They seem to only be put in when you do an install from the Ubuntu ISO.

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Can I kindly ask you to test the current "master" of the rabbitmq_server role? We have landed a bunch of fixes there during the last few weeks, and it seems that they remove the requirement to clean up the hosts file.
It would be helpful to know whether it also fixes the issue for you, so we could backport the solution to prior branches.

After you have pulled the latest version from git and switched to the master branch, you would need to run `openstack-ansible playbooks/rabbitmq-server.yml -e rabbitmq_upgrade=true`

For the other issue, regarding duplicated records, I will push a fix soon.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-rabbitmq_server (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-rabbitmq_server (stable/xena)
Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Or alternatively, could you verify that https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/828932 works for you on Xena?

Thanks in advance.

Revision history for this message
Jonathan Rosser (jrosser) wrote :
Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

This comes down to the binding of the Erlang distribution port 25672: we really used a hack rather than a documented option to bind it to the management IP. And what I see from my testing is that all tools expect this port to be accessible through the hostname.

Yes, the other way would be to disallow such hosts entries, and yes, we don't have them in containers, but to be honest I don't see why we need 25672 to be bound to the management IP only, unless there's another rabbit cluster or some other Erlang app on the same host that should be bound to a different IP.

I did have a patch to drop such a record in a decent way, but that easily results in an FQDN change, which can bring other problems.
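
The reachability question described above can be probed with a plain TCP connection attempt. The sketch below is an assumption-laden illustration, not something the role does: `port_reachable` is a hypothetical helper that only reports whether a TCP connection to a host and port succeeds.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration. `host` may be a hostname,
    in which case it is resolved first (including via /etc/hosts).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

Comparing `port_reachable("<hostname>", 25672)` with `port_reachable("<management IP>", 25672)` on an affected node would be one way to see the mismatch: a listener bound only to the management IP is not expected to answer on an address such as 127.0.1.1.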

Revision history for this message
Dwane Pottratz (dpcsar) wrote :

Hi Dmitriy,

The master branch will install rabbitmq with the 127.0.1.1 record in the /etc/hosts file. (SHA 1a706fe95058c5b0a28c1604bb2ac6d58c82acaf)

The extra lines in the hosts file are still there:
cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 infradev1

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
### BEGIN OPENSTACK-ANSIBLE MANAGED BLOCK ###
192.168.12.21 infradev1.home infradev1
192.168.12.22 infradev2.home infradev2
192.168.12.23 infradev3.home infradev3
192.168.12.20 haproxydev0.home haproxy0 haproxydev0
### END OPENSTACK-ANSIBLE MANAGED BLOCK ###
192.168.12.21 infradev1 infradev1
192.168.12.22 infradev2 infradev2
192.168.12.23 infradev3 infradev3

I am not sure if it is rabbit that is putting the extra lines in there. I don't know where they are coming from.

I ran the following:
git clone -b master https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible
rsync -rltv --delete /opt/openstack-ansible/etc/openstack_deploy/ /etc/openstack_deploy/
cp openstack_user_config.yml /etc/openstack_deploy/
cp user_variables.yml /etc/openstack_deploy/
cp user_secrets.yml /etc/openstack_deploy/
/opt/openstack-ansible/scripts/bootstrap-ansible.sh
cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible repo-install.yml
openstack-ansible haproxy-install.yml
openstack-ansible memcached-install.yml
openstack-ansible galera-install.yml
openstack-ansible rabbitmq-install.yml

I believe that order is following the setup-infrastructure.yml for the components I have in my inventory.

Hope that helps.

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Extra lines should be covered with https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/828929

Regarding the `127.0.1.1` record: in my opinion it should not hurt if 25672 is bound to 0.0.0.0, which should be the case on master.
So the question is rather: do you still hit the error mentioned before, or does it just work now with 127.0.1.1 in /etc/hosts?

Revision history for this message
Dwane Pottratz (dpcsar) wrote :

It now works with 127.0.1.1 records in the /etc/hosts file.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on openstack-ansible-rabbitmq_server (stable/xena)

Change abandoned by "Dmitriy Rabotyagov <email address hidden>" on branch: stable/xena
Review: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/828932

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-rabbitmq_server (master)
Changed in openstack-ansible:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/openstack-ansible/+/830175

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-rabbitmq_server (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/828929
Committed: https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/c8ce051651255c7168381e6861be5371d27bbedb
Submitter: "Zuul (22348)"
Branch: master

commit c8ce051651255c7168381e6861be5371d27bbedb
Author: Dmitriy Rabotyagov <email address hidden>
Date: Sat Feb 12 00:22:33 2022 +0200

    Verify if hosts file already managed with OSA

    With this patch we ensure that duplicated records are not
    created with rabbitmq role if hosts file already contain
    OSA managed block. Managing hosts still might be required for
    role usage outside of the OSA so we workaround this usecase.

    Change-Id: Ia20902f0ffe21ce563966fee4d233e5ec3afe3d9
    Related-Bug: #1960587
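
The idea in this commit message can be illustrated with a short sketch. It is not the code from the role (which is an Ansible role, not Python); the helper names `has_managed_block` and `duplicates_outside_block` are hypothetical, and only the two marker strings are taken from the hosts files quoted earlier in this report.

```python
from typing import List

# Marker lines as they appear in the hosts files quoted in this report.
BEGIN_MARKER = "### BEGIN OPENSTACK-ANSIBLE MANAGED BLOCK ###"
END_MARKER = "### END OPENSTACK-ANSIBLE MANAGED BLOCK ###"


def has_managed_block(hosts_text: str) -> bool:
    """Return True if both markers are present, BEGIN before END."""
    lines = [line.strip() for line in hosts_text.splitlines()]
    if BEGIN_MARKER not in lines or END_MARKER not in lines:
        return False
    return lines.index(BEGIN_MARKER) < lines.index(END_MARKER)


def duplicates_outside_block(hosts_text: str) -> List[str]:
    """Return lines outside the block whose IP also appears inside it."""
    inside = False
    managed_ips = set()
    outside = []  # (ip, original line) pairs found outside the block
    for line in hosts_text.splitlines():
        stripped = line.strip()
        if stripped == BEGIN_MARKER:
            inside = True
            continue
        if stripped == END_MARKER:
            inside = False
            continue
        fields = stripped.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or malformed entry
        if inside:
            managed_ips.add(fields[0])
        else:
            outside.append((fields[0], line))
    return [line for ip, line in outside if ip in managed_ips]
```

Applied to the infra1 hosts file shown earlier, the second function would report the three `192.168.2.2x` lines that follow the END marker.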

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-rabbitmq_server (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/830172
Committed: https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/e6b3ddbc1e8f1c558f4dcb615f5d5d3a96348584
Submitter: "Zuul (22348)"
Branch: master

commit e6b3ddbc1e8f1c558f4dcb615f5d5d3a96348584
Author: Dmitriy Rabotyagov <email address hidden>
Date: Mon Feb 21 14:58:33 2022 +0100

    Remove affecting rabbitmq hosts record

    Record for 127.0.1.1 is added by some distributions which makes
    clustering fail, as Erlang port is binded to mgmt IP, while rabbit
    expects to access it through $hostname, which would lead to 127.0.1.1.
    At same time it's possbile to explicitly bind distribution port only
    to single address. So we need to
    ensure that hostname resolvs to mgmt IP and drop record for 127.0.1.1

    Closes-Bug: #1960587
    Change-Id: I907d4714319ac7134ede0dc62b51c1964b9befc5
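
As a rough illustration of dropping the record described in this commit message, the sketch below comments out the 127.0.1.1 line for a given hostname in hosts-file text. It is an assumption about one possible approach, not the role's actual implementation; `comment_out_loopback_record` is a hypothetical name, and the sketch does not add the management IP entry that the commit also mentions.

```python
def comment_out_loopback_record(hosts_text: str, hostname: str) -> str:
    """Return `hosts_text` with active `127.0.1.1 <hostname>` lines commented out.

    Hypothetical helper for illustration. Lines that are already
    commented are left alone, so applying it twice changes nothing more.
    """
    result = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "127.0.1.1" and hostname in fields[1:]:
            result.append("# " + line)
        else:
            result.append(line)
    # Preserve a trailing newline if the input had one.
    trailing = "\n" if hosts_text.endswith("\n") else ""
    return "\n".join(result) + trailing
```

Since the hostname must still resolve to the management IP afterwards, a record such as the ones in the managed block would need to be present as well.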

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-rabbitmq_server (stable/xena)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-rabbitmq_server (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/836167

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-rabbitmq_server (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/836168

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-rabbitmq_server (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/836167
Committed: https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/282f56b573de0c7eead53636545bdbe167c264ee
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 282f56b573de0c7eead53636545bdbe167c264ee
Author: Dmitriy Rabotyagov <email address hidden>
Date: Sat Feb 12 00:22:33 2022 +0200

    Verify if hosts file already managed with OSA

    With this patch we ensure that duplicated records are not
    created with rabbitmq role if hosts file already contain
    OSA managed block. Managing hosts still might be required for
    role usage outside of the OSA so we workaround this usecase.

    Change-Id: Ia20902f0ffe21ce563966fee4d233e5ec3afe3d9
    Related-Bug: #1960587
    (cherry picked from commit c8ce051651255c7168381e6861be5371d27bbedb)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/830175
Committed: https://opendev.org/openstack/openstack-ansible/commit/424887242daa106ae5f814ef71ee6cadc5b65677
Submitter: "Zuul (22348)"
Branch: master

commit 424887242daa106ae5f814ef71ee6cadc5b65677
Author: Dmitriy Rabotyagov <email address hidden>
Date: Mon Feb 21 15:09:46 2022 +0100

    [doc] Add note regarding 127.0.1.1 removal

    Related-Bug: #1960587
    Change-Id: I61a27049313044ed211483a7273308afcb2d861f

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-rabbitmq_server (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/836166
Committed: https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/f3de91c2549ec9306a4cdb1231fffb036a322295
Submitter: "Zuul (22348)"
Branch: stable/xena

commit f3de91c2549ec9306a4cdb1231fffb036a322295
Author: Dmitriy Rabotyagov <email address hidden>
Date: Sat Feb 12 00:22:33 2022 +0200

    Verify if hosts file already managed with OSA

    With this patch we ensure that duplicated records are not
    created with rabbitmq role if hosts file already contain
    OSA managed block. Managing hosts still might be required for
    role usage outside of the OSA so we workaround this usecase.

    Change-Id: Ia20902f0ffe21ce563966fee4d233e5ec3afe3d9
    Related-Bug: #1960587
    (cherry picked from commit c8ce051651255c7168381e6861be5371d27bbedb)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-rabbitmq_server (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/836168
Committed: https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/342174ede5616994f2b2dc4f87492d058d7e6b4e
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 342174ede5616994f2b2dc4f87492d058d7e6b4e
Author: Dmitriy Rabotyagov <email address hidden>
Date: Sat Feb 12 00:22:33 2022 +0200

    Verify if hosts file already managed with OSA

    With this patch we ensure that duplicated records are not
    created with rabbitmq role if hosts file already contain
    OSA managed block. Managing hosts still might be required for
    role usage outside of the OSA so we workaround this usecase.

    Depends-On: https://review.opendev.org/c/openstack/openstack-ansible-tests/+/836335
    Change-Id: Ia20902f0ffe21ce563966fee4d233e5ec3afe3d9
    Related-Bug: #1960587
    (cherry picked from commit c8ce051651255c7168381e6861be5371d27bbedb)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-rabbitmq_server yoga-eom

This issue was fixed in the openstack/openstack-ansible-rabbitmq_server yoga-eom release.
