Kolla Ansible installation fails when the rabbitmq instance is not co-resident with the nova controller.

Bug #2020805 reported by Jay Rhine
This bug affects 4 people
Affects: kolla-ansible
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

** Summary **
The change introduced by https://review.opendev.org/c/openstack/kolla-ansible/+/817190 to fix https://bugs.launchpad.net/kolla-ansible/+bug/1946506 breaks Kolla Ansible installations where the rabbitmq instance is not co-resident with the nova controller.

Specifically, the "{{ project_name }} | Ensure RabbitMQ vhosts exist" task in "kolla-ansible/ansible/roles/service-rabbitmq/tasks/main.yml" is delegated to one of the rabbitmq hosts, but its node parameter is set via ansible_facts.hostname, which resolves to the host name of the first node in the "nova_cell_conductor_group" group. In the default multi-node configuration provided by kolla-ansible, the rabbitmq and nova control hosts are the same servers, so the delegation works because the host name is the same either way. However, in a more distributed configuration where the rabbitmq and nova hosts differ, the host name passed is that of the "original" server, NOT the delegated server.

This results in the same error, 'Could not determine the version of the RabbitMQ server', that was seen in the original bug (https://bugs.launchpad.net/kolla-ansible/+bug/1946506) with debug logging enabled. My understanding of the original bug is that the ansible module responsible for adding the rabbitmq user would sometimes default to the wrong node name, and that specifying the hostname via ansible_facts.hostname eliminated the problem. We first worked around the new problem by removing the one line introduced by 817190, which was successful in our environment. However, that would bring back the original problem, so we have also tried a more robust alternative: pointing the node at the hostname of the delegated server. See below for the minor change.

--- /opt/kolla/venv/share/kolla-ansible/ansible/roles/service-rabbitmq/tasks/main.yml.orig 2023-05-18 19:04:13.120194010 +0000
+++ /opt/kolla/venv/share/kolla-ansible/ansible/roles/service-rabbitmq/tasks/main.yml 2023-05-25 16:54:40.514191582 +0000
@@ -20,7 +20,7 @@
         module_args:
           user: "{{ item.user }}"
           password: "{{ item.password }}"
-          node: "rabbit@{{ ansible_facts.hostname }}"
+          node: "rabbit@{{ hostvars[service_rabbitmq_delegate_host]['ansible_facts']['hostname'] }}"
           update_password: always
           vhost: "{{ item.vhost }}"
           configure_priv: ".*"
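
To see the mismatch without running a full deployment, a small playbook along the following lines may help. This is only a sketch and not part of kolla-ansible: the group names "nova-conductor" and "rabbitmq" and the variable "delegate_host" are assumptions for illustration and should be replaced with whatever the inventory actually uses. It prints the node name the current task would render next to the one the proposed change would render.

```yaml
# Hypothetical reproduction playbook (not shipped with kolla-ansible).
# Assumed inventory groups: "nova-conductor" and "rabbitmq".

- name: Gather facts from every host so hostvars is populated
  hosts: all
  gather_facts: true

- name: Compare the node names rendered under delegation
  hosts: nova-conductor
  gather_facts: false
  vars:
    # Assumed stand-in for service_rabbitmq_delegate_host
    delegate_host: "{{ groups['rabbitmq'][0] }}"
  tasks:
    - name: Print both candidate node names
      ansible.builtin.debug:
        msg:
          # ansible_facts still belongs to the original (nova) host here
          - "original:  rabbit@{{ ansible_facts.hostname }}"
          # Facts looked up explicitly for the host the task is delegated to
          - "delegated: rabbit@{{ hostvars[delegate_host]['ansible_facts']['hostname'] }}"
      delegate_to: "{{ delegate_host }}"
      run_once: true
```

When the first host in both groups is the same server, the two lines should print the same name. In a distributed layout they should differ, and only the second names a host that actually runs RabbitMQ.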

Jay Rhine (fredomlover23) wrote :

Verified that this issue still exists in the latest master branch (antelope) as of 7/8/2023

Jay Jahns (jayjahns) wrote :

Verified this is still on bobcat as of 6/10/2024.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)
Changed in kolla-ansible:
status: New → In Progress

Jay Jahns (jayjahns) wrote :

I have added a patch for this, as this is a deployment-breaking issue. It prevents rabbitmq from being run on nodes other than controllers.

We need to correct this asap.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/921707
Committed: https://opendev.org/openstack/kolla-ansible/commit/52729e6855f12dfee3d1d6f540858e4ca0af7402
Submitter: "Zuul (22348)"
Branch: master

commit 52729e6855f12dfee3d1d6f540858e4ca0af7402
Author: jayjahns <email address hidden>
Date: Mon Jun 10 15:30:01 2024 -0500

    Set node to a valid rabbitmq host

    If rabbitmq is not on the same host as the nova-controller,
    then this task will fail. This change ensures that the
    task references an actual rabbitmq host vs the host the
    task runs on.

    Closes-Bug: 2020805
    Change-Id: I1b58f4aeda8c9fe8db1770c63c17bf1c465f3d2a

Changed in kolla-ansible:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/921737

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/921738

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/921739
