RABBITMQ hostname is overwritten during minor update.

Bug #1954946 reported by Harry Kominos
Affects   Status   Importance   Assigned to   Milestone
tripleo   New      Undecided    Unassigned

Bug Description

Description
===========
The following behaviour is observed during a minor update on Wallaby. When the original stack is deployed, RABBITMQ_NODENAME in /home/stack/config-download/overcloud/group_vars/Controller is set to rabbit@%{::hostname}.

During the upgrade prepare step, the value of RABBITMQ_NODENAME changes to rabbit@%hiera{internal_fqdn_api}, which is not the same value, so the rabbit node fails to join the cluster. There is no other difference between the templates, apart from --skip-nodes-and-networks, which is removed (?).
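
To make the mismatch concrete, here is a minimal sketch (not part of TripleO) that pulls the RABBITMQ_NODENAME value out of the text of a group_vars file and compares it before and after the prepare step. The helper names `extract_nodenames` and `nodename_changed` are hypothetical, and the sketch assumes the setting appears on a `RABBITMQ_NODENAME: value` style line; the real file layout may differ.

```python
import re

# Assumption: the setting appears on a line such as
#   RABBITMQ_NODENAME: rabbit@%{::hostname}
# with the key and/or value optionally wrapped in quotes.
_NODENAME_RE = re.compile(r"""RABBITMQ_NODENAME["']?\s*[:=]\s*["']?([^"'\s,]+)""")


def extract_nodenames(text):
    """Return every RABBITMQ_NODENAME value found in the given text."""
    return _NODENAME_RE.findall(text)


def nodename_changed(before_text, after_text):
    """Return True if the set of values differs between two file contents."""
    return set(extract_nodenames(before_text)) != set(extract_nodenames(after_text))


# The two values quoted in this report, used as sample input.
before = "RABBITMQ_NODENAME: rabbit@%{::hostname}\n"
after = "RABBITMQ_NODENAME: rabbit@%hiera{internal_fqdn_api}\n"

print(extract_nodenames(before))
print(extract_nodenames(after))
print(nodename_changed(before, after))
```

Feeding it the contents of the Controller group_vars file saved before and after the prepare step should flag the change without having to diff the whole file by eye.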

Steps to reproduce
==================

- Deploy a Wallaby cloud on baremetal nodes.
- Update the undercloud (sudo -E tripleo-repos -b wallaby current-tripleo ?).
- Run openstack overcloud upgrade prepare.
  At this point the config dir is created, so the issue is already present and can be seen.
- (optional) Run update_steps_playbook.yaml with --limit overcloud-controller-1.
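
Because the generated config directory already shows the problem after the prepare step, one way to spot it is to scan that directory for every occurrence of the setting. The following is a hedged sketch: `find_nodename_lines` is a hypothetical helper, and the path in the example is the config-download directory mentioned in this report. It only walks a directory tree and reports lines that mention RABBITMQ_NODENAME.

```python
import os


def find_nodename_lines(root, needle="RABBITMQ_NODENAME"):
    """Yield (path, line_number, line) for each line under root containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.strip()
            except OSError:
                # Unreadable files (permissions, broken symlinks) are skipped.
                continue


if __name__ == "__main__":
    # Path taken from this report; adjust it for other environments.
    for path, number, line in find_nodename_lines("/home/stack/config-download/overcloud"):
        print(f"{path}:{number}: {line}")
```

Running it once after the original deploy and once after the prepare step, then comparing the output, should show which generated files picked up the new value.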

Expected result
===============
Rabbit node joins the cluster and the update finishes.

Actual result
=============
The update finishes, but rabbit is down on that single node.

Environment
===========
Wallaby, OVS/ML2, External ceph

Logs & Configs
==============

openstack overcloud deploy --templates --skip-nodes-and-networks \
  -n /home/stack/templates/network_data.yaml \
  -r /home/stack/templates/roles_data.yaml \
  -e /home/stack/templates/environments/deployed-server-environment.yaml \
  -e /home/stack/containers-prepare-parameter.yaml \
  -e /home/stack/templates/overcloud-baremetal-deployed.yaml \
  -e /home/stack/templates/networks-deployed-environment.yaml \
  -e /home/stack/templates/vip-deployed-environment.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
  -e /home/stack/templates/environments/neutron-ovs-dvr.yaml \
  -e /home/stack/templates/environments/network-environment-OVS.yaml \
  -e /home/stack/templates/environments/services/octavia.yaml \
  -e /home/stack/templates/environments/cloudname.yaml \
  -e /home/stack/templates/environments/predictable-placement/custom-domain.yaml \
  -e /home/stack/templates/environments/external-ceph.yaml \
  -e /home/stack/templates/novafixes.yaml \
  -e /home/stack/templates/keystonefixes.yaml \
  -e /home/stack/templates/horizonfixes.yaml \
  -e /home/stack/templates/environments/ssl/enable-tls-newcerts.yaml \
  -e /home/stack/templates/environments/ssl/tls-endpoints-public-dns.yaml \
  -e /home/stack/templates/environments/ssl/inject-trust-anchor.yaml \
  --timeout 1500 --disable-validations --stack overcloud --rm-heat --deployed-server

openstack-tripleo-heat-templates-14.3.1-0.20211202124412.13b1bcb.el8.noarch
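
To double-check that the deploy and prepare invocations differ only in --skip-nodes-and-networks, the two command lines can be compared token by token. This is an illustrative sketch only: `command_tokens` and `option_diff` are hypothetical helpers, and the shortened commands in the example are placeholders, not the full commands used in this report.

```python
import shlex


def command_tokens(command):
    """Split a shell command into tokens, joining backslash-newline continuations."""
    return shlex.split(command.replace("\\\n", " "))


def option_diff(first, second):
    """Return (only_in_first, only_in_second) as sorted lists of tokens."""
    a = set(command_tokens(first))
    b = set(command_tokens(second))
    return sorted(a - b), sorted(b - a)


# Placeholder commands, shortened for illustration.
deploy = "openstack overcloud deploy --templates --skip-nodes-and-networks -e a.yaml -e b.yaml"
prepare = "openstack overcloud upgrade prepare --templates -e a.yaml -e b.yaml"

only_deploy, only_prepare = option_diff(deploy, prepare)
print("only in deploy: ", only_deploy)
print("only in prepare:", only_prepare)
```

Because the comparison works on sets, it ignores the order of the -e files. Ordering can matter for environment file precedence, so an empty diff here does not prove the two invocations are equivalent.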

Brendan Shephard (bshephar) wrote :

I guess this change caused the problem here:
https://github.com/openstack/tripleo-heat-templates/commit/ff3589786926992e0b822779f3c96b7d4e6c5cae#diff-fa55a8a6602d64088c427bdbf5748d4616a391221c80c4e7e3d3dabc3203d936

https://review.opendev.org/c/openstack/tripleo-heat-templates/+/813573

It should be a problem moving forward. But it appears it was introduced to fix an issue with SSL deployments. We might be able to add something to handle this better during the update process.
