"Found multiple compute-2, which is unexpected" if any host shortname starts with another host shortname

Bug #1856193 reported by James Slagle
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Kevin Carter
Milestone: ussuri-1

Bug Description

When config download runs to generate the ansible project directory, it will fail if any host shortname starts with another host's shortname.

In particular, this happens when deploying more than 10 nodes of the same role with the default hostname formats, since a hostname such as openstack-compute-10 starts with openstack-compute-1.
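
To see which hostnames in a deployment would trip this check, here is a minimal sketch that lists every pair where one shortname is a prefix of another. The helper name find_prefix_collisions and the 12-node host list are assumptions made for illustration; they are not part of tripleo-common.

```python
def find_prefix_collisions(shortnames):
    """Return (shorter, longer) pairs where `longer` starts with `shorter`."""
    names = sorted(set(shortnames))
    return [
        (a, b)
        for a in names
        for b in names
        if a != b and b.startswith(a)
    ]


# Hypothetical role with 12 nodes, named in the default style (index 0 to 11).
hosts = ["openstack-compute-%d" % i for i in range(12)]
collisions = find_prefix_collisions(hosts)

for shorter, longer in collisions:
    print("%s is a prefix of %s" % (shorter, longer))
```

With indices 0 through 11, only openstack-compute-1 collides (with openstack-compute-10 and openstack-compute-11); a role with 20 or more nodes would also affect index 2, as in the traceback below.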

Sample traceback for a host called compute-2, in a deployment that also has hosts compute-20, compute-21, etc.:

Traceback (most recent call last):
  File "/usr/bin/tripleo-config-download", line 57, in <module>
    stack_config.download_config(args.stack_name, args.output_dir)
  File "/home/centos/tripleo-common/tripleo_common/utils/config.py", line 577, in download_config
    self.write_config(stack, name, config_dir, config_type)
  File "/home/centos/tripleo-common/tripleo_common/utils/config.py", line 515, in write_config
    if role_host_vars and server_role:
SystemError: Found multiple `compute-2`, which is unexpected. This means that the FQDN of the selected device is either wrong or is sharing a name with another host, which is also wrong. Please correct this issue before continuing. Return data can be found here -> [{u'fqdn_internal_api': u'compute-2.internalapi.localdomain', u'fqdn_ctlplane': u'compute-2.ctlplane.localdomain', u'fqdn_storage': u'compute-2.storage.localdomain', u'fqdn_canonical': u'compute-2.localdomain', u'fqdn_tenant': u'compute-2.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-20.internalapi.localdomain', u'fqdn_ctlplane': u'compute-20.ctlplane.localdomain', u'fqdn_storage': u'compute-20.storage.localdomain', u'fqdn_canonical': u'compute-20.localdomain', u'fqdn_tenant': u'compute-20.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-21.internalapi.localdomain', u'fqdn_ctlplane': u'compute-21.ctlplane.localdomain', u'fqdn_storage': u'compute-21.storage.localdomain', u'fqdn_canonical': u'compute-21.localdomain', u'fqdn_tenant': u'compute-21.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-22.internalapi.localdomain', u'fqdn_ctlplane': u'compute-22.ctlplane.localdomain', u'fqdn_storage': u'compute-22.storage.localdomain', u'fqdn_canonical': u'compute-22.localdomain', u'fqdn_tenant': u'compute-22.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-23.internalapi.localdomain', u'fqdn_ctlplane': u'compute-23.ctlplane.localdomain', u'fqdn_storage': u'compute-23.storage.localdomain', u'fqdn_canonical': u'compute-23.localdomain', u'fqdn_tenant': u'compute-23.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-24.internalapi.localdomain', u'fqdn_ctlplane': u'compute-24.ctlplane.localdomain', u'fqdn_storage': u'compute-24.storage.localdomain', u'fqdn_canonical': u'compute-24.localdomain', u'fqdn_tenant': u'compute-24.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-25.internalapi.localdomain', u'fqdn_ctlplane': u'compute-25.ctlplane.localdomain', u'fqdn_storage': 
u'compute-25.storage.localdomain', u'fqdn_canonical': u'compute-25.localdomain', u'fqdn_tenant': u'compute-25.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-26.internalapi.localdomain', u'fqdn_ctlplane': u'compute-26.ctlplane.localdomain', u'fqdn_storage': u'compute-26.storage.localdomain', u'fqdn_canonical': u'compute-26.localdomain', u'fqdn_tenant': u'compute-26.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-27.internalapi.localdomain', u'fqdn_ctlplane': u'compute-27.ctlplane.localdomain', u'fqdn_storage': u'compute-27.storage.localdomain', u'fqdn_canonical': u'compute-27.localdomain', u'fqdn_tenant': u'compute-27.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-28.internalapi.localdomain', u'fqdn_ctlplane': u'compute-28.ctlplane.localdomain', u'fqdn_storage': u'compute-28.storage.localdomain', u'fqdn_canonical': u'compute-28.localdomain', u'fqdn_tenant': u'compute-28.tenant.localdomain'}, {u'fqdn_internal_api': u'compute-29.internalapi.localdomain', u'fqdn_ctlplane': u'compute-29.ctlplane.localdomain', u'fqdn_storage': u'compute-29.storage.localdomain', u'fqdn_canonical': u'compute-29.localdomain', u'fqdn_tenant': u'compute-29.tenant.localdomain'}].

The issue is caused by this patch:
https://review.opendev.org/#/c/695998/

These lines:
            if role_host_vars and server_role:
                servers_ansible_host_vars = [
                    v for k, v in server_role_vars.items()
                    if k.startswith(server)
                ]
                if len(servers_ansible_host_vars) > 1:
                    raise SystemError(
                        "Found multiple `{}`, which is unexpected. This means"
                        " that the FQDN of the selected device is either"
                        " wrong or is sharing a name with another host, which"
                        " is also wrong. Please correct this issue before"
                        " continuing. Return data can be found here"
                        " -> {}.".format(
                            server,
                            servers_ansible_host_vars
                        )
                    )
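
The following sketch reproduces the over-matching in isolation and contrasts it with an exact shortname comparison. It is an illustration only: the functions match_prefix and match_shortname are hypothetical, the shape of server_role_vars (keys that are either shortnames or FQDNs) is an assumption based on the traceback, and the fix that actually merged for this bug reverted the patch rather than changing the comparison.

```python
def match_prefix(server, server_role_vars):
    """Same filter as the quoted lines: any key starting with `server`."""
    return [v for k, v in server_role_vars.items() if k.startswith(server)]


def match_shortname(server, server_role_vars):
    """Hypothetical alternative: compare only the first DNS label of each key."""
    return [
        v for k, v in server_role_vars.items()
        if k.split(".", 1)[0] == server
    ]


# Assumed data shape, trimmed to one field per host.
server_role_vars = {
    "compute-2": {"fqdn_canonical": "compute-2.localdomain"},
    "compute-20": {"fqdn_canonical": "compute-20.localdomain"},
    "compute-21": {"fqdn_canonical": "compute-21.localdomain"},
}

prefix_hits = match_prefix("compute-2", server_role_vars)
exact_hits = match_shortname("compute-2", server_role_vars)

print(len(prefix_hits))  # more than one entry, so the SystemError is raised
print(exact_hits)        # only the entry for compute-2
```

Splitting on the first dot keeps the comparison working whether the keys are shortnames or fully qualified names, which is why it is used here instead of a plain equality test.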

Changed in tripleo:
status: New → In Progress
importance: Undecided → High
milestone: none → ussuri-2
assignee: nobody → Kevin Carter (kevin-carter)
Changed in tripleo:
assignee: Kevin Carter (kevin-carter) → James Slagle (james-slagle)
James Slagle (james-slagle) wrote :
Changed in tripleo:
milestone: ussuri-2 → ussuri-1
assignee: James Slagle (james-slagle) → Kevin Carter (kevin-carter)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/698726
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=458d1add652fd45ef1b69c61b9eed89c0be9413e
Submitter: Zuul
Branch: master

commit 458d1add652fd45ef1b69c61b9eed89c0be9413e
Author: Kevin Carter (cloudnull) <email address hidden>
Date: Thu Dec 12 14:18:26 2019 +0000

    Revert "Update hostvars lookup to fix regression"

    This reverts commit d8679da62e38d02cd0dca7248a2506793e954e18.

    Closes-Bug: #1856193
    Change-Id: Iea1d99ff6f613f65b1ade842a694fc44f924a7c6

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/train)

Reviewed: https://review.opendev.org/698727
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=106ca91ed0f13c3e1e8fedcbb9e690a3dff47afe
Submitter: Zuul
Branch: stable/train

commit 106ca91ed0f13c3e1e8fedcbb9e690a3dff47afe
Author: Kevin Carter (cloudnull) <email address hidden>
Date: Thu Dec 12 14:21:44 2019 +0000

    Revert "Update hostvars lookup to fix regression"

    This reverts commit 6116b23d580d607187efc740dbbe49ce1b48a9b8.

    Closes-Bug: #1856193
    Change-Id: I851b5f5273ca7fdc54a3a256308b3ce73c57a38c

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 11.3.2

This issue was fixed in the openstack/tripleo-common 11.3.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 12.1.0

This issue was fixed in the openstack/tripleo-common 12.1.0 release.
