Adding one compute host fails with undefined variable

Bug #2009834 reported by Stuart Grace
This bug affects 2 people
Affects: OpenStack-Ansible
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

Following these instructions:
https://docs.openstack.org/openstack-ansible/latest/admin/scale-environment.html#add-a-compute-host

If OSA playbooks have not been used recently, the playbook os-nova-install.yml with --limit localhost,NEW_HOST_NAME fails at this task:

https://github.com/openstack/openstack-ansible/blob/17a37653e69282112eccc8416112f1253d7cf3d2/playbooks/os-nova-install.yml#L52

with this error:

"The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_local'
The error appears to be in '/opt/openstack-ansible/playbooks/os-nova-install.yml': line 52, column 7, but may be elsewhere in the file depending on the exact syntax problem."

I believe this is because the cached facts for the other Nova hosts not included in the --limit have expired and are reported as undefined.

Instead of checking groups['nova_all'], perhaps the check should be limited to the intersection of groups['nova_all'] with ansible_play_hosts?
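
The suspected cause and the proposed intersection can be modelled in plain Python. This is only a sketch of the logic, not the playbook's actual task: the `hostvars` dictionary, the host names, the version string and the nested key layout under `ansible_local` are all hypothetical stand-ins chosen for illustration.

```python
# Hypothetical stand-in for Ansible's hostvars. Hosts outside --limit have
# expired cached facts, so they carry no 'ansible_local' key at all.
hostvars = {
    "compute1": {},  # cached facts expired
    "compute2": {},  # cached facts expired
    "compute3": {    # the new host, facts gathered in this run
        "ansible_local": {
            "openstack_ansible": {"nova": {"software_version": "26.1.0"}}
        }
    },
}
groups = {"nova_all": ["compute1", "compute2", "compute3"]}
ansible_play_hosts = ["localhost", "compute3"]  # --limit localhost,compute3


def software_versions(hosts):
    """Collect the recorded nova version per host; KeyError if facts are missing."""
    return [
        hostvars[h]["ansible_local"]["openstack_ansible"]["nova"]["software_version"]
        for h in hosts
    ]


def limited_hosts():
    """Intersection of nova_all with the hosts in the current play, in group order."""
    return [h for h in groups["nova_all"] if h in ansible_play_hosts]


# Checking the whole group trips over the hosts whose facts have expired.
try:
    software_versions(groups["nova_all"])
except KeyError as exc:
    print("undefined attribute:", exc)

# Restricting the check to hosts in the play avoids the lookup failure.
print(software_versions(limited_hosts()))  # ['26.1.0']
```

In this model the lookup only succeeds for hosts whose facts were gathered in the current run, which is why restricting the list to the play hosts makes the error go away.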

Joseph Lenox (joseph-lenox) wrote :

I have been tripping over this issue with a Zed deployment for a week. Did you come across a workaround?

Joseph Lenox (joseph-lenox) wrote :

Here is what should be a patch for this, using intersect() to filter the group list.

Dmitriy Rabotyagov (noonedeadpunk) wrote :

Hi Joseph.

I'm not sure this patch is the correct one. Basically, the code that causes the issue is trying to identify whether all nova services are running the same version before running database upgrades.

With the patch you're proposing, it will report that all nova services are running the same version when only nova-api has been set up and all computes are still running the old version, because the playbook is imported multiple times for different sets of hosts here: https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/os-nova-install.yml

So the fix here is far more complex than that.
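
The objection above can be illustrated with a small Python model of the version check. It is a sketch under stated assumptions rather than the playbook's real logic: the host names, version strings and helper functions are hypothetical.

```python
# Hypothetical versions actually running on each host part-way through an upgrade.
running_versions = {
    "api1": "27.0.0",      # nova-api host, already upgraded
    "compute1": "26.1.0",  # still on the old version
    "compute2": "26.1.0",  # still on the old version
}
nova_all = ["api1", "compute1", "compute2"]


def all_same_version(hosts):
    """True when every host in `hosts` reports one and the same version."""
    return len({running_versions[h] for h in hosts}) == 1


def check_limited_to_play(play_hosts):
    """The same check, restricted to the intersection of nova_all and the play hosts."""
    return all_same_version([h for h in nova_all if h in play_hosts])


print(all_same_version(nova_all))       # False: the deployment is mixed
print(check_limited_to_play(["api1"]))  # True: the play only sees nova-api
```

When the play covers only the nova-api host, the restricted check passes even though the computes are still on the old version, which is the false positive described above.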

Changed in openstack-ansible:
status: New → Triaged
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)
Changed in openstack-ansible:
status: Triaged → In Progress
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Just in case: when adding compute nodes, the playbook fails at the point where all setup of the new compute host is already done (at least for nova). So while it doesn't succeed, it executes all the steps required for compute provisioning, and such errors can therefore be ignored.

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/880150
Committed: https://opendev.org/openstack/openstack-ansible/commit/e26b56adfa0dda60a3c8595dd249d902d0ed0c03
Submitter: "Zuul (22348)"
Branch: master

commit e26b56adfa0dda60a3c8595dd249d902d0ed0c03
Author: Dmitriy Rabotyagov <email address hidden>
Date: Wed Apr 12 14:19:24 2023 +0200

    Stop gathering local software_versions for services

    With latest ansible-core playbooks started failing on adding extra compute
    or controller nodes, when cinder/nova playbooks run with limits.

    This happens as we're trying to rely on local facts for hosts that are
    expired. At the same time, it's not always possible to collect them, as some computes
    can be down while adding another one.

    With that we're simplifying flow and avoid old process of
    restarting services or executing migrations based on local facts.

    Depends-On: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/880147
    Depends-On: https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/880210
    Closes-Bug: #2009834
    Change-Id: I44dc8567e9a93f91327202de1bf88a067266711d

Changed in openstack-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 27.0.0.0rc1

This issue was fixed in the openstack/openstack-ansible 27.0.0.0rc1 release candidate.
