[Rocky to Stein] Upgrade run fails with /usr/bin/python: No such file or directory

Bug #1856313 reported by Jose Luis Franco
This bug affects 1 person
Affects   Status         Importance   Assigned to        Milestone
tripleo   Fix Released   High         Jose Luis Franco

Bug Description

The overcloud upgrade run operation for the first controller fails with the following error:

2019-12-12 17:14:59 | PLAY [Gather facts from undercloud] ********************************************
2019-12-12 17:14:59 | skipping: no hosts matched
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | PLAY [Gather facts from overcloud] *********************************************
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | PLAY [Load global variables] ***************************************************
2019-12-12 17:14:59 | Thursday 12 December 2019 17:14:58 -0500 (0:00:00.132) 0:00:00.132 *****
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | TASK [include_vars] ************************************************************
2019-12-12 17:14:59 | ok: [controller-0] => {"ansible_facts": {"deploy_steps_max": 6, "ssh_known_hosts": {"compute-0": "[172.17.1.30]*,[compute-0.redhat.local]*,[compute-0]*,[172.17.3.19]*,[compute-0.storage.redhat.local]*,[compute-0.storage]*,[172.17.1.30]*,[compute-0.internalapi.redhat.local]*,[compute-0.internalapi]*,[172.17.2.10]*,[compute-0.tenant.redhat.local]*,[compute-0.tenant]*,[192.168.24.6]*,[compute-0.ctlplane.redhat.local]*,[compute-0.ctlplane]*", "compute-1": "[172.17.1.23]*,[compute-1.redhat.local]*,[compute-1]*,[172.17.3.34]*,[compute-1.storage.redhat.local]*,[compute-1.storage]*,[172.17.1.23]*,[compute-1.internalapi.redhat.local]*,[compute-1.internalapi]*,[172.17.2.29]*,[compute-1.tenant.redhat.local]*,[compute-1.tenant]*,[192.168.24.13]*,[compute-1.ctlplane.redhat.local]*,[compute-1.ctlplane]*", "controller-0": "[172.17.1.31]*,[controller-0.redhat.local]*,[controller-0]*,[172.17.3.22]*,[controller-0.storage.redhat.local]*,[controller-0.storage]*,[172.17.4.26]*,[controller-0.storagemgmt.redhat.local]*,[controller-0.storagemgmt]*,[172.17.1.31]*,[controller-0.internalapi.redhat.local]*,[controller-0.internalapi]*,[172.17.2.18]*,[controller-0.tenant.redhat.local]*,[controller-0.tenant]*,[10.0.0.110]*,[controller-0.external.redhat.local]*,[controller-0.external]*,[192.168.24.24]*,[controller-0.ctlplane.redhat.local]*,[controller-0.ctlplane]*", "controller-1": "[172.17.1.14]*,[controller-1.redhat.local]*,[controller-1]*,[172.17.3.24]*,[controller-1.storage.redhat.local]*,[controller-1.storage]*,[172.17.4.10]*,[controller-1.storagemgmt.redhat.local]*,[controller-1.storagemgmt]*,[172.17.1.14]*,[controller-1.internalapi.redhat.local]*,[controller-1.internalapi]*,[172.17.2.25]*,[controller-1.tenant.redhat.local]*,[controller-1.tenant]*,[10.0.0.106]*,[controller-1.external.redhat.local]*,[controller-1.external]*,[192.168.24.10]*,[controller-1.ctlplane.redhat.local]*,[controller-1.ctlplane]*", "controller-2": 
"[172.17.1.17]*,[controller-2.redhat.local]*,[controller-2]*,[172.17.3.28]*,[controller-2.storage.redhat.local]*,[controller-2.storage]*,[172.17.4.32]*,[controller-2.storagemgmt.redhat.local]*,[controller-2.storagemgmt]*,[172.17.1.17]*,[controller-2.internalapi.redhat.local]*,[controller-2.internalapi]*,[172.17.2.13]*,[controller-2.tenant.redhat.local]*,[controller-2.tenant]*,[10.0.0.104]*,[controller-2.external.redhat.local]*,[controller-2.external]*,[192.168.24.17]*,[controller-2.ctlplane.redhat.local]*,[controller-2.ctlplane]*"}}, "ansible_included_var_files": ["/var/lib/mistral/607d277d-df56-40ef-95bc-5bc7f46dac5f/global_vars.yaml"], "changed": false}
2019-12-12 17:14:59 | Thursday 12 December 2019 17:14:58 -0500 (0:00:00.079) 0:00:00.211 *****
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | TASK [ensure we get the right selinux context] *********************************
2019-12-12 17:14:59 | fatal: [controller-0]: FAILED! => {"changed": false, "module_stderr": "Warning: Permanently added '192.168.24.24' (ECDSA) to the list of known hosts.\r\n/bin/sh: /usr/bin/python: No such file or directory\n", "module_stdout": "", "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error", "rc": 127}
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | PLAY RECAP *********************************************************************
2019-12-12 17:14:59 | controller-0 : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
2019-12-12 17:14:59 |
2019-12-12 17:14:59 | Thursday 12 December 2019 17:14:59 -0500 (0:00:00.446) 0:00:00.658 *****
2019-12-12 17:14:59 | ===============================================================================

Checking the overcloud-0 facts cache shows that it has not been updated: the ansible_python binary still points to /usr/bin/python, which is no longer present at that point because the controller has been upgraded from RHEL7 to RHEL8.
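
The check described above can be sketched as a small helper. This is only an illustration: it assumes the cached facts are available as a dictionary that stores the interpreter under ansible_python -> executable, and the function name and the injectable path_exists callable are hypothetical. The existence test has to reflect the target host, not the machine that holds the cache.

```python
import os
from typing import Callable, Mapping


def cached_interpreter_is_stale(
    facts: Mapping[str, object],
    path_exists: Callable[[str], bool] = os.path.exists,
) -> bool:
    """Return True when the cached Python executable no longer exists."""
    # Assumption: the interpreter is cached under ansible_python -> executable.
    python_fact = facts.get("ansible_python")
    if not isinstance(python_fact, dict):
        return False  # nothing cached, so nothing can be stale
    executable = python_fact.get("executable")
    if not executable:
        return False
    # path_exists must answer for the target host (for example via SSH);
    # the default only inspects the local filesystem.
    return not path_exists(executable)
```

With facts cached before the RHEL8 upgrade and a path_exists callable that reports /usr/bin/python as missing, the helper returns True, which matches the situation in this report.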

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698891

Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/698894

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/699411

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/699439

Changed in tripleo:
milestone: ussuri-1 → ussuri-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/699411
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=6b4f47dd4245133294e459d9a7d0c38fc9af21a6
Submitter: Zuul
Branch: master

commit 6b4f47dd4245133294e459d9a7d0c38fc9af21a6
Author: Jose Luis Franco Arza <email address hidden>
Date: Tue Dec 17 14:18:51 2019 +0100

    Change the python interpreter discovery mode.

    The current default mode for Python interpreter discovery in Ansible 2.8 is
    auto_legacy. This patch changes the mode to auto. The biggest difference
    from auto_legacy is: 'If no entry is found, or the listed Python is
    not present on the target host, searches a list of common Python interpreter
    paths and uses the first one found' [0].
    Some issues have been observed with the
    discovered_python_interpreter fact not getting updated in specific scenarios
    (for example, when upgrading the node from RHEL7 to RHEL8). This change is
    expected to improve this situation.

    [0] - https://docs.ansible.com/ansible/latest/reference_appendices/interpreter_discovery.html#
    Related-Bug: #1856313
    Change-Id: Iaef4839bb15ec398537b5c57a441c8e28a552bc0
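
The fallback behaviour quoted in the commit message can be pictured with a minimal sketch. It is not Ansible's implementation: the function name, the candidate list, and the path_exists callable are assumptions made for illustration only.

```python
import os
from typing import Callable, Optional, Sequence

# Hypothetical candidate list for illustration; not Ansible's actual table.
FALLBACK_PATHS = (
    "/usr/bin/python3",
    "/usr/libexec/platform-python",
    "/usr/bin/python",
)


def pick_interpreter(
    listed: Optional[str],
    candidates: Sequence[str] = FALLBACK_PATHS,
    path_exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Use the listed interpreter if present, else the first candidate found."""
    if listed and path_exists(listed):
        return listed
    # No entry, or the listed Python is missing: search the common paths.
    for candidate in candidates:
        if path_exists(candidate):
            return candidate
    return None
```

On a host where /usr/bin/python has been removed, the sketch falls through to the first candidate that exists instead of failing on the missing path.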

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/701520

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/train)

Reviewed: https://review.opendev.org/701520
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=62d24f121efc40835581e0e0a48e82e0a6f392fb
Submitter: Zuul
Branch: stable/train

commit 62d24f121efc40835581e0e0a48e82e0a6f392fb
Author: Jose Luis Franco Arza <email address hidden>
Date: Tue Dec 17 14:18:51 2019 +0100

    Change the python interpreter discovery mode.

    The current default mode for Python interpreter discovery in Ansible 2.8 is
    auto_legacy. This patch changes the mode to auto. The biggest difference
    from auto_legacy is: 'If no entry is found, or the listed Python is
    not present on the target host, searches a list of common Python interpreter
    paths and uses the first one found' [0].
    Some issues have been observed with the
    discovered_python_interpreter fact not getting updated in specific scenarios
    (for example, when upgrading the node from RHEL7 to RHEL8). This change is
    expected to improve this situation.

    [0] - https://docs.ansible.com/ansible/latest/reference_appendices/interpreter_discovery.html#
    Related-Bug: #1856313
    Change-Id: Iaef4839bb15ec398537b5c57a441c8e28a552bc0
    (cherry picked from commit 6b4f47dd4245133294e459d9a7d0c38fc9af21a6)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/stein)

Reviewed: https://review.opendev.org/699439
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=0e559fc7cba7f9c3eff7c36349790d90bb761960
Submitter: Zuul
Branch: stable/stein

commit 0e559fc7cba7f9c3eff7c36349790d90bb761960
Author: Jose Luis Franco Arza <email address hidden>
Date: Tue Dec 17 14:18:51 2019 +0100

    Change the python interpreter discovery mode.

    The current default mode for Python interpreter discovery in Ansible 2.8 is
    auto_legacy. This patch changes the mode to auto. The biggest difference
    from auto_legacy is: 'If no entry is found, or the listed Python is
    not present on the target host, searches a list of common Python interpreter
    paths and uses the first one found' [0].
    Some issues have been observed with the
    discovered_python_interpreter fact not getting updated in specific scenarios
    (for example, when upgrading the node from RHEL7 to RHEL8). This change is
    expected to improve this situation.

    [0] - https://docs.ansible.com/ansible/latest/reference_appendices/interpreter_discovery.html#
    Related-Bug: #1856313
    Change-Id: Iaef4839bb15ec398537b5c57a441c8e28a552bc0
    (cherry picked from commit 6b4f47dd4245133294e459d9a7d0c38fc9af21a6)
    (cherry picked from commit 62d24f121efc40835581e0e0a48e82e0a6f392fb)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/698891
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=14db20baee3529b223a87e9f73279c6034a47427
Submitter: Zuul
Branch: master

commit 14db20baee3529b223a87e9f73279c6034a47427
Author: Jose Luis Franco Arza <email address hidden>
Date: Fri Dec 13 13:39:17 2019 +0100

    Force facts cache refreshing before upgrade.

    When upgrading from Rocky to Stein, an upgrade of the operating system is
    performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
    default /usr/bin/python binary. Because the facts cache is enabled, Ansible's
    strategy does not refresh the facts, and therefore we try to run the
    Ansible playbook using the old python binary when running the upgrade.
    This fails with the error: /usr/bin/python: No such file or directory.

    This patch makes use of the setup task in combination with gather_facts
    false, to ensure that the facts are gathered and refreshed for the
    Overcloud nodes. This way, we make sure that we are using the right
    python binary. As a similar situation occurs during scale operations, this
    patch adds the same logic in scale_playbook.

    Closes-Bug: #1856313
    Change-Id: I87974e88c38b42e90bc3cd801fcf1deaf268720c
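
The stale-cache behaviour and the effect of a forced refresh can be illustrated with a toy model. This is not Ansible's cache implementation: the FactsCache class, its methods, and the "python" key are hypothetical and only mirror the reasoning in the commit message.

```python
from typing import Callable, Dict

Facts = Dict[str, str]


class FactsCache:
    """Toy per-host facts cache that reuses entries unless told to refresh."""

    def __init__(self, gather: Callable[[str], Facts]) -> None:
        self._gather = gather  # callable that collects facts from a host
        self._store: Dict[str, Facts] = {}

    def get(self, host: str, refresh: bool = False) -> Facts:
        # Cached facts win unless a refresh is forced, which is why a
        # removed /usr/bin/python can survive in the cache.
        if refresh or host not in self._store:
            self._store[host] = self._gather(host)
        return self._store[host]

    def clear(self, host: str) -> None:
        # Dropping the entry makes the next get() gather again.
        self._store.pop(host, None)
```

In this model, a plain get() after the operating system upgrade still returns the old interpreter path, while get(host, refresh=True) or clear(host) followed by get(host) returns the new one.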

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/704138

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/704138
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f632ea38ac1f5c68d0659d5a8d0fb51bf7844eb4
Submitter: Zuul
Branch: stable/train

commit f632ea38ac1f5c68d0659d5a8d0fb51bf7844eb4
Author: Jose Luis Franco Arza <email address hidden>
Date: Fri Dec 13 13:39:17 2019 +0100

    Force facts cache refreshing before upgrade.

    When upgrading from Rocky to Stein, an upgrade of the operating system is
    performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
    default /usr/bin/python binary. Because the facts cache is enabled, Ansible's
    strategy does not refresh the facts, and therefore we try to run the
    Ansible playbook using the old python binary when running the upgrade.
    This fails with the error: /usr/bin/python: No such file or directory.

    This patch makes use of the setup task in combination with gather_facts
    false, to ensure that the facts are gathered and refreshed for the
    Overcloud nodes. This way, we make sure that we are using the right
    python binary. As a similar situation occurs during scale operations, this
    patch adds the same logic in scale_playbook.

    Closes-Bug: #1856313
    Change-Id: I87974e88c38b42e90bc3cd801fcf1deaf268720c
    (cherry picked from commit 14db20baee3529b223a87e9f73279c6034a47427)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/704947

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/704947
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=598cf6977fcc37b1b4bcc1463b6c7e80f7037b5e
Submitter: Zuul
Branch: master

commit 598cf6977fcc37b1b4bcc1463b6c7e80f7037b5e
Author: Jose Luis Franco Arza <email address hidden>
Date: Thu Jan 30 10:54:58 2020 +0100

    Force facts cache refreshing after OS upgrade.

    When upgrading from Rocky to Stein, an upgrade of the operating system is
    performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
    default /usr/bin/python binary. Because the facts cache is enabled, Ansible's
    strategy does not refresh the facts, and therefore we try to run the
    Ansible playbook using the old python binary when running the upgrade.
    This fails with the error: /usr/bin/python: No such file or directory.

    This patch makes use of the setup task in combination with clear_facts
    right after rebooting to upgrade the operating system, to ensure that
    the facts are gathered and refreshed for the Overcloud node just upgraded.
    This way, we make sure that we are using the right python binary.

    Closes-Bug: #1856313
    Change-Id: Ia1fa60c22e482ab14a509730cf93634772e077a7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.1.0

This issue was fixed in the openstack/tripleo-heat-templates 12.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/708343

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/708343
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=0cd97e44cd4027805b74240a04a6475d5f2eaaa2
Submitter: Zuul
Branch: stable/train

commit 0cd97e44cd4027805b74240a04a6475d5f2eaaa2
Author: Jose Luis Franco Arza <email address hidden>
Date: Thu Jan 30 10:54:58 2020 +0100

    Force facts cache refreshing after OS upgrade.

    When upgrading from Rocky to Stein, an upgrade of the operating system is
    performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
    default /usr/bin/python binary. Because the facts cache is enabled, Ansible's
    strategy does not refresh the facts, and therefore we try to run the
    Ansible playbook using the old python binary when running the upgrade.
    This fails with the error: /usr/bin/python: No such file or directory.

    This patch makes use of the setup task in combination with clear_facts
    right after rebooting to upgrade the operating system, to ensure that
    the facts are gathered and refreshed for the Overcloud node just upgraded.
    This way, we make sure that we are using the right python binary.

    Closes-Bug: #1856313
    Change-Id: Ia1fa60c22e482ab14a509730cf93634772e077a7
    (cherry picked from commit 598cf6977fcc37b1b4bcc1463b6c7e80f7037b5e)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/708863

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (stable/stein)

Change abandoned by Jose Luis Franco (<email address hidden>) on branch: stable/stein
Review: https://review.opendev.org/698894
Reason: Abandon in favor of https://review.opendev.org/#/c/708863/1

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/708863
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b873e1e620f702a0a779417e683f8e1972cdceee
Submitter: Zuul
Branch: stable/stein

commit b873e1e620f702a0a779417e683f8e1972cdceee
Author: Jose Luis Franco Arza <email address hidden>
Date: Thu Jan 30 10:54:58 2020 +0100

    Force facts cache refreshing after OS upgrade.

    When upgrading from Rocky to Stein, an upgrade of the operating system is
    performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
    default /usr/bin/python binary. Because the facts cache is enabled, Ansible's
    strategy does not refresh the facts, and therefore we try to run the
    Ansible playbook using the old python binary when running the upgrade.
    This fails with the error: /usr/bin/python: No such file or directory.

    This patch makes use of the setup task in combination with clear_facts
    right after rebooting to upgrade the operating system, to ensure that
    the facts are gathered and refreshed for the Overcloud node just upgraded.
    This way, we make sure that we are using the right python binary.

    Closes-Bug: #1856313
    Change-Id: Ia1fa60c22e482ab14a509730cf93634772e077a7
    (cherry picked from commit 598cf6977fcc37b1b4bcc1463b6c7e80f7037b5e)
    (cherry picked from commit 0cd97e44cd4027805b74240a04a6475d5f2eaaa2)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.2.0

This issue was fixed in the openstack/tripleo-heat-templates 12.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates stein-eol

This issue was fixed in the openstack/tripleo-heat-templates stein-eol release.
