Overcloud deployment continues without external tasks if undercloud is "unreachable"

Bug #1960518 reported by Cédric Jeanneret
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Medium
Assigned to: Cédric Jeanneret
Milestone: (none listed)

Bug Description

First reported on Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2049393

Description of problem:

When SSH from the undercloud to itself fails, the Ansible playbook starts ignoring tasks that target the undercloud.
Because the external deploy tasks run from the undercloud, this leaves the configuration incomplete. In our case, the deployment failed when starting containers in step 4, because the tasks that create the Keystone resources were never invoked.
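
To illustrate the behaviour being asked for, here is a minimal Python sketch. It is not the tripleo-common code or the tripleo_free strategy itself; the class, method, and host names are hypothetical. It only models the idea that an unreachable overcloud node can be skipped, while an unreachable undercloud should abort the whole run, since the external deploy tasks depend on it.

```python
from dataclasses import dataclass, field


class CriticalHostUnreachable(RuntimeError):
    """Raised when a host the whole deployment depends on is unreachable."""


@dataclass
class DeploymentRun:
    # Hypothetical inventory name; assumed here to be "undercloud".
    critical_hosts: frozenset = frozenset({"undercloud"})
    unreachable: set = field(default_factory=set)

    def record_unreachable(self, host: str) -> None:
        """Mark a host as unreachable; stop everything if it is critical."""
        self.unreachable.add(host)
        if host in self.critical_hosts:
            raise CriticalHostUnreachable(
                f"{host} is unreachable, aborting the deployment"
            )

    def runnable_hosts(self, hosts: list) -> list:
        """Return the hosts that tasks may still be scheduled on."""
        return [h for h in hosts if h not in self.unreachable]


if __name__ == "__main__":
    run = DeploymentRun()
    # A non-critical node going away only removes it from later tasks.
    run.record_unreachable("overcloud-compute-0")
    print(run.runnable_hosts(["undercloud", "overcloud-compute-0"]))
    # The undercloud going away stops the run instead of skipping its tasks.
    try:
        run.record_unreachable("undercloud")
    except CriticalHostUnreachable as exc:
        print(exc)
```

Without a check of this kind, the undercloud would simply drop out of the list of runnable hosts and its tasks would be skipped silently, which matches the symptom described above.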

Version-Release number of selected component (if applicable):
16.2.1

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
Deployment continues and later fails with an error during configuration that does not look related to the undercloud being unreachable.

Expected results:
Deployment fails at an early stage because the undercloud is unreachable.

Additional info:

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-common/+/828721

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/828721
Committed: https://opendev.org/openstack/tripleo-common/commit/80b294ba884ca8ae67137a9885e02edef0b08d4e
Submitter: "Zuul (22348)"
Branch: master

commit 80b294ba884ca8ae67137a9885e02edef0b08d4e
Author: Cédric Jeanneret <email address hidden>
Date: Thu Feb 10 15:38:05 2022 +0100

    Ensure failures on the undercloud leads to a complete stop

    With tripleo_free strategy, we might hit situation where the undercloud
    is unreachable for some reasons, preventing external tasks to happen on
    the overcloud nodes.

    This is especially true for older releases, such as train, where
    mistral was still used in order to orchestrate the different playbooks
    and runs.

    Change-Id: I278fdc9597f83f1dc8390569be9716d3c8847dc4
    Closes-Bug: #1960518

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-common/+/829201

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/tripleo-common/+/829205

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/829201
Committed: https://opendev.org/openstack/tripleo-common/commit/5565ecdf68162f7a22f5c570e8d272a39f403826
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 5565ecdf68162f7a22f5c570e8d272a39f403826
Author: Cédric Jeanneret <email address hidden>
Date: Thu Feb 10 15:38:05 2022 +0100

    Ensure failures on the undercloud leads to a complete stop

    With tripleo_free strategy, we might hit situation where the undercloud
    is unreachable for some reasons, preventing external tasks to happen on
    the overcloud nodes.

    This is especially true for older releases, such as train, where
    mistral was still used in order to orchestrate the different playbooks
    and runs.

    Change-Id: I278fdc9597f83f1dc8390569be9716d3c8847dc4
    Closes-Bug: #1960518
    (cherry picked from commit 80b294ba884ca8ae67137a9885e02edef0b08d4e)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/829205
Committed: https://opendev.org/openstack/tripleo-common/commit/33f1eecb688855688cd2e88452294ede09590e83
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 33f1eecb688855688cd2e88452294ede09590e83
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 15 10:47:13 2022 +0100

    Ensure failures on the undercloud leads to a complete stop

    With tripleo_free strategy, we might hit situation where the undercloud
    is unreachable for some reasons, preventing external tasks to happen on
    the overcloud nodes.

    This is especially true for older releases, such as train, where
    mistral was still used in order to orchestrate the different playbooks
    and runs.

    Change-Id: I278fdc9597f83f1dc8390569be9716d3c8847dc4
    Closes-Bug: #1960518
    (cherry picked from commit 80b294ba884ca8ae67137a9885e02edef0b08d4e)
    (cherry picked from commit 5565ecdf68162f7a22f5c570e8d272a39f403826)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/tripleo-common/+/830711

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/830711
Committed: https://opendev.org/openstack/tripleo-common/commit/10b421198ebdd53dbc07e03deb258e519723cbd2
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 10b421198ebdd53dbc07e03deb258e519723cbd2
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 15 10:47:13 2022 +0100

    Ensure failures on the undercloud leads to a complete stop

    With tripleo_free strategy, we might hit situation where the undercloud
    is unreachable for some reasons, preventing external tasks to happen on
    the overcloud nodes.

    This is especially true for older releases, such as train, where
    mistral was still used in order to orchestrate the different playbooks
    and runs.

    Change-Id: I278fdc9597f83f1dc8390569be9716d3c8847dc4
    Closes-Bug: #1960518
    (cherry picked from commit 80b294ba884ca8ae67137a9885e02edef0b08d4e)
    (cherry picked from commit 5565ecdf68162f7a22f5c570e8d272a39f403826)
    (cherry picked from commit 33f1eecb688855688cd2e88452294ede09590e83)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/tripleo-common/+/830931

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 12.4.7

This issue was fixed in the openstack/tripleo-common 12.4.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/830931
Committed: https://opendev.org/openstack/tripleo-common/commit/b5ef9a568e7247aeb74ea20832be5d88a24bd124
Submitter: "Zuul (22348)"
Branch: stable/train

commit b5ef9a568e7247aeb74ea20832be5d88a24bd124
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 15 10:47:13 2022 +0100

    Ensure failures on the undercloud leads to a complete stop

    With tripleo_free strategy, we might hit situation where the undercloud
    is unreachable for some reasons, preventing external tasks to happen on
    the overcloud nodes.

    This is especially true for older releases, such as train, where
    mistral was still used in order to orchestrate the different playbooks
    and runs.

    Change-Id: I278fdc9597f83f1dc8390569be9716d3c8847dc4
    Closes-Bug: #1960518
    (cherry picked from commit 80b294ba884ca8ae67137a9885e02edef0b08d4e)
    (cherry picked from commit 5565ecdf68162f7a22f5c570e8d272a39f403826)
    (cherry picked from commit 33f1eecb688855688cd2e88452294ede09590e83)
    (cherry picked from commit 10b421198ebdd53dbc07e03deb258e519723cbd2)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 16.4.0

This issue was fixed in the openstack/tripleo-common 16.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 13.3.0

This issue was fixed in the openstack/tripleo-common 13.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common train-eol

This issue was fixed in the openstack/tripleo-common train-eol release.
