Race sometimes causes successful deployments to return failed

Bug #1842987 reported by Steve Baker
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Steve Baker
Milestone: none

Bug Description

These lines [1] contain a race condition: the fetched execution can be in state:SUCCESS while the last payload received still has status:RUNNING. When that happens, the status:SUCCESS payload is never returned to the caller, and the deployment is reported as failed even though it succeeded.

Triggering this seems to be rare, but it does happen, and the same issue appears to go back to Queens.

[1] https://opendev.org/openstack/python-tripleoclient/src/branch/master/tripleoclient/workflows/base.py#L94-L95
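
The race described above can be reproduced deterministically with a small sketch. This is not the tripleoclient code: `wait_for_messages_racy`, `deployment_succeeded`, and the `get_state` callback are hypothetical stand-ins for the websocket loop, the calling logic, and the Mistral execution poll, simplified to plain Python.

```python
def wait_for_messages_racy(payloads, get_state):
    """Yield websocket payloads until a final payload or a final polled state.

    Hypothetical simplification of the loop the bug report points at.
    """
    for payload in payloads:
        yield payload
        # A payload without a status is treated as an in-progress message.
        if payload.get("status", "RUNNING") != "RUNNING":
            return
        # Poll the execution. If it finished after the payload above was
        # sent, the loop exits here and the SUCCESS payload is never yielded.
        if get_state() != "RUNNING":
            return


def deployment_succeeded(payloads):
    """Hypothetical caller: success requires a final status:SUCCESS payload."""
    last = None
    for last in payloads:
        pass
    return last is not None and last.get("status") == "SUCCESS"


# The websocket delivers a RUNNING payload, and the execution flips to
# SUCCESS before the state is polled.
queued = iter([{"status": "RUNNING"}, {"status": "SUCCESS"}])
racy_result = deployment_succeeded(
    wait_for_messages_racy(queued, get_state=lambda: "SUCCESS"))
print(racy_result)
```

In this sketch the caller only ever sees the RUNNING payload, so it treats the deployment as failed. If `get_state` still returns "RUNNING" at the poll, the SUCCESS payload is read from the queue on the next iteration and the caller succeeds, which is why the failure appears only occasionally.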

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (master)

Reviewed: https://review.opendev.org/679873
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=b5b5cab61da98c8bcf2c4e52a6d8ce0108dcfc64
Submitter: Zuul
Branch: master

commit b5b5cab61da98c8bcf2c4e52a6d8ce0108dcfc64
Author: Steve Baker <email address hidden>
Date: Wed Sep 4 10:09:04 2019 +1200

    Fix race in execution finishing

    An execution state can go from RUNNING to SUCCESS between fetching the
    last message from the websocket and polling the execution state. This
    means the SUCCESS payload is never returned and the overcloud
    deployment fails at the end with no indication as to why.

    This change turns the output of the execution into the last payload,
    allowing the calling SUCCESS logic to run.

    Change-Id: Ic22021ba9a2717de199629e361c656e2f562fb38
    Closes-Bug: #1842987
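
The approach in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged patch: `wait_for_messages_fixed` and `get_execution` are hypothetical names, the execution is modelled as a plain dict, and its `output` is assumed to be a JSON string that carries the final status.

```python
import json


def wait_for_messages_fixed(payloads, get_execution):
    """Yield payloads, falling back to the execution output as the last one.

    Hypothetical simplification; `get_execution` stands in for polling the
    workflow execution and returns a dict with "state" and "output" keys.
    """
    for payload in payloads:
        yield payload
        # A payload without a status is treated as an in-progress message.
        if payload.get("status", "RUNNING") != "RUNNING":
            return
        execution = get_execution()
        if execution["state"] != "RUNNING":
            # The execution finished between the last websocket message and
            # this poll, so yield its output as the final payload.
            yield json.loads(execution["output"])
            return


# Assumed shape of a finished execution (illustrative data only).
finished = {
    "state": "SUCCESS",
    "output": json.dumps({"status": "SUCCESS", "message": "done"}),
}
received = list(
    wait_for_messages_fixed(iter([{"status": "RUNNING"}]), lambda: finished))
print(received[-1]["status"])
```

Because the final payload now always reaches the caller, the existing SUCCESS handling on the calling side can run without any change there.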

Changed in tripleo:
status: In Progress → Fix Released
tags: added: queens-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/681243

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (stable/stein)

Reviewed: https://review.opendev.org/681243
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=b0c714e3226b585751988cefdbed9a84e4df0599
Submitter: Zuul
Branch: stable/stein

commit b0c714e3226b585751988cefdbed9a84e4df0599
Author: Steve Baker <email address hidden>
Date: Wed Sep 4 10:09:04 2019 +1200

    Fix race in execution finishing

    An execution state can go from RUNNING to SUCCESS between fetching the
    last message from the websocket and polling the execution state. This
    means the SUCCESS payload is never returned and the overcloud
    deployment fails at the end with no indication as to why.

    This change turns the output of the execution into the last payload,
    allowing the calling SUCCESS logic to run.

    Change-Id: Ic22021ba9a2717de199629e361c656e2f562fb38
    Closes-Bug: #1842987
    (cherry picked from commit b5b5cab61da98c8bcf2c4e52a6d8ce0108dcfc64)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 12.2.0

This issue was fixed in the openstack/python-tripleoclient 12.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 11.5.2

This issue was fixed in the openstack/python-tripleoclient 11.5.2 release.
