openstack overcloud failures|status sometimes shows incorrect output (from the deployment process)

Bug #1794277 reported by James Slagle
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   Medium       James Slagle

Bug Description

Description of problem:

I logged into the environment from two SSH terminals.

In the first terminal I ran openstack overcloud failures twice.
The first attempt failed with a timeout.

After that I ran it again.
In the second terminal I ran a script that redeploys the overcloud.
As the screenshot [1] shows, the first terminal, where I ran openstack overcloud failures, printed part of the stdout from the second terminal, where overcloud deploy was running, as if in "tail -F" mode.

The same happens with openstack overcloud status.

Version-Release number of selected component (if applicable):

How reproducible:
Whenever openstack overcloud commands are run in parallel with an overcloud deployment.

Steps to Reproduce:
1. Start an overcloud deployment.
2. Open a new terminal and run openstack overcloud status or openstack overcloud failures.

Actual results:
Output from the overcloud deployment is shown instead of the deployment failures or the overcloud status.

Expected results:
The deployment failures or the overcloud status.

Additional info:

Changed in tripleo:
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → James Slagle (james-slagle)
milestone: none → stein-1
tags: added: rocky-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/605058

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/605520

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/606064

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/605058
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=ec2e018457fd787337a7c1951672f6e1c3bfd8bb
Submitter: Zuul
Branch: master

commit ec2e018457fd787337a7c1951672f6e1c3bfd8bb
Author: James Slagle <email address hidden>
Date: Tue Sep 25 08:29:17 2018 -0400

    Use sync action get_deployment_failures

    Instead of using the workflow, which has the extra overhead of opening a
    websocket and polling for messages and the workflow result, just use the
    mistral action for get_deployment_failures directly. This is much
    simpler.

    It also fixes a bug where messages from other running workflows using
    the same "tripleo" queue were polluting the output of the "overcloud
    failures" command.

    Change-Id: Ie774a5698515ba7e43dc0755042fb476eb241fc7
    Closes-Bug: #1794277
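To illustrate why calling the action directly avoids the polluted output, here is a minimal Python sketch. It is not the tripleoclient code: `shared_queue`, `failures_via_workflow`, `failures_via_sync_action`, and the stand-in `get_deployment_failures` function are hypothetical names, and the message shapes are assumptions made for the example.

```python
from collections import deque

# Hypothetical stand-in for the shared "tripleo" message queue.
shared_queue = deque()


def get_deployment_failures(plan):
    """Hypothetical stand-in for the get_deployment_failures action."""
    return {"plan": plan, "failures": {}}


def failures_via_workflow(plan):
    """Old approach (simplified): the result travels over the shared queue."""
    shared_queue.append(
        {"source": "failures", "payload": get_deployment_failures(plan)}
    )
    # The client drains whatever is on the queue, including messages
    # posted by other workflows that use the same queue name.
    return [shared_queue.popleft() for _ in range(len(shared_queue))]


def failures_via_sync_action(plan):
    """New approach (simplified): use the action's return value directly."""
    return get_deployment_failures(plan)


# A deployment in another terminal posts progress to the same queue.
shared_queue.append({"source": "deploy", "payload": "deployment progress output"})

polluted = failures_via_workflow("overcloud")
clean = failures_via_sync_action("overcloud")

print([m["source"] for m in polluted])
print(clean)
```

In this sketch the workflow-style call returns the deployment message alongside the failures result, while the direct call never reads the queue and so cannot pick up unrelated messages.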

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/612165

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (stable/rocky)

Reviewed: https://review.openstack.org/612165
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=a57bd4868d1b9399aa4f61870f40ea412069ebd1
Submitter: Zuul
Branch: stable/rocky

commit a57bd4868d1b9399aa4f61870f40ea412069ebd1
Author: James Slagle <email address hidden>
Date: Tue Sep 25 08:29:17 2018 -0400

    Use sync action get_deployment_failures

    Instead of using the workflow, which has the extra overhead of opening a
    websocket and polling for messages and the workflow result, just use the
    mistral action for get_deployment_failures directly. This is much
    simpler.

    It also fixes a bug where messages from other running workflows using
    the same "tripleo" queue were polluting the output of the "overcloud
    failures" command.

    Change-Id: Ie774a5698515ba7e43dc0755042fb476eb241fc7
    Closes-Bug: #1794277
    (cherry picked from commit ec2e018457fd787337a7c1951672f6e1c3bfd8bb)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 11.1.0

This issue was fixed in the openstack/python-tripleoclient 11.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/606064
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=678d56461961444d64e9cf731fc04b25babaad05
Submitter: Zuul
Branch: master

commit 678d56461961444d64e9cf731fc04b25babaad05
Author: James Slagle <email address hidden>
Date: Thu Sep 27 19:17:08 2018 -0400

    Pass execution_id to tripleo.ansible-playbook.

    Passing the execution_id to the tripleo.ansible-playbook action will
    make it such that the execution_id is included in any messages sent on
    the queue.

    This is needed so that tripleoclient can filter by execution id and
    discard messages from workflows it did not start, so that they won't
    be shown.

    The tripleoclient patch to filter on execution_id is
    https://review.openstack.org/#/c/605520/, but first we must land this
    patch so that execution_id is added as an input to these actions.

    Change-Id: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Related-Bug: #1794277
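As a rough illustration of tagging queue messages with the execution that produced them, consider the Python sketch below. The helper `build_queue_message` and the message layout are hypothetical and are not taken from tripleo-common; the only idea borrowed from the commit message is that an execution id passed into the action ends up in every message it sends.

```python
def build_queue_message(body, execution_id=None):
    """Build a message for the shared queue, tagged with its execution if known.

    Hypothetical helper: the real action's message format may differ.
    """
    message = {"body": body}
    if execution_id is not None:
        # Consumers can later match this against the execution they started.
        message["execution_id"] = execution_id
    return message


tagged = build_queue_message("step done", execution_id="abc-123")
untagged = build_queue_message("step done")
print(tagged)
print(untagged)
```

Without the tag, a consumer has no way to tell which workflow a message came from, which is why this change had to land before the client-side filter.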

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/605520
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=339c1f334c5b4dba7a95740e35b63aab931601a1
Submitter: Zuul
Branch: master

commit 339c1f334c5b4dba7a95740e35b63aab931601a1
Author: James Slagle <email address hidden>
Date: Wed Sep 26 16:08:10 2018 -0400

    Filter messages not from waiting execution

    The convention is to use the same queue name ("tripleo") for all
    workflows. This can lead to messages from other tripleoclient
    triggered workflows showing up during message polling if multiple
    workflows are running at the same time.

    This patch adds a check that will filter out any messages that do not
    belong to the execution that is being waited on by comparing the
    execution id with the root_execution_id returned in the execution
    payload.

    Depends-On: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Change-Id: Ie6473d6a1044cdf76552d62645b4d63da2df9398
    Related-Bug: #1794277
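The filtering idea can be sketched in Python as follows. This is an illustration under assumptions, not the tripleoclient implementation: `belongs_to`, `filter_messages`, and the `execution` dictionary with `id` and `root_execution_id` keys are hypothetical shapes chosen for the example. The sketch also accepts messages whose own `id` matches, on the assumption that the root execution itself may carry no `root_execution_id`.

```python
def belongs_to(message, waiting_execution_id):
    """Return True if the message comes from the execution being waited on.

    Assumed message shape: {"execution": {"id": ..., "root_execution_id": ...}}
    """
    execution = message.get("execution", {})
    return waiting_execution_id in (
        execution.get("id"),
        execution.get("root_execution_id"),
    )


def filter_messages(messages, waiting_execution_id):
    """Keep only the messages that belong to the waited-on execution."""
    return [m for m in messages if belongs_to(m, waiting_execution_id)]


# Messages from two workflows sharing one queue (made-up example data).
messages = [
    {"execution": {"id": "exec-1", "root_execution_id": None}, "body": "status"},
    {"execution": {"id": "sub-7", "root_execution_id": "exec-1"}, "body": "status detail"},
    {"execution": {"id": "exec-2", "root_execution_id": None}, "body": "deploy output"},
]

shown = filter_messages(messages, "exec-1")
print([m["body"] for m in shown])
```

Here the message from `exec-2`, standing in for a concurrent deployment, is dropped, while messages from `exec-1` and its sub-execution are kept.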

OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/642919

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (stable/queens)

Reviewed: https://review.openstack.org/642919
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=5a996ffa7851e89b9b8a0198a6bacf2a0d45fa0a
Submitter: Zuul
Branch: stable/queens

commit 5a996ffa7851e89b9b8a0198a6bacf2a0d45fa0a
Author: Luke Short <email address hidden>
Date: Wed Mar 13 09:29:02 2019 -0400

    Use sync action get_deployment_failures

    Instead of using the workflow, which has the extra overhead of opening a
    websocket and polling for messages and the workflow result, just use the
    mistral action for get_deployment_failures directly. This is much
    simpler.

    It also fixes a bug where messages from other running workflows using
    the same "tripleo" queue were polluting the output of the "overcloud
    failures" command.

    Conflicts:
        tripleoclient/workflows/deployment.py

    Change-Id: Ie774a5698515ba7e43dc0755042fb476eb241fc7
    Closes-Bug: #1794277
    (cherry picked from commit ec2e018457fd787337a7c1951672f6e1c3bfd8bb)
    (cherry picked from commit a57bd4868d1b9399aa4f61870f40ea412069ebd1)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 10.6.1

This issue was fixed in the openstack/python-tripleoclient 10.6.1 release.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/663656

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/663687

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/663876

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/663879

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/rocky)

Reviewed: https://review.opendev.org/663876
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=b7da114421f655b204554195ece4912def1fe0e4
Submitter: Zuul
Branch: stable/rocky

commit b7da114421f655b204554195ece4912def1fe0e4
Author: James Slagle <email address hidden>
Date: Thu Sep 27 19:17:08 2018 -0400

    Pass execution_id to tripleo.ansible-playbook.

    Passing the execution_id to the tripleo.ansible-playbook action will
    make it such that the execution_id is included in any messages sent on
    the queue.

    This is needed so that tripleoclient can filter by execution id and
    discard messages from workflows it did not start, so that they won't
    be shown.

    The tripleoclient patch to filter on execution_id is
    https://review.openstack.org/#/c/605520/, but first we must land this
    patch so that execution_id is added as an input to these actions.

    Change-Id: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Related-Bug: #1794277
    (cherry picked from commit 678d56461961444d64e9cf731fc04b25babaad05)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 9.3.0

This issue was fixed in the openstack/python-tripleoclient 9.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (stable/queens)

Change abandoned by Alex Schultz (<email address hidden>) on branch: stable/queens
Review: https://review.opendev.org/663879
Reason: this is blocking the stable/rocky backports. let's restore this after the rocky bits have fully landed

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (stable/rocky)

Reviewed: https://review.opendev.org/663656
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=e9849d04c7d234b7289e387f9acee0aa1fbac028
Submitter: Zuul
Branch: stable/rocky

commit e9849d04c7d234b7289e387f9acee0aa1fbac028
Author: James Slagle <email address hidden>
Date: Wed Sep 26 16:08:10 2018 -0400

    Filter messages not from waiting execution

    The convention is to use the same queue name ("tripleo") for all
    workflows. This can lead to messages from other tripleoclient
    triggered workflows showing up during message polling if multiple
    workflows are running at the same time.

    This patch adds a check that will filter out any messages that do not
    belong to the execution that is being waited on by comparing the
    execution id with the root_execution_id returned in the execution
    payload.

    Depends-On: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Change-Id: Ie6473d6a1044cdf76552d62645b4d63da2df9398
    Related-Bug: #1794277
    (cherry picked from commit 339c1f334c5b4dba7a95740e35b63aab931601a1)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/queens)

Reviewed: https://review.opendev.org/663879
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=ff147b237bbcd10549c1931408155dfa4ce78587
Submitter: Zuul
Branch: stable/queens

commit ff147b237bbcd10549c1931408155dfa4ce78587
Author: James Slagle <email address hidden>
Date: Thu Sep 27 19:17:08 2018 -0400

    Pass execution_id to tripleo.ansible-playbook.

    Passing the execution_id to the tripleo.ansible-playbook action will
    make it such that the execution_id is included in any messages sent on
    the queue.

    This is needed so that tripleoclient can filter by execution id and
    discard messages from workflows it did not start, so that they won't
    be shown.

    The tripleoclient patch to filter on execution_id is
    https://review.openstack.org/#/c/605520/, but first we must land this
    patch so that execution_id is added as an input to these actions.

    Change-Id: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Related-Bug: #1794277
    (cherry picked from commit 678d56461961444d64e9cf731fc04b25babaad05)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (stable/queens)

Reviewed: https://review.opendev.org/663687
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=074bb6e3a06f1d49a21506ffbd16b521db462c9c
Submitter: Zuul
Branch: stable/queens

commit 074bb6e3a06f1d49a21506ffbd16b521db462c9c
Author: James Slagle <email address hidden>
Date: Wed Sep 26 16:08:10 2018 -0400

    Filter messages not from waiting execution

    The convention is to use the same queue name ("tripleo") for all
    workflows. This can lead to messages from other tripleoclient
    triggered workflows showing up during message polling if multiple
    workflows are running at the same time.

    This patch adds a check that will filter out any messages that do not
    belong to the execution that is being waited on by comparing the
    execution id with the root_execution_id returned in the execution
    payload.

    Depends-On: Icbe80c338d69efc6ce8fceb0f73f833bec588536
    Change-Id: Ie6473d6a1044cdf76552d62645b4d63da2df9398
    Related-Bug: #1794277
    (cherry picked from commit 339c1f334c5b4dba7a95740e35b63aab931601a1)
