Updating plans breaks deployment

Bug #1622683 reported by Dougal Matthews on 2016-09-12
This bug affects 3 people
Affects: tripleo
Importance: Critical
Assigned to: Dougal Matthews

Bug Description

A fix was rushed out to resolve https://bugs.launchpad.net/tripleo/+bug/1621462. It only ever passed CI because the default plan creation wasn't working - this meant that the plan was never being updated.

The problem is visible in the review below: it fixes the default plan, after which CI attempts to update the plan and deploy, and the deployment fails. https://review.openstack.org/#/c/368760/

We need to:
1. Fix the plan updating
2. Land the fix for the default plan
3. Make sure CI is verifying the default plan and fails if it doesn't work.

Fix proposed to branch: master
Review: https://review.openstack.org/369247

Changed in tripleo:
status: Confirmed → In Progress
Dougal Matthews (d0ugal) wrote :

Number 3 should be resolved by: https://review.openstack.org/#/c/369247

Dougal Matthews (d0ugal) wrote :

Number 1 and 2 should be resolved by: https://review.openstack.org/#/c/368760/

Reviewed: https://review.openstack.org/368760
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=af27127508eabf2b6873713e5e1507fa92b5f5b3
Submitter: Jenkins
Branch: master

commit af27127508eabf2b6873713e5e1507fa92b5f5b3
Author: Dougal Matthews <email address hidden>
Date: Mon Sep 12 11:24:30 2016 +0000

    Add template processing to the update plan workflow.

    This was recently added to the create workflow, but as update
    was added at the same time it wasn't included.

    Partial-Bug: #1622683
    Closes-Bug: #1622720

    Change-Id: Ia014d85d7e601d436ae9267df5988f4a6e962574

Changed in tripleo:
assignee: Dougal Matthews (d0ugal) → Ben Nemec (bnemec)
Changed in tripleo:
assignee: Ben Nemec (bnemec) → Dougal Matthews (d0ugal)
Changed in tripleo:
milestone: newton-rc1 → newton-rc2
Boris Derzhavets (bderzhavets) wrote :

For me, rebuilding openstack-tripleo-common-5.0.1-0.20160917031337.15c97e6.el7.centos.src.rpm
with the patch https://review.openstack.org/gitweb?p=openstack/tripleo-common.git;a=patch;h=203460176750aeda6c0a2d39ce349ad827053b11 does not fix the situation described in
https://bugs.launchpad.net/tripleo/+bug/1622720

Boris Derzhavets (bderzhavets) wrote :

I successfully ran the following on INSTACK:

  $ git clone https://github.com/openstack/tripleo-heat-templates
  $ git clone https://github.com/openstack-infra/tripleo-ci.git

  $ ./tripleo-ci/scripts/tripleo.sh --repo-setup
  $ ./tripleo-ci/scripts/tripleo.sh --undercloud
  $ source stackrc
  $ ./tripleo-ci/scripts/tripleo.sh --overcloud-images
  $ ./tripleo-ci/scripts/tripleo.sh --register-nodes
  $ ./tripleo-ci/scripts/tripleo.sh --introspect-nodes

  #!/bin/bash -x
   source /home/stack/stackrc
   openstack overcloud deploy \
    --libvirt-type qemu \
    --ntp-server pool.ntp.org \
    --templates /home/stack/tripleo-heat-templates \
    -e /home/stack/tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
    --control-scale 1 --compute-scale 2

and then issued:

  $ source stackrc
  $ openstack stack delete overcloud

I then applied the workaround from https://bugs.launchpad.net/tripleo/+bug/1622720/comments/1
to be able to redeploy (keeping the same number of overcloud nodes):

  $ mistral environment-delete overcloud
  $ swift delete --all

The workaround works for me and allows a successful redeploy, for instance:

 #!/bin/bash -x
 source /home/stack/stackrc
 openstack overcloud deploy \
  --libvirt-type qemu \
  --ntp-server pool.ntp.org \
  --templates /home/stack/tripleo-heat-templates \
  -e /home/stack/tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
  -e /home/stack/tripleo-heat-templates/environments/storage-environment.yaml \
  --control-scale 1 --compute-scale 1 --ceph-storage-scale 1
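
The workaround above can be wrapped in a small function. This is only a sketch: the `run` and `reset_plan` helpers and the `DRY_RUN` switch are hypothetical names introduced here, and it assumes the plan files live in a Swift container named after the plan. It deletes only that container, because `swift delete --all` removes every container in the account.

```shell
#!/bin/bash
# Hypothetical wrapper around the workaround. With DRY_RUN=1 the commands
# are printed instead of executed, so the sequence can be reviewed first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_plan() {
    local plan=${1:-overcloud}
    # Remove the Mistral environment that belongs to the plan.
    run mistral environment-delete "$plan"
    # Assumption: the plan's files are stored in a Swift container with
    # the same name as the plan. Only that container is deleted.
    run swift delete "$plan"
}
```

Calling `DRY_RUN=1 reset_plan overcloud` prints the two commands without touching Mistral or Swift.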

Reviewed: https://review.openstack.org/369486
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=a7b2e5eb058736c3b84684377e2d89bdcd0d784b
Submitter: Jenkins
Branch: master

commit a7b2e5eb058736c3b84684377e2d89bdcd0d784b
Author: Dougal Matthews <email address hidden>
Date: Tue Sep 13 14:25:22 2016 +0000

    Remove the environments from Mistral when removing from Swift

    When the new files are re-created the environments will be re-added
    based on the capabilities map and the user's choices.

    Partial-Bug: #1622683
    Depends-On: Ia014d85d7e601d436ae9267df5988f4a6e962574
    Change-Id: I1e1d6634663bf38fd21cb5f10f1422294321c5aa

Change abandoned by Dougal Matthews (<email address hidden>) on branch: master
Review: https://review.openstack.org/371468

Julie Pichon (jpichon) on 2016-09-20
tags: added: newton-backport-potential

Reviewed: https://review.openstack.org/371027
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=2d87e9640427f12eb5604f290fe38eae00f402ee
Submitter: Jenkins
Branch: master

commit 2d87e9640427f12eb5604f290fe38eae00f402ee
Author: Dougal Matthews <email address hidden>
Date: Fri Sep 16 09:15:43 2016 +0100

    Add template processing to the update plan workflow.

    This was recently added to the create workflow, but as update
    was added at the same time it wasn't included.

    Partial-Bug: #1622683
    Closes-Bug: #1622720
    Change-Id: I96ff13a00ac98bdfeb6605e9d300f8a44e6f82f1

Reviewed: https://review.openstack.org/373167
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=1748a854541230fa112aa725551dec7eebb9b47a
Submitter: Jenkins
Branch: stable/newton

commit 1748a854541230fa112aa725551dec7eebb9b47a
Author: Dougal Matthews <email address hidden>
Date: Tue Sep 13 14:25:22 2016 +0000

    Remove the environments from Mistral when removing from Swift

    When the new files are re-created the environments will be re-added
    based on the capabilities map and the user's choices.

    Partial-Bug: #1622683
    Depends-On: Ia014d85d7e601d436ae9267df5988f4a6e962574
    Change-Id: I1e1d6634663bf38fd21cb5f10f1422294321c5aa
    (cherry picked from commit a7b2e5eb058736c3b84684377e2d89bdcd0d784b)

tags: added: in-stable-newton

Reviewed: https://review.openstack.org/369247
Committed: https://git.openstack.org/cgit/openstack/instack-undercloud/commit/?id=b6b2b562f03bbaddfd8b6612fb3afeff20315397
Submitter: Jenkins
Branch: master

commit b6b2b562f03bbaddfd8b6612fb3afeff20315397
Author: Dougal Matthews <email address hidden>
Date: Tue Sep 13 09:49:07 2016 +0100

    Verify that the Deployment Plan creation was successful

    The creation of the default deployment plan was recently broken but the error
    went unnoticed as CI never failed. This then led to another obscure bug as
    something worked due only to the default plan not existing.

    This change waits for the plan creation to finish and verifies that it
    completed without an error. A timeout is added in case the creation
    ever stalls for some reason.

    NOTE: This will not raise an error if plan creation fails, since it is
          currently broken. This will be fixed in this following patch:
          https://review.openstack.org/373446

    Partial-Bug: #1622683
    Change-Id: Id8001fee677fe321b42892264e3c55afb6790e1c

Reviewed: https://review.openstack.org/374037
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=54d814d653e7e4217696b80cfc67f4272a92e969
Submitter: Jenkins
Branch: master

commit 54d814d653e7e4217696b80cfc67f4272a92e969
Author: Dougal Matthews <email address hidden>
Date: Wed Sep 21 11:01:40 2016 +0000

    Use the passed in workflow when creating or updating a plan

    This was previously changed by mistake in a bad merge which
    caused the CLI to always call the create plan workflow.
    See: I9d77880e82ec429a2ea340df87c89055a09e9720

    Partial-Bug: #1622683
    Change-Id: I8967b8badfeee1ba9ad473171103c7c37d6f7209
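
The shape of the regression described in this commit can be illustrated with a toy example. Everything here is hypothetical: `start_workflow` stands in for whatever actually submits a workflow, and `create_plan` / `update_plan` are placeholder names, not the real workflow identifiers.

```shell
#!/bin/bash
# Stand-in for submitting a workflow; it only reports what it would start.
start_workflow() {
    echo "starting $1 for plan $2"
}

# Shape of the regression: the workflow argument is accepted but ignored,
# so an update request still runs the create workflow.
plan_action_buggy() {
    local workflow=$1 plan=$2
    start_workflow "create_plan" "$plan"
}

# Shape of the fix: the requested workflow is passed through unchanged.
plan_action_fixed() {
    local workflow=$1 plan=$2
    start_workflow "$workflow" "$plan"
}
```

With these definitions, `plan_action_buggy update_plan overcloud` reports the create workflow, while `plan_action_fixed update_plan overcloud` reports the update workflow that was asked for.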

Reviewed: https://review.openstack.org/374201
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=b270ec99e5727c7d6c3f869c244eeaa3a42d7c31
Submitter: Jenkins
Branch: stable/newton

commit b270ec99e5727c7d6c3f869c244eeaa3a42d7c31
Author: Dougal Matthews <email address hidden>
Date: Wed Sep 21 11:01:40 2016 +0000

    Use the passed in workflow when creating or updating a plan

    This was previously changed by mistake in a bad merge which
    caused the CLI to always call the create plan workflow.
    See: I9d77880e82ec429a2ea340df87c89055a09e9720

    Partial-Bug: #1622683
    Change-Id: I8967b8badfeee1ba9ad473171103c7c37d6f7209
    (cherry picked from commit 54d814d653e7e4217696b80cfc67f4272a92e969)

Reviewed: https://review.openstack.org/371347
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=f768bb8c6c2752d78b6d02853447971c363005f0
Submitter: Jenkins
Branch: master

commit f768bb8c6c2752d78b6d02853447971c363005f0
Author: Dougal Matthews <email address hidden>
Date: Fri Sep 16 09:09:51 2016 +0100

    Fix the default plan creation

    The task plan_process_templates was calling the plan_set_status_failed action,
    but this action returns the result of the create_plan action, not the result
    of plan_process_templates.

    It depends on a change in instack-undercloud that will ensure that the
    workflow finishes before we continue to avoid hitting Mistral bug #1624284

    Partial-Bug: #1622683
    Depends-On: I8967b8badfeee1ba9ad473171103c7c37d6f7209
    Change-Id: Ic494ed0874dd79e148ef9c7093bdcf32bed4d6ae

Reviewed: https://review.openstack.org/373446
Committed: https://git.openstack.org/cgit/openstack/instack-undercloud/commit/?id=873d17c4837e3768110494b709745d2ff1af4853
Submitter: Jenkins
Branch: master

commit 873d17c4837e3768110494b709745d2ff1af4853
Author: Dougal Matthews <email address hidden>
Date: Tue Sep 20 17:09:48 2016 +0100

    Ensure that the default plan was created successfully

    In change Id8001fee677fe321b42892264e3c55afb6790e1c we wait for
    the workflow to finish, but don't verify that it didn't error as
    there is an issue with the default plan creation. There is a fix
    for this which this patch depends on.

    This change also expands on the error messages so users can more
    easily debug the problem.

    Closes-Bug: #1622683
    Depends-On: Ic494ed0874dd79e148ef9c7093bdcf32bed4d6ae
    Change-Id: Iaaf528372b57c19480c47fe22213b27b8ca871a7
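
The behaviour described across the two instack-undercloud changes (wait for the workflow, give up after a timeout, and fail on error) could be sketched as follows. This is not the code from those commits: `wait_for_plan` is a hypothetical helper, the status command is supplied by the caller, and the `SUCCESS` and `ERROR` state names are assumed to match what that command prints.

```shell
#!/bin/bash
# Hypothetical helper: poll a caller-supplied status command until it
# reports a terminal state or the timeout (in seconds) expires.
# Usage: wait_for_plan TIMEOUT INTERVAL COMMAND [ARGS...]
# Returns 0 on success, 1 on a reported error, 2 on timeout.
wait_for_plan() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    local state
    while (( SECONDS < deadline )); do
        state=$("$@")
        case "$state" in
            SUCCESS) return 0 ;;
            ERROR)
                echo "plan creation failed (state: $state)" >&2
                return 1
                ;;
        esac
        sleep "$interval"
    done
    echo "timed out after ${timeout}s waiting for plan creation" >&2
    return 2
}
```

Distinct return codes for failure and timeout let a CI script report which of the two happened, in the spirit of the expanded error messages mentioned in the commit.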

Changed in tripleo:
status: In Progress → Fix Released

This issue was fixed in the openstack/instack-undercloud 5.0.0.0rc2 release candidate.
