Unable to delete overcloud node

Bug #1626736 reported by James Slagle
This bug affects 3 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Ryan Brady

Bug Description

Description of problem:
Deleting an overcloud node fails with:
Two objects are equal when all of the attributes are equal, if you want to identify whether two objects are same one with same id, please use is_same_obj() function.
<urlopen error [Errno 2] No such file or directory: '/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml'>

Version-Release number of selected component (if applicable):
python-tripleoclient-5.0.0-0.20160907170033.b0d7ce7.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
--control-scale 3 \
--control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 \
--compute-scale 1 \
--compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 \
--ceph-storage-scale 1 \
--ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 \
--ntp-server clock.redhat.com

2. Scale out with an additional compute node:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
--control-scale 3 \
--control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 \
--compute-scale 2 \
--compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 \
--ceph-storage-scale 1 \
--ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 \
--ntp-server clock.redhat.com

3. Delete one compute node:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud node delete --stack overcloud --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
6fc44adf-9a46-41ad-af33-c623011e1457

Actual results:
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
deleting nodes [u'6fc44adf-9a46-41ad-af33-c623011e1457'] from stack overcloud
Two objects are equal when all of the attributes are equal, if you want to identify whether two objects are same one with same id, please use is_same_obj() function.
<urlopen error [Errno 2] No such file or directory: '/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml'>

Expected results:
The node gets deleted.

James Slagle (james-slagle) wrote :

The message:
 Two objects are equal when all of the attributes are equal, if you want to identify whether two objects are same one with same id, please use is_same_obj() function.
is strange, but it looks like it's just a warning from python-heatclient:

https://github.com/openstack/python-heatclient/blob/master/heatclient/openstack/common/apiclient/base.py#L526
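For readers unfamiliar with the distinction the warning draws, here is a minimal sketch of the pattern. The Resource class below is a toy stand-in written for illustration, not the python-heatclient code; only the is_same_obj() name is taken from the warning text itself.

```python
import logging

LOG = logging.getLogger(__name__)


class Resource:
    """Toy stand-in for an API client resource (not the heatclient class)."""

    def __init__(self, id, **attrs):
        self.id = id
        self._info = dict(attrs)

    def is_same_obj(self, other):
        """Identity check: same type and same id, attributes ignored."""
        return isinstance(other, self.__class__) and self.id == other.id

    def __eq__(self, other):
        """Value check: emits a warning, then compares every attribute."""
        LOG.warning("Two objects are equal when all of the attributes are "
                    "equal; use is_same_obj() to compare by id.")
        if not isinstance(other, self.__class__):
            return NotImplemented
        return self.id == other.id and self._info == other._info


a = Resource("6fc44adf", status="ACTIVE")
b = Resource("6fc44adf", status="DELETING")
print(a == b)            # False: same id, but the attributes differ
print(a.is_same_obj(b))  # True: only the id is compared
```

In this sketch the warning fires on every == comparison, which would explain why it shows up in the output without being the cause of the failure.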

I gather the actual failure is:
 <urlopen error [Errno 2] No such file or directory: '/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml'>
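The shape of that error can be reproduced outside of tripleo. The sketch below uses Python 3's urllib.request (the report itself comes from a Python 2 environment) and a hypothetical helper, try_open_template, that opens a local file through a file:// URL. It only illustrates how a missing template path surfaces as a urlopen error; it is not the code path the client uses.

```python
import pathlib
import urllib.error
import urllib.request


def try_open_template(path):
    """Open a local file via a file:// URL and return (ok, message)."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    url = pathlib.Path(path).resolve().as_uri()
    try:
        with urllib.request.urlopen(url) as handle:
            handle.read()
        return True, "ok"
    except urllib.error.URLError as exc:
        # A missing file is wrapped by urllib as "<urlopen error ...>".
        return False, str(exc)


ok, message = try_open_template(
    "/usr/share/openstack-tripleo-heat-templates/"
    "overcloud-without-mergepy.yaml")
print(ok, message)
```

On a machine where the file does not exist, the message carries the same "urlopen error [Errno 2]" wrapper seen in the report.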

Changed in tripleo:
importance: Undecided → Critical
milestone: none → newton-rc2
assignee: nobody → Carlos Camacho (ccamacho)
status: New → Confirmed
Ryan Brady (rbrady)
tags: added: workflows
Jason E. Rist (jason-rist) wrote :

This also affects the UI.

tags: added: ui
removed: workflows
James Slagle (james-slagle) wrote :

In general, the scale_down function in scale.py from tripleo_common needs to be updated to use the templates from the plan, and the logic needs to move into a workflow so that the scale down functionality is also available from the API/UI.

Changed in tripleo:
assignee: Carlos Camacho (ccamacho) → Ryan Brady (rbrady)
tags: added: workflows
James Slagle (james-slagle) wrote :

I fixed the TEMPLATE_NAME variable in tripleo_common/constants.py, downloaded my plan from swift, and pointed the --templates arg of the delete command at the plan directory. The node delete then finished successfully.

That's good: we should be able to just move this existing functionality into a workflow.

However, the command gives no feedback about the stack update that removes the node; it exits immediately. That could be improved to show the typical polling output.
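As an illustration of the polling output mentioned above, here is a minimal sketch of a wait loop. The function wait_for_stack and its get_status callable are hypothetical names introduced for this example; get_status is assumed to return a Heat-style status string such as UPDATE_IN_PROGRESS, UPDATE_COMPLETE or UPDATE_FAILED. How the status is actually fetched is left out on purpose.

```python
import time


def wait_for_stack(get_status, interval=5.0, timeout=3600.0,
                   sleep=time.sleep, report=print):
    """Poll get_status() until the stack leaves an IN_PROGRESS state.

    Returns True if the final status ends in _COMPLETE, False otherwise.
    Raises TimeoutError if the stack is still in progress after `timeout`.
    """
    waited = 0.0
    last = None
    while True:
        status = get_status()
        if status != last:
            # Only report changes, so the output stays short.
            report("stack status: %s" % status)
            last = status
        if not status.endswith("_IN_PROGRESS"):
            return status.endswith("_COMPLETE")
        if waited >= timeout:
            raise TimeoutError(
                "stack still %s after %.0fs" % (status, waited))
        sleep(interval)
        waited += interval
```

The sleep and report parameters are injectable so the loop can be exercised without real delays, for example by passing a no-op sleep and a list's append method.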

Changed in tripleo:
status: Confirmed → Triaged
Changed in tripleo:
milestone: newton-rc2 → ocata-1
tags: added: newton-backport-potential
Changed in tripleo:
milestone: ocata-1 → newton-rc3
James Slagle (james-slagle) wrote :

According to jrist, it is not a strict requirement from the UI's perspective that we have an API/workflow for this for newton.

However, it does need to work with the new plan workflows. So, when nodes are deleted, the plan needs to be updated in swift accordingly, and the stack-update done using the plan.

This is especially important when considering a stack-update after a scale down: we don't want a subsequent stack-update to scale the stack back up to the previous <Role>Count parameter values. So, it's important that the plan is updated based on the scale down operation and remains the source of truth for the templates and environment files for the deployed stack.
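To make the <Role>Count concern concrete, here is a rough sketch of the bookkeeping involved. It assumes the plan environment is available as a dict shaped like a Heat environment file, with the counts stored under parameter_defaults; the function name scale_down_environment is hypothetical, and reading from or writing back to swift is not shown.

```python
import copy


def scale_down_environment(environment, role, removed_nodes):
    """Return a copy of the environment with <role>Count reduced.

    The count drops by the number of distinct node ids removed, so a
    later stack-update does not scale the role back up.
    """
    key = "%sCount" % role
    removed = len(set(removed_nodes))
    updated = copy.deepcopy(environment)
    params = updated.setdefault("parameter_defaults", {})
    current = int(params.get(key, 0))
    if removed > current:
        raise ValueError(
            "cannot remove %d nodes from %s=%d" % (removed, key, current))
    params[key] = current - removed
    return updated


plan_env = {"parameter_defaults": {"ControllerCount": 3, "ComputeCount": 2}}
new_env = scale_down_environment(
    plan_env, "Compute", ["6fc44adf-9a46-41ad-af33-c623011e1457"])
print(new_env["parameter_defaults"])
# {'ControllerCount': 3, 'ComputeCount': 1}
```

Returning a copy rather than mutating the input keeps the previous plan state available until the update has actually been stored.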

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/382707

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/383731

Changed in tripleo:
assignee: Ryan Brady (rbrady) → Dougal Matthews (d0ugal)
Changed in tripleo:
assignee: Dougal Matthews (d0ugal) → Ryan Brady (rbrady)
Changed in tripleo:
assignee: Ryan Brady (rbrady) → Dougal Matthews (d0ugal)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/382707
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=28edd93db62e79bc6916b6bd8f4fc67d99c48dd1
Submitter: Jenkins
Branch: master

commit 28edd93db62e79bc6916b6bd8f4fc67d99c48dd1
Author: Ryan Brady <email address hidden>
Date: Thu Oct 6 07:22:46 2016 -0400

    Port Scale Down Functionality into Workflow

    This patch ports the logic of the ScaleDownManager into an action
    and an associated workflow to expose it.

    Partial-Bug: #1626736
    Co-Authored-By: Dougal Matthews <email address hidden>
    Change-Id: Ia31a03c8f0a99dd92b18ffbf76d689b037ef8a78

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/387678

Changed in tripleo:
milestone: newton-rc3 → ocata-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/newton)

Reviewed: https://review.openstack.org/387678
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=f3bdd365eb176f2ecf9971aee1e4d7d04681d60f
Submitter: Jenkins
Branch: stable/newton

commit f3bdd365eb176f2ecf9971aee1e4d7d04681d60f
Author: Ryan Brady <email address hidden>
Date: Thu Oct 6 07:22:46 2016 -0400

    Port Scale Down Functionality into Workflow

    This patch ports the logic of the ScaleDownManager into an action
    and an associated workflow to expose it.

    Partial-Bug: #1626736
    Co-Authored-By: Dougal Matthews <email address hidden>
    Change-Id: Ia31a03c8f0a99dd92b18ffbf76d689b037ef8a78
    (cherry picked from commit 28edd93db62e79bc6916b6bd8f4fc67d99c48dd1)

tags: added: in-stable-newton
Changed in tripleo:
assignee: Dougal Matthews (d0ugal) → Ryan Brady (rbrady)
OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/383731
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=a85ad62d853c686c04c38388fb880c8aa5572021
Submitter: Jenkins
Branch: master

commit a85ad62d853c686c04c38388fb880c8aa5572021
Author: Ryan Brady <email address hidden>
Date: Thu Oct 6 14:04:58 2016 -0400

    Use workflow for overcloud node delete

    This patch adds support for using a workflow in tripleo-common
    for deleting nodes from a stack (scale down).

    Change-Id: Ia65734273d70ea0ae30d96122728950e1f0217b8
    Partial-Bug: #1626736

OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/390590

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (stable/newton)

Reviewed: https://review.openstack.org/390590
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=3ccb9d2605f84f2b0b02eb93aafb3eb662bcd29f
Submitter: Jenkins
Branch: stable/newton

commit 3ccb9d2605f84f2b0b02eb93aafb3eb662bcd29f
Author: Ryan Brady <email address hidden>
Date: Thu Oct 6 14:04:58 2016 -0400

    Use workflow for overcloud node delete

    This patch adds support for using a workflow in tripleo-common
    for deleting nodes from a stack (scale down).

    Change-Id: Ia65734273d70ea0ae30d96122728950e1f0217b8
    Partial-Bug: #1626736
    (cherry picked from commit a85ad62d853c686c04c38388fb880c8aa5572021)

Ryan Brady (rbrady)
Changed in tripleo:
status: In Progress → Fix Committed
Changed in tripleo:
status: Fix Committed → Fix Released