openstack overcloud node clean reports all nodes were cleaned, but one node is stuck in clean_wait

Bug #1796293 reported by Derek Higgins
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | High | Derek Higgins |

Bug Description

From: https://bugzilla.redhat.com/show_bug.cgi?id=1631454

openstack overcloud node clean reports all nodes were cleaned, but one node is stuck in clean_wait

Environment:
python2-ironicclient-2.5.0-0.20180810135843.fb94fb8.el7ost.noarch
python2-ironic-inspector-client-3.3.0-0.20180810080932.53bf4e8.el7ost.noarch
puppet-ironic-13.3.1-0.20180831191239.61387eb.el7ost.noarch
instack-undercloud-9.3.1-0.20180831000258.e464799.el7ost.noarch

On a bare-metal (BM) setup, I ran:

(undercloud) [stack@hardprov-fx2-1 ~]$ openstack overcloud node clean --all-manageable
Waiting for messages on queue 'tripleo' with no timeout.

Cleaned 7 node(s)

(undercloud) [stack@hardprov-fx2-1 ~]$ openstack baremetal node list
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| 67188f50-6daa-415a-83b9-0965219f5e99 | controller-0 | None          | power off   | manageable         | False       |
| dabcceea-2cdf-405a-ac90-dce370ee296e | controller-1 | None          | power on    | clean wait         | False       |
| b5a15206-d9a3-4c46-8f99-7024c022a713 | controller-2 | None          | power off   | manageable         | False       |
| 908453f7-a02d-45de-8ea5-849625e6d47e | compute-0    | None          | power off   | manageable         | False       |
| 7865d6d6-b4c3-4e0b-af21-bd8b9709fe87 | ceph-0       | None          | power off   | manageable         | False       |
| a788aa01-bb73-4ed5-9a91-ab9505f4477c | ironic-0     | None          | power off   | manageable         | False       |
| 644d3223-207c-4c3a-a070-84c3b1c9f052 | ironic-1     | None          | power off   | manageable         | False       |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
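
Until a fixed release is installed, the "Cleaned 7 node(s)" message cannot be trusted on its own, so it may help to check the node states directly. The following is a minimal sketch, not part of the reported fix. It assumes the node list is exported with `openstack baremetal node list -f json` and that the JSON keys match the column headers in the table above. The helper name `nodes_not_in_state` and the embedded sample (three rows copied from the table) are illustrative only.

```python
import json

# Sample shaped like `openstack baremetal node list -f json` output.
# Assumption: the JSON keys match the table's column headers.
SAMPLE = json.dumps([
    {"Name": "controller-0", "Provisioning State": "manageable"},
    {"Name": "controller-1", "Provisioning State": "clean wait"},
    {"Name": "compute-0", "Provisioning State": "manageable"},
])


def nodes_not_in_state(nodes_json, expected="manageable"):
    """Return (name, state) pairs for nodes whose state differs from `expected`."""
    nodes = json.loads(nodes_json)
    return [
        (node["Name"], node["Provisioning State"])
        for node in nodes
        if node["Provisioning State"] != expected
    ]


if __name__ == "__main__":
    # Report every node that did not return to the expected state.
    for name, state in nodes_not_in_state(SAMPLE):
        print(f"{name}: {state}")
```

In practice the sample string would be replaced by the real command output, for example read from standard input.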

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/608260

Changed in tripleo:
assignee: nobody → Derek Higgins (derekh)
status: New → In Progress
Changed in tripleo:
importance: Undecided → High
milestone: none → stein-1
tags: added: rocky-backport-potential
tags: added: queens-backport-potential
Changed in tripleo:
milestone: stein-1 → stein-2
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/608260
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=70ed6378186dca5735f7ff8f51190de7eb7bf3ca
Submitter: Zuul
Branch: master

commit 70ed6378186dca5735f7ff8f51190de7eb7bf3ca
Author: Derek Higgins <email address hidden>
Date: Fri Oct 5 14:20:22 2018 +0100

    Fail node cleaning on timeout

    The Use of a retry with continue-on causes the task
    wait_for_provision_state to finish in success. We need another
    task to test the provisioning state and conditionally fail
    based on that.

    Closes-Bug: #1796293
    Change-Id: I94fe438a05c3d20b927f9fe1bc8cc3ea10d85f1e
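
The pattern described in the commit message can be illustrated outside the workflow engine. The sketch below is plain Python, not the actual tripleo-common workflow definition. It assumes a polling step that, like a retry with continue-on, always finishes successfully once its retries are used up, followed by a separate check that fails if the target state was never reached. The names `clean_node`, `CleanTimeout`, and the `get_state` callable are hypothetical; only `wait_for_provision_state` is taken from the commit message.

```python
import time


class CleanTimeout(Exception):
    """Raised when a node never reaches the target provisioning state."""


def wait_for_provision_state(get_state, target, retries, delay=0.0, sleep=time.sleep):
    """Poll until `target` is seen or retries run out.

    Like a retry that continues on a non-matching state, this step never
    fails by itself: it simply returns the last state it observed.
    """
    state = get_state()
    for _ in range(retries):
        if state == target:
            break
        sleep(delay)
        state = get_state()
    return state


def clean_node(get_state, target="manageable", retries=3):
    """Wait for the node, then fail explicitly if the wait timed out."""
    state = wait_for_provision_state(get_state, target, retries, sleep=lambda _: None)
    # Separate check: without it, a timed-out wait would look like success.
    if state != target:
        raise CleanTimeout(f"node still in '{state}' after {retries} retries")
    return state
```

Without the final check in `clean_node`, a node that stays in "clean wait" would be counted as cleaned, which matches the behaviour reported in this bug.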

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/628968

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/628971

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/rocky)

Reviewed: https://review.openstack.org/628968
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=0e1368bfedf82b20dfb881868a89652cbfd83163
Submitter: Zuul
Branch: stable/rocky

commit 0e1368bfedf82b20dfb881868a89652cbfd83163
Author: Derek Higgins <email address hidden>
Date: Fri Oct 5 14:20:22 2018 +0100

    Fail node cleaning on timeout

    The Use of a retry with continue-on causes the task
    wait_for_provision_state to finish in success. We need another
    task to test the provisioning state and conditionally fail
    based on that.

    Closes-Bug: #1796293
    Change-Id: I94fe438a05c3d20b927f9fe1bc8cc3ea10d85f1e
    (cherry picked from commit 70ed6378186dca5735f7ff8f51190de7eb7bf3ca)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 10.3.0

This issue was fixed in the openstack/tripleo-common 10.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/queens)

Reviewed: https://review.openstack.org/628971
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=ae2d4b71d506a226e7268f216bc6b9c3e0168233
Submitter: Zuul
Branch: stable/queens

commit ae2d4b71d506a226e7268f216bc6b9c3e0168233
Author: Derek Higgins <email address hidden>
Date: Fri Oct 5 14:20:22 2018 +0100

    Fail node cleaning on timeout

    The Use of a retry with continue-on causes the task
    wait_for_provision_state to finish in success. We need another
    task to test the provisioning state and conditionally fail
    based on that.

    Closes-Bug: #1796293
    Change-Id: I94fe438a05c3d20b927f9fe1bc8cc3ea10d85f1e
    (cherry picked from commit 70ed6378186dca5735f7ff8f51190de7eb7bf3ca)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 8.6.7

This issue was fixed in the openstack/tripleo-common 8.6.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 9.5.0

This issue was fixed in the openstack/tripleo-common 9.5.0 release.
