Trying to power off nodes on registration is useless and may cause "node locked" problems

Bug #1638281 reported by Dmitry Tantsur
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Dmitry Tantsur

Bug Description

Currently we try to power off every node we enroll. This is useless, because neither ironic nor ironic-inspector looks at the power state. Even worse, powering off can take some time, which causes "node locked" errors when we then try to "manage" the newly enrolled nodes. I suggest we drop it and move powering off to the "provide" workflow.

Example: https://bugzilla.redhat.com/show_bug.cgi?id=1383627
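
The race described above can be illustrated with a toy model. This is only a sketch: `ToyNode`, `NodeLocked`, `register_old` and `register_new` are hypothetical names invented for illustration and are not part of ironic or tripleo-common. It assumes a slow power action keeps the node reserved, so that a "manage" request issued right afterwards is rejected.

```python
class NodeLocked(Exception):
    """Raised when a node is still reserved by an in-flight operation."""


class ToyNode:
    """Hypothetical stand-in for a node record; not the ironic API."""

    def __init__(self, name):
        self.name = name
        self.provision_state = "enroll"
        self.lock_ticks = 0  # time units the node remains reserved

    def _check_unlocked(self):
        if self.lock_ticks > 0:
            raise NodeLocked(f"node {self.name} is locked")

    def power_off(self, duration):
        # A slow power action keeps the node reserved for `duration` ticks.
        self._check_unlocked()
        self.lock_ticks = duration

    def manage(self):
        # "manage" needs the node to be free of other operations.
        self._check_unlocked()
        self.provision_state = "manageable"


def register_old(node):
    """Old behaviour: power off right after enrolling, then manage."""
    node.power_off(duration=3)
    node.manage()  # fails: the power-off still holds the node


def register_new(node):
    """Proposed behaviour: no power-off at registration time."""
    node.manage()
```

With this model, `register_old(ToyNode("a"))` raises `NodeLocked`, while `register_new(ToyNode("b"))` leaves the node in the "manageable" state.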

Dmitry Tantsur (divius)
affects: ironic-inspector → tripleo-common
Dmitry Tantsur (divius) wrote :
affects: tripleo-common → tripleo
Changed in tripleo:
status: Triaged → In Progress
Dmitry Tantsur (divius)
tags: added: newton-backport-potential tripleo-common
Changed in tripleo:
milestone: none → ocata-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/392148
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=fee42cebe2768bf34101bcd5ebae5deaebd92e68
Submitter: Jenkins
Branch: master

commit fee42cebe2768bf34101bcd5ebae5deaebd92e68
Author: Dmitry Tantsur <email address hidden>
Date: Tue Nov 1 14:01:04 2016 +0100

    Power off new nodes when making them available, not right after enrolling

    Doing it for newly enrolled nodes that didn't go through "manage" action
    is wrong, because we only validate power credentials during this action.
    Also, this power off request can lock nodes for substantial time, causing
    the following "manage" workflow to fail with "node locked" error.

    Actually, both ironic and ironic-inspector can safely handle nodes that are
    powered on. However, this may be confusing for users looking at node list.

    This patch moves powering off to the "provide" workflow with proper wait
    and error handling.

    Change-Id: Ie2e5baa10adf67148d335d6127ff1dfe45e91968
    Closes-Bug: #1638281
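
A minimal sketch of what "power off with proper wait and error handling" could look like follows. It is not the code from the patch (the actual change lives in the tripleo-common workflows). The client methods `set_power_state`, `get_power_state` and `set_provision_state`, the state strings, and the `FakeClient` helper are all assumptions made for illustration, as is the choice to power off before issuing the "provide" action.

```python
import time


class PowerOffFailed(Exception):
    """The node did not reach the powered-off state in time."""


def power_off_and_wait(client, node_id, attempts=10, interval=1.0,
                       sleep=time.sleep):
    """Request power off, then poll until the node reports it is off."""
    client.set_power_state(node_id, "off")  # hypothetical client call
    for _ in range(attempts):
        if client.get_power_state(node_id) == "power off":
            return
        sleep(interval)
    raise PowerOffFailed(f"node {node_id} is still on after {attempts} polls")


def provide(client, node_id, **wait_kwargs):
    """Power the node off, wait for completion, then make it available."""
    power_off_and_wait(client, node_id, **wait_kwargs)
    client.set_provision_state(node_id, "provide")  # hypothetical client call


class FakeClient:
    """Hypothetical in-memory client, used only to exercise the sketch."""

    def __init__(self, polls_until_off):
        self.polls_until_off = polls_until_off
        self.provisioned = []

    def set_power_state(self, node_id, target):
        self.target = target

    def get_power_state(self, node_id):
        # Report "power on" until the configured number of polls has passed.
        self.polls_until_off -= 1
        return "power off" if self.polls_until_off <= 0 else "power on"

    def set_provision_state(self, node_id, action):
        self.provisioned.append((node_id, action))
```

Because the wait finishes (or raises) before the next state change is requested, a slow power action cannot overlap with the following operation, and a node that never powers off surfaces as an explicit error instead of a "node locked" failure later on.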

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/394541

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/newton)

Reviewed: https://review.openstack.org/394541
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=5a7c0236373300f3f9f80a31853b393d07bac234
Submitter: Jenkins
Branch: stable/newton

commit 5a7c0236373300f3f9f80a31853b393d07bac234
Author: Dmitry Tantsur <email address hidden>
Date: Tue Nov 1 14:01:04 2016 +0100

    Power off new nodes when making them available, not right after enrolling

    Doing it for newly enrolled nodes that didn't go through "manage" action
    is wrong, because we only validate power credentials during this action.
    Also, this power off request can lock nodes for substantial time, causing
    the following "manage" workflow to fail with "node locked" error.

    Actually, both ironic and ironic-inspector can safely handle nodes that are
    powered on. However, this may be confusing for users looking at node list.

    This patch moves powering off to the "provide" workflow with proper wait
    and error handling.

    Change-Id: Ie2e5baa10adf67148d335d6127ff1dfe45e91968
    Closes-Bug: #1638281
    (cherry picked from commit fee42cebe2768bf34101bcd5ebae5deaebd92e68)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.4.0

This issue was fixed in the openstack/tripleo-common 5.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.5.0

This issue was fixed in the openstack/tripleo-common 5.5.0 release.

