On rebuild failure, nodes are left unrecoverable

Bug #1354437 reported by Chris Jones
This bug affects 1 person
Affects: Ironic
Status: Fix Released
Importance: High
Assigned to: Chris Krelle
Milestone: 2014.2

Bug Description

If one calls nova rebuild on a given node and the process fails to complete writing a new image, the node is placed into the ERROR state and it is impossible to recover via the API.

Given the relatively long window between issuing a rebuild command and the process completing, it seems likely that a variety of site-specific faults could cause a transient inability to complete the rebuild. I think it would be better for operators if they had a way to recover from that failure using the API.

To that end, I think that Ironic should not refuse to even attempt a rebuild on a node that is in the ERROR state.

Chris Krelle (nobodycam)
Changed in ironic:
assignee: nobody → Chris Krelle (nobodycam)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/114281

Changed in ironic:
status: New → In Progress
Ruby Loo (rloo)
Changed in ironic:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/114281
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=8ca29ebb1a78108dcf602891ed42267bbe14df29
Submitter: Jenkins
Branch: master

commit 8ca29ebb1a78108dcf602891ed42267bbe14df29
Author: Chris Krelle <email address hidden>
Date: Thu Aug 14 08:52:31 2014 -0700

    Allow rebuild of node in ERROR and DEPLOYFAIL state

    This patch allows nodes in ERROR or DEPLOYFAIL state to be rebuilt. This
    allows operators have a chance of recovering a node should it go into
    an error state while deploying or rebuilding.

    Change-Id: Ia7d6b3b796357da83fdb4d40f92f18a502956aa2
    Closes-Bug: #1354437
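
The behavior described in the commit message can be illustrated with a minimal sketch. This is not the merged patch: the state strings, the `REBUILDABLE_STATES` set, and the `can_rebuild` helper are hypothetical names chosen for illustration, and the real Ironic code may structure the check differently. The only facts taken from this report are that nodes in the ERROR or DEPLOYFAIL state may now be rebuilt.

```python
# Illustrative sketch only; names and string values are assumptions,
# not copied from the Ironic source tree.
ACTIVE = "active"
ERROR = "error"
DEPLOYFAIL = "deploy failed"
DEPLOYING = "deploying"

# States from which a rebuild request is accepted. Before the fix, a node
# in ERROR was refused; the commit adds ERROR and DEPLOYFAIL.
REBUILDABLE_STATES = frozenset({ACTIVE, ERROR, DEPLOYFAIL})


def can_rebuild(provision_state):
    """Return True if a rebuild may be attempted from this state."""
    return provision_state in REBUILDABLE_STATES


if __name__ == "__main__":
    for state in (ACTIVE, ERROR, DEPLOYFAIL, DEPLOYING):
        print(state, can_rebuild(state))
```

Under this sketch, a node that failed partway through writing an image can be retried through the API, while a node that is still deploying is refused.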

Changed in ironic:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in ironic:
milestone: none → juno-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ironic:
milestone: juno-3 → 2014.2