Manual cleaning fails when using iDRAC driver

Bug #1691808 reported by Christopher Dearborn on 2017-05-18
This bug affects 2 people
Affects   Status         Importance   Assigned to            Milestone
Ironic    Fix Released   High         Christopher Dearborn

Bug Description

Performing RAID configuration deletion or creation via manual cleaning with the iDRAC driver frequently fails with the following error:

    ironic_client.node.wait_for_provision_state(node_uuid, 'manageable')
  File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 424, in wait_for_provision_state
    'actual': node.provision_state, 'error': node.last_error})
StateTransitionFailed: Node b2455185-e1df-4f11-9cdf-8202f93a826a failed to reach state manageable. It's in state cleaning, and has error: Node b2455185-e1df-4f11-9cdf-8202f93a826a failed step {u'interface': u'raid', u'step': u'delete_configuration'}: DRAC operation failed. Reason: Unfinished config jobs found: [Job(id='JID_951427449675', name='Export Configuration', start_time='NA', until_time='NA', message='Exporting System Configuration Profile XML file.', status='Running', percent_complete='10')]. Make sure they are completed before retrying.

To reproduce, delete the RAID configuration of a node using manual cleaning:

    clean_steps = [{'interface': 'raid', 'step': 'delete_configuration'}]
    ironic_client.node.set_provision_state(
        node_uuid,
        'clean',
        cleansteps=clean_steps)
    ironic_client.node.wait_for_provision_state(node_uuid, 'manageable')

Changed in ironic:
assignee: nobody → Christopher Dearborn (cdearbor)
Christopher Dearborn (cdearbor) wrote :

When the node is rebooted immediately following automatic cleaning, the iDRAC automatically creates and runs a config job called "Export Configuration". Until this job is completed, no other job can be created.

According to the iDRAC team, the correct fix for this is to wait until the iDRAC reports itself ready before attempting to create any config jobs.
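The "wait until ready" approach can be sketched as a simple polling loop. This is a minimal illustration, not the code from the actual fix: `wait_until_ready`, `NotReadyError`, the `is_ready` callable, and the default retry values are all hypothetical names and numbers chosen for the example.

```python
import time


class NotReadyError(Exception):
    """Raised when the controller does not report ready in time."""


def wait_until_ready(is_ready, retries=48, retry_delay=10, sleep=time.sleep):
    """Poll is_ready() until it returns True or the retries run out.

    is_ready is a hypothetical callable that asks the iDRAC whether it is
    ready to accept commands. Returns the number of failed polls that
    preceded the successful one.
    """
    for attempt in range(retries):
        if is_ready():
            return attempt
        # Do not sleep after the final failed poll.
        if attempt < retries - 1:
            sleep(retry_delay)
    raise NotReadyError(
        'Controller not ready after %d attempts' % retries)
```

A caller would run this check before creating any config job, so that the "Export Configuration" job has a chance to finish first.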

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/466086

Changed in ironic:
status: New → In Progress
Dmitry Tantsur (divius) on 2017-06-05
Changed in ironic:
importance: Undecided → High
tags: added: drac
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/466086
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=35f222c55deaa13c599de80eb9eaf0e4cbf8165e
Submitter: Jenkins
Branch: master

commit 35f222c55deaa13c599de80eb9eaf0e4cbf8165e
Author: Christopher Dearborn <email address hidden>
Date: Thu May 18 15:35:04 2017 -0400

    Wait until iDRAC is ready before out-of-band cleaning

    When out-of-band cleaning is initiated, the node is PXE booted and the
    ramdisk is loaded. After in-band cleaning completes, the node is
    rebooted. At that point, the iDRAC automatically creates and runs an
    "Export Configuration" job. Out-of-band cleaning then starts: either
    RAID configuration creation or deletion. If the export job has not
    finished by the time the driver tries to create the RAID job, the
    RAID job creation fails.

    This patch causes RAID configuration creation and deletion to wait
    until the iDRAC declares itself to be ready before proceeding with
    out-of-band cleaning. This ensures that the export job has completed
    before creating another job.

    Change-Id: I79faba2206b86288ae636c46468a8b2dc321f979
    Closes-Bug: 1691808
    Depends-On: I929deada3dda7b09a6f29033fff89d9b0382aef8
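
The failure mode in the commit message comes down to a job-queue check that rejects new work while any config job is unfinished. The sketch below illustrates that check under stated assumptions: `check_no_unfinished_jobs`, `UnfinishedJobsError`, and the `FINISHED_STATUSES` set are hypothetical and are not taken from the driver. Only the `Job` field names and the sample values come from the error message in the bug description.

```python
import collections

# Field names mirror the Job(...) repr shown in the reported error.
Job = collections.namedtuple(
    'Job', ['id', 'name', 'start_time', 'until_time', 'message',
            'status', 'percent_complete'])

# Assumption: statuses treated as finished. The real driver may differ.
FINISHED_STATUSES = ('Completed', 'Completed with Errors', 'Failed')


class UnfinishedJobsError(Exception):
    """Raised when the job queue still holds unfinished config jobs."""


def check_no_unfinished_jobs(jobs):
    """Raise UnfinishedJobsError if any job in the queue is unfinished."""
    unfinished = [job for job in jobs if job.status not in FINISHED_STATUSES]
    if unfinished:
        raise UnfinishedJobsError(
            'Unfinished config jobs found: %r. Make sure they are '
            'completed before retrying.' % (unfinished,))


# The export job from the bug report, still running right after the reboot.
export_job = Job(id='JID_951427449675', name='Export Configuration',
                 start_time='NA', until_time='NA',
                 message='Exporting System Configuration Profile XML file.',
                 status='Running', percent_complete='10')
```

Calling `check_no_unfinished_jobs([export_job])` raises, which corresponds to the reported failure. Once the job's status becomes `Completed`, the same call passes, which is why waiting for the iDRAC to report ready avoids the error.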

Changed in ironic:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/481318

OpenStack Infra (hudson-openstack) wrote : Related fix merged to ironic (master)

Reviewed: https://review.openstack.org/481318
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=e3753b3597e014203383e0c18a8d042caeea67a9
Submitter: Jenkins
Branch: master

commit e3753b3597e014203383e0c18a8d042caeea67a9
Author: Richard Pioso <email address hidden>
Date: Thu Jul 6 15:29:23 2017 -0400

    Revert "Wait until iDRAC is ready before out-of-band cleaning"

    The openstack/ironic project portion [0] of the fix to bug #1691808
    "Manual cleaning fails when using iDRAC driver" [1] is no longer needed.
    That is because the openstack/python-dracclient project has adopted
    library-wide the best practice that fix contains [2]. Therefore, the
    openstack/ironic portion of the fix can be removed.

    [0] https://review.openstack.org/#/c/466086/
    [1] https://bugs.launchpad.net/ironic/+bug/1691808
    [2] https://bugs.launchpad.net/python-dracclient/+bug/1697558

    This reverts commit 35f222c55deaa13c599de80eb9eaf0e4cbf8165e.

    Change-Id: I1d313b750f931125567d1ca1c4ef1bbeb998746c
    Closes-Bug: #1702195
    Depends-On: Ied659a4ee45b1dd55cd3a420301d866d52c838fb
    Related-Bug: #1691808
    Related-Bug: #1697558

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic 9.0.0

This issue was fixed in the openstack/ironic 9.0.0 release.
