intermittent iboot command failures

Bug #1490760 reported by Dan Prince
This bug affects 1 person
Affects | Status       | Importance | Assigned to | Milestone
Ironic  | Fix Released | High       | Dan Prince  |

Bug Description

Not often, but more so recently, I see intermittent network issues when using Ironic with the iboot driver. The conductor log shows tracebacks like this:

----
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: task.driver.power.set_power_state(task, new_state)
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 135, in wrapper
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: return f(*args, **kwargs)
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/iboot.py", line 197, in set_power_state
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: _switch(driver_info, False)
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/iboot.py", line 99, in _switch
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: return conn.switch(relay_id, enabled)
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/iboot/iboot.py", line 240, in switch
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: return request.do_request()
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/iboot/iboot.py", line 74, in do_request
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: header = self._build_header()
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/iboot/iboot.py", line 48, in _build_header
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: self.interface.get_seq_num())
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: File "/usr/lib/python2.7/site-packages/iboot/iboot.py", line 196, in get_seq_num
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: self.seq_num += 1
Aug 31 17:41:43 instack.localdomain ironic-conductor[26808]: TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'

-----
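
The last frame of the traceback can be reproduced in isolation. The sketch below is only an illustration: `FakeInterface` is a hypothetical stand-in, not the real python-iboot class, and it assumes the sequence number was still `None` when `get_seq_num()` ran (for instance because an earlier exchange with the device did not complete). The library's internals are not shown in this report, so treat that cause as a guess.

```python
class FakeInterface(object):
    """Hypothetical stand-in for the iboot interface object."""

    def __init__(self):
        # Assumption: the sequence number was never populated, e.g. the
        # device did not answer an earlier request.
        self.seq_num = None

    def get_seq_num(self):
        # Same statement as the last frame of the traceback; it raises
        # TypeError for as long as seq_num is None.
        self.seq_num += 1
        return self.seq_num


def describe_failure():
    """Return the error message produced by the failing increment."""
    try:
        FakeInterface().get_seq_num()
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(describe_failure())
```

If that assumption holds, the TypeError is a symptom of a failed exchange rather than a bug in the increment itself, which would explain why it shows up only intermittently.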

The failures occur when calling either get_power_state or set_power_state on the iboot driver; both are affected occasionally.

I think a simple retry, like the one we have for other power drivers, would help avoid these sorts of issues entirely.
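
A retry of this kind might look roughly like the sketch below. It is not the code that was merged into Ironic. `call_with_retries` is a hypothetical helper; the parameter names `max_retry` and `retry_interval` are borrowed from the fix's commit message, and their exact meaning here (retries after the first attempt, seconds between attempts) is an assumption.

```python
import time


def call_with_retries(func, max_retry=3, retry_interval=1.0, sleep=time.sleep):
    """Call func() and retry on failure.

    Assumption: max_retry counts retries *after* the first attempt, so
    func is called at most max_retry + 1 times. retry_interval is the
    pause in seconds between attempts.
    """
    attempts = max_retry + 1
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == attempts:
                raise
            sleep(retry_interval)


if __name__ == "__main__":
    calls = []

    def flaky_switch():
        # Hypothetical stand-in for a power command that fails twice
        # before the device answers.
        calls.append(1)
        if len(calls) < 3:
            raise TypeError("transient failure")
        return True

    result = call_with_retries(flaky_switch, max_retry=3, retry_interval=0)
    print(result, len(calls))
```

Catching every `Exception` keeps the sketch short; a real driver would probably want to retry only on errors known to be transient, so that genuine configuration mistakes still fail fast.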

Dan Prince (dan-prince)
Changed in ironic:
assignee: nobody → Dan Prince (dan-prince)
importance: Undecided → High
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/219051

Changed in ironic:
status: Triaged → In Progress
Changed in ironic:
assignee: Dan Prince (dan-prince) → Lucas Alvares Gomes (lucasagomes)
Changed in ironic:
assignee: Lucas Alvares Gomes (lucasagomes) → Dan Prince (dan-prince)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/219051
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=3436c193c58ac7eb386d1af223f69cd999feafad
Submitter: Jenkins
Branch: master

commit 3436c193c58ac7eb386d1af223f69cd999feafad
Author: Dan Prince <email address hidden>
Date: Mon Aug 31 18:27:17 2015 -0400

    Add retry options to iBoot power driver

    This patch adds two new iBoot specific options to
    facilitate a max_retry and retry_interval around
    the internal conn.switch() and conn.get_relays()
    functions. This should help avoid provisioning
    failures if there is any sort of network blip
    or failure that may occasionally cause an iboot
    command to fail.

    Change-Id: I3b7813ae56dd3b18008f814cd6272d801dd6f274
    Closes-bug: #1490760

Changed in ironic:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in ironic:
milestone: none → 4.2.0
status: Fix Committed → Fix Released