Ironic drivers that utilize temp files should verify they can actually create them

Bug #1428722 reported by Chris Krelle
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic
Fix Released
Low
Chris Krelle

Bug Description

Drivers that make use of temp files like the ipmitool driver should validate that a temp file can be created at startup to avoid errors like this:

2015-02-27 18:34:12.224 3119 WARNING ironic.drivers.modules.ipmitool [-] IPMI power status failed for node a8cb6624-0d9f-c882-affc-046ebb96ec01 with error: Failed to create the password file. Unexpected error while running command.
Command: ipmitool -I lanplus -H 192.168.122.1 -L ADMINISTRATOR -U root -R 12 -N 5 -f /tmp/tmpLNe1uk power status
Exit code: 1
Stdout: u''
Stderr: u'Error: Unable to establish IPMI v2 / RMCP+ session\nError: Unable to establish IPMI v2 / RMCP+ session\nError: Unable to establish IPMI v2 / RMCP+ session\nUnable to get Chassis Power Status\n'.
2015-02-27 18:34:12.225 3119 WARNING ironic.conductor.manager [-] During sync_power_state, could not get power state for node a8cb6624-0d9f-c882-affc-046ebb96ec01. Error: IPMI call failed: power status..
2015-02-27 18:34:12.231 3119 ERROR ironic.conductor.manager [-] During sync_power_state, max retries exceeded for node a8cb6624-0d9f-c882-affc-046ebb96ec01, node state None does not match expected state 'None'. Updating DB state to 'None' Switching node to maintenance mode.

Tags: driver
aeva black (tenbrae)
Changed in ironic:
status: New → Confirmed
importance: Undecided → Low
tags: added: driver
Changed in ironic:
assignee: nobody → Chris Krelle (nobodycam)
status: Confirmed → In Progress
Revision history for this message
John L. Villalovos (happycamp) wrote :

I think this bug is partially because of a try block which is too big. I have submitted a patch for that here:
https://review.openstack.org/#/c/161803/

Because of that the error message is misleading with:
error: Failed to create the password file. Unexpected error while running command.

I don't believe actually failed to create the password file. What happened was the call to ipmitool returned 1 and raised an exception which was then caught by the try block in the make_password_file function

Revision history for this message
John L. Villalovos (happycamp) wrote :

I think this bug is actually that a ProcessExecutionError exception occurred because the call to ipmitool returned 1.

The exception was caught by _make_password_file() because it yielded the path name. But the yield was inside a try block that any exception caught in that try block would be raised as a PasswordFileFailedToCreate exception.

We received the error:
error: Failed to create the password file. Unexpected error while running command.

"Failed to create the password file." from PasswordFileFailedToCreate exception
"Unexpected error while running command." from ProcessExecutionError

oslo_concurrency/processutils.py: description = _("Unexpected error while running command.")

error: Failed to create the password file. Unexpected error while running command.
Command: ipmitool -I lanplus -H 192.168.122.1 -L ADMINISTRATOR -U root -R 12 -N 5 -f /tmp/tmpLNe1uk power status
Exit code: 1
Stdout: u''
Stderr: u'Error: Unable to establish IPMI v2 / RMCP+ session\nError: Unable to establish IPMI v2 / RMCP+ session\nError: Unable to establish IPMI v2 / RMCP+ session\nUnable to get Chassis Power Status\n'.

I think this patch:
https://review.openstack.org/161803

Will fix the issue so that when a ProcessExecutionError occurs in _make_password_file() it will now get raised as a ProcessExecutionError exception.

Revision history for this message
Robert Collins (lifeless) wrote :

I think this bug presupposes the solution. Tmp dir permissions can change at any point - preflight checks won't prevent these problems, though they may make it easier to diagnose in the common case.

I suggest just making it easier to diagnose in all cases by annotating (raise_from) the error with any extra information we have.

Changed in ironic:
assignee: Chris Krelle (nobodycam) → John L. Villalovos (happycamp)
Changed in ironic:
assignee: John L. Villalovos (happycamp) → Chris Krelle (nobodycam)
Changed in ironic:
assignee: Chris Krelle (nobodycam) → John L. Villalovos (happycamp)
Changed in ironic:
assignee: John L. Villalovos (happycamp) → Chris Krelle (nobodycam)
Changed in ironic:
assignee: Chris Krelle (nobodycam) → John L. Villalovos (happycamp)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/162672
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=ca81ce22c11159a55ac3f33927d2954c1bc53999
Submitter: Jenkins
Branch: master

commit ca81ce22c11159a55ac3f33927d2954c1bc53999
Author: John L. Villalovos <email address hidden>
Date: Mon Mar 9 09:24:33 2015 -0700

    Update unittests and use NamedTemporaryFile

    Update the unittests for _make_password_file to text exception handling.
    Also switch to using tempfile.NamedTemporaryFile() to create the
    temporary file.

    The _make_password_file function was corrected so that if an exception
    occurred during the 'yield' portion, it would not catch the error and
    convert it into a PasswordFileFailedToCreate exception. The unit tests
    are being updated to test this new behavior and also test for when it
    should raise a PasswordFileFailedToCreate exception.

    Also it was recommended to use tempfile.NamedTemporaryFile() instead of
    tempfile.mkstemp()

    Partial-Bug: 1428722
    Change-Id: Iedb6baa97133d2aa8f88a4aaf2945403766d7fd5

Changed in ironic:
assignee: John L. Villalovos (happycamp) → Chris Krelle (nobodycam)
Changed in ironic:
assignee: Chris Krelle (nobodycam) → John L. Villalovos (happycamp)
Changed in ironic:
assignee: John L. Villalovos (happycamp) → Chris Krelle (nobodycam)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/160383
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=017734bcd917a83392e5ef1633a49a2c6446663e
Submitter: Jenkins
Branch: master

commit 017734bcd917a83392e5ef1633a49a2c6446663e
Author: Chris Krelle <email address hidden>
Date: Mon Mar 2 07:44:51 2015 -0800

    Check temp dir is usable for ipmitool driver

    This patch adds directory check functions to common/utils.py.
    It adds a call to the new check functions in ipmitool init.
    So we can test the temp directory, upon startup to ensure that
    the directory is usable.

    These pre-flight checks do not change the need for valid
    and accurate error handling, They are to quickly inform
    an operator that there is a configuration error that will
    prevent the driver from performing as expected.

    Also adds a missing unit test for the ipmitool drivers power
    validate function "test_power_validate".

    Change-Id: Iefc75203ccde1d25fd91c9cb815d871b068255a2
    Partial-Bug: #1428722

Revision history for this message
Chris Krelle (nobodycam) wrote :

I believe the above should correct the bug.. Please reopen if this is found to be in error.

Changed in ironic:
status: In Progress → Fix Committed
Changed in ironic:
milestone: none → 4.0.0
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.