Spurious race conditions detected white-/black-listing MAC addresses in dnsmasq PXE filter

Bug #1741035 reported by milan k
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic Inspector
Fix Released
High
Ruby Loo

Bug Description

The discovery job has following traces in the log:

Jan 02 16:15:14.042391 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.pxe_filter.dnsmasq [-] Failed to blacklist 52:54:00:46:0a:b6; retrying next periodic sync time
Jan 02 16:15:14.043413 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.pxe_filter.dnsmasq [-] Failed to blacklist 52:54:00:4e:c0:19; retrying next periodic sync time
Jan 02 16:17:06.035076 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.pxe_filter.dnsmasq [-] Failed to whitelist 52:54:00:46:0a:b6; retrying next periodic sync time
Jan 02 16:17:06.035439 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.pxe_filter.dnsmasq [-] Failed to whitelist 52:54:00:4e:c0:19; retrying next periodic sync time

This doesn't seem correct since there's just a single instance of ironic-inspector running.
Even worse, the MAC address won't be blacklisted after the node discovery:

Jan 02 16:18:20.481513 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.plugins.discovery [None req-b2543414-65ee-4c59-9795-34b06416b325 None None] No BMC address provided, discovered node will be created without ipmi address: NotFoundInCacheError: Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:4e:c0:19', u'52:54:00:46:0a:b6']}
Jan 02 16:18:21.423341 ubuntu-xenial-inap-mtl01-0001653315 ironic-inspector[27211]: WARNING ironic_inspector.pxe_filter.dnsmasq [None req-b2543414-65ee-4c59-9795-34b06416b325 None None] Failed to blacklist 52:54:00:46:0a:b6; retrying next periodic sync time

The patch that introduced this behaviour: I2f7b8d3172f375cf65e759c9b881fcf41649c2f0

Dmitry Tantsur (divius)
Changed in ironic-inspector:
status: New → Confirmed
importance: Undecided → High
Changed in ironic-inspector:
assignee: nobody → Julia Kreger (juliaashleykreger)
Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

This appears to actually be differing behavior vs python 2.7 and python 3.x.

https://github.com/python/cpython/blob/2.7/Objects/fileobject.c#L1820 basically always returns NoneType, however the code was written to expect a return value, which only works in python3.

Errors in python2 should become io errors, so the try/catch should still raise them without issues if we remove the python3 specific behavior.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-inspector (master)

Fix proposed to branch: master
Review: https://review.openstack.org/532579

Changed in ironic-inspector:
status: Confirmed → In Progress
Changed in ironic-inspector:
assignee: Julia Kreger (juliaashleykreger) → Ruby Loo (rloo)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-inspector (master)

Reviewed: https://review.openstack.org/532579
Committed: https://git.openstack.org/cgit/openstack/ironic-inspector/commit/?id=b88823a771b23db73ab9a42dbfb5ce436c1e6e9e
Submitter: Zuul
Branch: master

commit b88823a771b23db73ab9a42dbfb5ce436c1e6e9e
Author: Julia Kreger <email address hidden>
Date: Tue Jan 9 17:11:44 2018 -0800

    Fix Py2/Py3 differences in write locking code

    In Py2, a file object write method returns NoneType.

    In Py3, a file object write method returns a count of
    bytes written, which can be used to interpret success.

    As this code, at least for now, still needs to work
    on Py2 without raising what appears to be errors, we
    need to remove the assumption that data is returned
    upon write.

    Should a write fail, IOError is still raised in Py2,
    which means the existing exception handling should be
    sufficent.

    Also, added an explicit flush for before we release the file
    lock, just to ensure that the data is actually written out of
    python's buffers before the lock is released.

    Change-Id: I1cae8f1cd2f7da39600d72a84fe041ff0a97e580
    Closes-Bug: #1741035

Changed in ironic-inspector:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic-inspector 7.1.0

This issue was fixed in the openstack/ironic-inspector 7.1.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.