Intermittent ipset_manager test failures in Kilo

Bug #1442377 reported by Brian Haley
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Undecided
Brian Haley

Bug Description

It was brought to my attention that there is an intermittent ipset_manager test failure in the latest Kilo code, most likely caused by https://review.openstack.org/#/c/170328/

This is one from http://logs.openstack.org/46/149546/39/check/gate-neutron-python27/f903ce3/console.html :

2015-04-09 18:15:35.721 | ==============================
2015-04-09 18:15:35.721 | Failed 1 tests - output below:
2015-04-09 18:15:35.721 | ==============================
2015-04-09 18:15:35.721 |
2015-04-09 18:15:35.722 | neutron.tests.unit.agent.linux.test_ipset_manager.IpsetManagerTestCaseHashArgs.test_set_members_adding_more_than_5
2015-04-09 18:15:35.722 | ------------------------------------------------------------------------------------------------------------------
2015-04-09 18:15:35.722 |
2015-04-09 18:15:35.722 | Captured traceback:
2015-04-09 18:15:35.722 | ~~~~~~~~~~~~~~~~~~~
2015-04-09 18:15:35.722 | Traceback (most recent call last):
2015-04-09 18:15:35.722 | File "neutron/tests/unit/agent/linux/test_ipset_manager.py", line 135, in test_set_members_adding_more_than_5
2015-04-09 18:15:35.723 | self.verify_mock_calls()
2015-04-09 18:15:35.723 | File "neutron/tests/unit/agent/linux/test_ipset_manager.py", line 43, in verify_mock_calls
2015-04-09 18:15:35.723 | self.execute.assert_has_calls(self.expected_calls, any_order=False)
2015-04-09 18:15:35.723 | File "/home/jenkins/workspace/gate-neutron-python27/.tox/py27/local/lib/python2.7/site-packages/mock.py", line 863, in assert_has_calls
2015-04-09 18:15:35.723 | 'Actual: %r' % (calls, self.mock_calls)
2015-04-09 18:15:35.723 | AssertionError: Calls not found.
2015-04-09 18:15:35.724 | Expected: [call(['ipset', 'create', '-exist', 'IPv4fake_sgid', 'hash:ip', 'family', 'inet', 'hashsize', '2048', 'maxelem', '131072'], run_as_root=True, process_input=None), call(['ipset', 'restore', '-exist'], run_as_root=True, process_input='create IPv4fake_sgid-new hash:ip family inet hashsize 2048 maxelem 131072\nadd IPv4fake_sgid-new 10.0.0.1'), call(['ipset', 'swap', 'IPv4fake_sgid-new', 'IPv4fake_sgid'], run_as_root=True, process_input=None), call(['ipset', 'destroy', 'IPv4fake_sgid-new'], run_as_root=True, process_input=None), call(['ipset', 'restore', '-exist'], run_as_root=True, process_input='create IPv4fake_sgid-new hash:ip family inet hashsize 2048 maxelem 131072\nadd IPv4fake_sgid-new 10.0.0.1\nadd IPv4fake_sgid-new 10.0.0.2\nadd IPv4fake_sgid-new 10.0.0.3'), call(['ipset', 'swap', 'IPv4fake_sgid-new', 'IPv4fake_sgid'], run_as_root=True, process_input=None), call(['ipset', 'destroy', 'IPv4fake_sgid-new'], run_as_root=True, process_input=None)]
2015-04-09 18:15:35.724 | Actual: [call(['ipset', 'create', '-exist', 'IPv4fake_sgid', 'hash:ip', 'family', 'inet', 'hashsize', '2048', 'maxelem', '131072'], run_as_root=True, process_input=None),
2015-04-09 18:15:35.724 | call(['ipset', 'restore', '-exist'], run_as_root=True, process_input='create IPv4fake_sgid-new hash:ip family inet hashsize 2048 maxelem 131072\nadd IPv4fake_sgid-new 10.0.0.1'),
2015-04-09 18:15:35.724 | call(['ipset', 'swap', 'IPv4fake_sgid-new', 'IPv4fake_sgid'], run_as_root=True, process_input=None),
2015-04-09 18:15:35.724 | call(['ipset', 'destroy', 'IPv4fake_sgid-new'], run_as_root=True, process_input=None),
2015-04-09 18:15:35.724 | call(['ipset', 'add', '-exist', 'IPv4fake_sgid', '10.0.0.3'], run_as_root=True, process_input=None),
2015-04-09 18:15:35.725 | call(['ipset', 'add', '-exist', 'IPv4fake_sgid', '10.0.0.2'], run_as_root=True, process_input=None)]

I think the fix for this is to just move the self.ipset initializations from the base class into the sub-classes so they operate on their own data independently of each other. Patch forthcoming.

Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/172223

Changed in neutron:
status: New → In Progress
Revision history for this message
Brian Haley (brian-haley) wrote :

One easy way to reproduce this is to run with a smaller number of concurrent testr workers. I just hacked tools/pretty_tox.sh to have "--concurrency=2" right after "--subnunit" and it happens every time.

It does not happen if I back-up my tree to right before the test change to add "hash args" was added, but that's always (typically) running the tests in the order they appear in the file, where when the second test is there they are definitely interleaved more.

Revision history for this message
Brian Haley (brian-haley) wrote :

So I put a few debug statements in test_set_members_adding_more_than_5() to print-out the set, it's current IPs (after first was added), and the list of IPs we're going to add:

Captured stdout:
~~~~~~~~~~~~~~~~
    set: IPv4fake_sgid
    ip: 10.0.0.1
    FAKE_IPS: ['10.0.0.1', '10.0.0.2', '10.0.0.3']

That last one should have had 6 IP addresses in it as at the top of the file:

FAKE_IPS = ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4',
            '10.0.0.5', '10.0.0.6']

So that's the problem, but why is that global data being modified?

Revision history for this message
Brian Haley (brian-haley) wrote :

Making that global an instance variable is the fix, I'll update the patch.

Changed in neutron:
assignee: Brian Haley (brian-haley) → Aman Kumar (amank)
Changed in neutron:
assignee: Aman Kumar (amank) → Brian Haley (brian-haley)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/172223
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=40a1f410ff45ce129c08da0cd071020c7ea338af
Submitter: Jenkins
Branch: master

commit 40a1f410ff45ce129c08da0cd071020c7ea338af
Author: Brian Haley <email address hidden>
Date: Thu Apr 9 17:48:40 2015 -0400

    Fix intermittent ipset_manager test failure

    Change ipset_manager _refresh_set() to make a copy of the list of
    IPs when creating a set, instead of using a reference, else any
    change to the set could update the caller's data.

    Also made the IpsetManagerTestCase classes always pass maxelem and
    hashsize to the parent class.

    Change-Id: I45fc716ab0952b80363b0c7dabae29cda05604dc
    Closes-bug: #1442377

Changed in neutron:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (neutron-pecan)

Fix proposed to branch: neutron-pecan
Review: https://review.openstack.org/185072

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-1 → 7.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.