Periodic tasks share instance-specific data in a class attribute

Bug #1275151 reported by Ben Nemec
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
oslo-incubator
Fix Released
High
Ben Nemec

Bug Description

On a fairly consistent basis, the periodic tasks test cases fail when run in parallel. For example:

2014-01-31 20:13:31.925 | FAIL: tests.unit.test_periodic.PeriodicTasksTestCase.test_called_twice
2014-01-31 20:13:31.925 | tags: worker-1
2014-01-31 20:13:31.926 | ----------------------------------------------------------------------
2014-01-31 20:13:31.926 | Empty attachments:
2014-01-31 20:13:31.926 | stderr
2014-01-31 20:13:31.926 | stdout
2014-01-31 20:13:31.926 |
2014-01-31 20:13:31.926 | pythonlogging:'': {{{
2014-01-31 20:13:31.926 | ERROR [openstack.common.periodic_task] Error during AService.crashit: urg
2014-01-31 20:13:31.927 | Traceback (most recent call last):
2014-01-31 20:13:31.927 | File "openstack/common/periodic_task.py", line 180, in run_periodic_tasks
2014-01-31 20:13:31.927 | task(self, context)
2014-01-31 20:13:31.927 | File "tests/unit/test_periodic.py", line 46, in crashit
2014-01-31 20:13:31.927 | raise AnException('urg')
2014-01-31 20:13:31.927 | AnException: urg
2014-01-31 20:13:31.927 | ERROR [openstack.common.periodic_task] Error during AService.crashit: urg
2014-01-31 20:13:31.928 | Traceback (most recent call last):
2014-01-31 20:13:31.928 | File "openstack/common/periodic_task.py", line 180, in run_periodic_tasks
2014-01-31 20:13:31.928 | task(self, context)
2014-01-31 20:13:31.928 | File "tests/unit/test_periodic.py", line 46, in crashit
2014-01-31 20:13:31.928 | raise AnException('urg')
2014-01-31 20:13:31.928 | AnException: urg
2014-01-31 20:13:31.928 | }}}
2014-01-31 20:13:31.929 |
2014-01-31 20:13:31.929 | Traceback (most recent call last):
2014-01-31 20:13:31.929 | File "tests/unit/test_periodic.py", line 68, in test_called_twice
2014-01-31 20:13:31.929 | self.assertTrue(len(serv.called) == 3)
2014-01-31 20:13:31.929 | File "/usr/lib/python2.7/unittest/case.py", line 420, in assertTrue
2014-01-31 20:13:31.929 | raise self.failureException(msg)
2014-01-31 20:13:31.929 | AssertionError: False is not true

I have spent a fair amount of time trying to track down the problem and have found that I can only reproduce this if I run the entire test_periodic.py file in parallel. Running just the failing set of tests or trying to narrow it down in any other way resulted in no failures.

The unit tests that are failing are not particularly good in the first place, so I'll probably go ahead and submit a change to improve them, but I still don't have a full solution to the parallel problem.

Ben Nemec (bnemec)
Changed in oslo:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Ben Nemec (bnemec) wrote :

I believe I have tracked this down to some data that is specific to an instance of a periodic task class but is stored in a class attribute so it leaks out to all other instances of the same class. I'm unsure whether this is a problem in actual use of this code, but it's unintuitive and wrong IMHO so I'm going to propose a change.

summary: - Periodic tasks unit tests fail when run in parallel
+ Periodic tasks share instance-specific data in a class attribute
Changed in oslo:
assignee: nobody → Ben Nemec (bnemec)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo-incubator (master)

Reviewed: https://review.openstack.org/70484
Committed: https://git.openstack.org/cgit/openstack/oslo-incubator/commit/?id=47c9d60657be4ad58afa3c7e3ef88885a595c4c3
Submitter: Jenkins
Branch: master

commit 47c9d60657be4ad58afa3c7e3ef88885a595c4c3
Author: Ben Nemec <email address hidden>
Date: Sat Feb 1 00:41:07 2014 +0000

    Don't share periodic_task instance data in a class attr

    The time of the last task run for a given periodic task was being
    stored in a class attribute, which meant all instances of a class
    shared the same last run time regardless of when they were
    actually run. This may not have been a problem in actual use of
    the code because I believe in practice periodic task classes
    are created only once, but it broke the unit tests because
    subsequent runs of a task might behave differently depending on
    whether a previous test had already run a task.

    To me, this behavior is unintuitive and wrong so it should be
    changed.

    This commit also fixes the test cases to properly verify whether
    the number of task runs expected actually happened. The improper
    call validation was the only reason these tests passed before.

    Change-Id: I78fc249fd33bc41f3fa8f301414bc60463cadb07
    Closes-Bug: #1275151

Changed in oslo:
status: Confirmed → Fix Committed
Changed in oslo:
milestone: none → icehouse-3
Thierry Carrez (ttx)
Changed in oslo:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo-incubator (master)

Reviewed: https://review.openstack.org/72071
Committed: https://git.openstack.org/cgit/openstack/oslo-incubator/commit/?id=0dbcae1fdfecaba1fea27559a238d58390c78ca2
Submitter: Jenkins
Branch: master

commit 0dbcae1fdfecaba1fea27559a238d58390c78ca2
Author: Zhongyue Luo <email address hidden>
Date: Sat Feb 8 16:29:35 2014 +0800

    Adds test condition in test_periodic

    There was no test case that covers a periodic_task not being called
    without "run_immediately" set. This patch also mocks time.time() due
    to the time gap between when the sciprts are loaded and when the tests
    are actually executed.

    Merged test_is_called and test_called_twice since the two cases overlap.

    Change-Id: I3385c8d313d58cbaad5c21347b2b7ce4a8aae6a7
    Related-Bug: #1275151
    Related-Bug: #1275152

Thierry Carrez (ttx)
Changed in oslo:
milestone: icehouse-3 → 2014.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.