Unit tests randomly consume all available memory

Bug #1300420 reported by David Shrewsbury
This bug affects 4 people
Affects: Ironic
Status: Fix Released
Importance: Undecided
Assigned to: Adam Gandelman
Milestone: (none shown)

Bug Description

Others and I have seen random occurrences of the Ironic unit tests (py26 and py27) consuming all available memory before failing. This has been seen both on developer test machines and in Jenkins:

http://logs.openstack.org/05/83105/3/check/gate-ironic-python26/dab25e8/

This seemed to start happening after I recently rebuilt the venv (with 'tox -r') during a test run. When this happens, you will see this process slowly eating up all available memory:

python -m subunit.run discover -t ./ ./ --load-list /tmp/tmpXXXXX

A discussion with the -infra folks led me to a similar issue that the Neutron team recently had:

http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2014-02-20.log
(see timestamp 2014-02-20T19:14:25)

David Shrewsbury (dshrews) wrote :

I'm able to reproduce by simply running this from the top-level ironic directory:

discover -t ./ -v

Lucas Alvares Gomes (lucasagomes) wrote :
Adam Gandelman (gandelman-a) wrote :

I am able to reproduce this locally as well, using either discover or nosetests. Looking into it now: gdb shows it hitting a logging deadlock while running periodic tasks for the conductor.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/86473

Changed in ironic:
assignee: nobody → Adam Gandelman (gandelman-a)
status: New → In Progress
Adam Gandelman (gandelman-a) wrote :

The conductor manager tests run early and currently start 55 new services without cleaning up any of them. This results in runaway periodic tasks firing later in the test suite, which cause the memory consumption and the deadlock. By the time I hit the test suite locking up, ConductorManager.periodic_tasks() had run 562571 times and was still growing quickly. :) The proposed patch limits that to one run per service started and fixes the issue for me.
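
For readers unfamiliar with this failure mode, here is a minimal sketch of the general cleanup pattern. It is not the actual Ironic patch: FakeService, ServiceTestCase, start_service and run_demo are hypothetical names invented for illustration, and a plain thread stands in for the conductor's periodic tasks. The idea is only that every service a test starts gets a registered cleanup that stops it, so its periodic loop cannot keep running into later tests.

```python
import threading
import unittest


class FakeService:
    """Hypothetical stand-in for a service that runs a periodic task."""

    def __init__(self, interval=0.01):
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self.runs = 0

    def _loop(self):
        # Event.wait() returns True once stop() sets the event, ending the loop.
        while not self._stop.wait(self._interval):
            self.runs += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def is_running(self):
        return self._thread.is_alive()


class ServiceTestCase(unittest.TestCase):
    started = []  # kept only so the demo can inspect services afterwards

    def start_service(self):
        svc = FakeService()
        svc.start()
        # Register the stop immediately, so it runs even if the test fails.
        self.addCleanup(svc.stop)
        ServiceTestCase.started.append(svc)
        return svc

    def test_service_runs(self):
        svc = self.start_service()
        self.assertTrue(svc.is_running())


def run_demo():
    """Run the test case and return the result plus the services it started."""
    ServiceTestCase.started = []
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ServiceTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result, ServiceTestCase.started
```

Calling run_demo() and then checking is_running() on each returned service should show that nothing is left running once the test has finished. Without the addCleanup call, the thread would keep incrementing its counter for the rest of the test run, which is the kind of leak described above.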

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/86473
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=a4cf03c7e66a829400d69da295f54f5c2beb6812
Submitter: Jenkins
Branch: master

commit a4cf03c7e66a829400d69da295f54f5c2beb6812
Author: Adam Gandelman <email address hidden>
Date: Wed Apr 9 16:50:05 2014 -0700

    Cleanup running conductor services in tests

    The conductor manager tests are not stopping services that are
    started, instead relying strictly on database cleanup. This is
    causing conductor periodic tasks to bleed into other test cases
    and cause occasional unrelated deadlocks in logging later on during
    test runs.

    Change-Id: I7502df5dec7c42fe1a20bebad8f9ad393572d17d
    Closes-bug: #1300420

Changed in ironic:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in ironic:
milestone: none → juno-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ironic:
milestone: juno-1 → 2014.2