OpenStack Compute (nova)

Bug #1357578
Comment #13

Comment 13 for bug 1357578

Revision history for this message

Sean Dague (sdague) wrote on 2014-09-18:

#13

I managed to get a reproduce by creating a slow vm: ubuntu 14.04 in vbox, 1 G ram, 2 vcpu, set to 50% of cpu performance.

tox -epy27 -- --until-fail multiprocess

On the 3rd time through I got the following:

running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ./nova/tests --load-list /tmp/tmp7YX5uh
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ./nova/tests --load-list /tmp/tmpU5Qsw_
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_killed_worker_recover [5.688803s] ... ok

{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_killed_worker_recover [2.634592s] ... ok
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_restart_sighup [1.565492s] ... ok
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_terminate_sigterm [2.400319s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_restart_sighup [160.043131s] ... FAILED
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_terminate_sigkill [2.317150s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_terminate_sigterm [2.274788s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_terminate_sigkill [2.089225s] ... ok

and.... hang

So, testr is correctly killing the restart test when it times out. It is also correctly moving on to additional tests. However it is then in a hung state and can't finish once the tests are done.

Why the test timed out, I don't know. However the fact that testr is going crazy is an issue all by itself.

I managed to get a reproduce by creating a slow vm: ubuntu 14.04 in vbox, 1 G ram, 2 vcpu, set to 50% of cpu performance.

tox -epy27 -- --until-fail multiprocess

On the 3rd time through I got the following:

running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ./nova/tests  --load-list /tmp/tmp7YX5uh
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ./nova/tests  --load-list /tmp/tmpU5Qsw_
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_killed_worker_recover [5.688803s] ... ok

Captured stderr:
~~~~~~~~~~~~~~~~
    /home/sdague/nova/.tox/py27/local/lib/python2.7/site-packages/sqlalchemy/sql/default_comparator.py:35: SAWarning: The IN-predicate on "instances.uuid" was invoked with an empty sequence. This results in a contradiction, which nonetheless can be expensive to evaluate.  Consider alternative strategies for improved performance.
      return o[0](self, self.expr, op, *(other + o[1:]), **kwargs)
    
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_killed_worker_recover [2.634592s] ... ok
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_restart_sighup [1.565492s] ... ok
{0} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_terminate_sigterm [2.400319s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_restart_sighup [160.043131s] ... FAILED
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_terminate_sigkill [2.317150s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITest.test_terminate_sigterm [2.274788s] ... ok
{1} nova.tests.integrated.test_multiprocess_api.MultiprocessWSGITestV3.test_terminate_sigkill [2.089225s] ... ok

and.... hang

So, testr is correctly killing the restart test when it times out. It is also correctly moving on to additional tests. However it is then in a hung state and can't finish once the tests are done.

Why the test timed out, I don't know. However the fact that testr is going crazy is an issue all by itself.