Tests hang frequently in XenAPIVMTestCase.test_parallel_builds

Bug #831599 reported by Soren Hansen on 2011-08-22
This bug affects 4 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 2011.3

Bug Description

I can kill it, start over, and then it usually works.

It seems to be busy-waiting on something. strace shows a bunch of threads waiting on futexes and a single one calling epoll_wait a *lot*. I'm attaching the last 1000 lines of run_tests_log.

Soren Hansen (soren) wrote :
Alex Meade (alex-meade) wrote :

Seems to me that this is when the test should be failing; however, it doesn't assert anything and just hangs instead. Perhaps just add a timeout where it's calling .wait() and fail if it doesn't return in time? Seems risky to me, and that also means we need to make it work.

I say nuke the test!
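
A minimal sketch of the timeout idea above, assuming a plain threading.Event from the standard library in place of the eventlet object the test actually waits on. The helper name wait_or_fail, the five-second limit, and the stand-in worker are all hypothetical and do not come from the nova test suite.

```python
import threading


def wait_or_fail(event, timeout, what="operation"):
    """Block until `event` is set, but fail loudly instead of hanging.

    Hypothetical helper for illustration; not part of nova or eventlet.
    """
    # Event.wait() returns False when the timeout expires before the
    # event is set, so a stuck worker becomes a test failure.
    if not event.wait(timeout):
        raise AssertionError("%s did not finish within %ss" % (what, timeout))


if __name__ == "__main__":
    done = threading.Event()
    # Stand-in for one of the parallel builds: it just signals completion.
    worker = threading.Thread(target=done.set)
    worker.start()
    wait_or_fail(done, timeout=5.0, what="parallel build")
    worker.join()
```

This only turns the hang into a visible failure; it does nothing about whatever is keeping the wait from returning. With eventlet itself, the same effect could presumably be had by wrapping the .wait() call in an eventlet.Timeout block, though that is untested here.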

Thierry Carrez (ttx) on 2011-08-24
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Brian Lamar (blamar) wrote :

Adding eventlet.monkey_patch() to the top of the test file seemed to squash this for me, but I'm hesitant to say definitively because it's not consistently reproducible.

Brian Lamar (blamar) wrote :

Never mind, I was just getting lucky.

Ewan Mellor (ewanmellor) wrote :

I bet it's this (a similar backtrace is in Soren's log):

2011-08-28 11:01:52,770 DEBUG nova.virt.xenapi.fake [-] Calling VM.start <bound method FakeSessionForVMTests.VM_start of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1077cb4c>> from (pid=9342) callit /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/fake.py:431
2011-08-28 11:01:52,772 DEBUG nova.virt.xenapi.vmops [-] Starting instance 2 from (pid=9342) _start /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/vmops.py:134
Traceback (most recent call last):
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 336, in fire_timers
    timer()
2011-08-28 11:01:52,773 DEBUG nova.virt.xenapi.fake [-] Calling VM.start <bound method FakeSessionForVMTests.VM_start of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1077cb4c>> from (pid=9342) callit /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/fake.py:431
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/semaphore.py", line 95, in _do_acquire
    waiter.switch()
error: cannot switch to a different thread

Ewan Mellor (ewanmellor) wrote :

The traceback from Soren's log:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 336, in fire_timers
2011-08-22 21:58:29,483 DEBUG nova.virt.xenapi.fake [-] Calling host.call_plugin <bound method FakeSessionForVMTests.host_call_plugin of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1365f490>> from (pid=4594) callit /home/soren/src/openstack/nova/virt-layer-cleanup2/nova/virt/xenapi/fake.py:428
    timer()
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/eventlet/semaphore.py", line 95, in _do_acquire
    waiter.switch()
error: cannot switch to a different thread

Changed in nova:
status: Confirmed → Fix Committed
Thierry Carrez (ttx) on 2011-09-09
Changed in nova:
milestone: none → diablo-rbp
Thierry Carrez (ttx) on 2011-09-22
Changed in nova:
milestone: diablo-rbp → 2011.3
status: Fix Committed → Fix Released