StorwizeSVCDriverTestCase.test_storwize_vdisk_copy_ops fails if there is a context switch to another green thread

Bug #1302670 reported by Roman Podoliaka
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
Undecided
TaoBai

Bug Description

StorwizeSVCDriverTestCase.test_storwize_vdisk_copy_ops test case requires that there will be no context switches between green threads while it's being run.

This is achieved by monkey patching greenthread.sleep() function in setUp():

     self.sleeppatch = mock.patch('eventlet.greenthread.sleep')
     self.sleeppatch.start()

This seems to be a workaround in order to make the test pass, while it's hiding the actual reason of the problem. Any code causing the thread context switch in some other way will break the test (e.g. syncing the latest oslo.db code from oslo.incubator breaks this test, because it uses time.sleep() instead of eventlet.greenthread.sleep(), and the former is not monkey patched in the test).

Monkey patching of greenthread.sleep() in order to prevent thread context switches doesn't seem to be a good solution anyway as it is an assumption which is not true when Cinder is run in production.

description: updated
Mike Perez (thingee)
tags: added: drivers storwize
Revision history for this message
Duncan Thomas (duncan-thomas) wrote :

Agreed that this looks like dodgy code and should be fixed up.

Changed in cinder:
status: New → Triaged
status: Triaged → Confirmed
Revision history for this message
Jay Bryant (jsbryant) wrote :

I made the storwize developers aware of this.

TaoBai (baitao2020)
Changed in cinder:
assignee: nobody → Tao Bai (baitao2020)
Revision history for this message
TaoBai (baitao2020) wrote :

Hi Roman
Can you tell me the exactly step to reproduce the issue? I tried to remove .env, and re-builded new virtual environment. In devstack cinder folder, run command "./run_test.sh" The test cases still are all successful.

Looking for your instruction. BTW, if you already have any proposal to solve it, will be very appreciate .

summary: - StorwizeSVCDriverTestCase.test_storwize_vdisk_copy_ops depends fails if
- there is a context switch to another green thread
+ StorwizeSVCDriverTestCase.test_storwize_vdisk_copy_ops fails if there is
+ a context switch to another green thread
Revision history for this message
TaoBai (baitao2020) wrote :

Yes, but how to make that happen?

Revision history for this message
TaoBai (baitao2020) wrote :

I mean I need detail steps , sorry that, maybe it's silly question

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Sure, no problem. So the only thing you need is to let a thread context switch happen while this test is being run. The way I ran into this is that I updated common DB code from oslo-incubator which replaced all eventlet.greenthread.sleep() calls with time.sleep() calls (i.e. within cinder/openstack/common/db/ dir). So you could do that. Or you could disable monkey patching of sleep entirely here: https://github.com/openstack/cinder/blob/master/cinder/tests/test_storwize_svc.py#L1506

Just FYI, monkey patching both greenthread.sleep() and time.sleep() calls would work, but I'm not sure this the way we should solve this problem. IMO, this monkey patching was added only to make the test pass, while the real issue is in the code, which doesn't really cope well with context switches (and there will be context switches when this code is run in production).

So if you're familiar with this code/test, I'd recommend to dig a bit deeper, spend some time and eventually find out what is wrong with the code and why it fails on context switches.

Revision history for this message
Li Min Liu (liminliu) wrote :
Download full text (4.1 KiB)

reproduce step:
1. replace eventlet.greenthread.sleep with time.sleep
diff --git a/cinder/tests/test_storwize_svc.py b/cinder/tests/test_storwize_svc.py
index ea602d4..cd36e09 100644
--- a/cinder/tests/test_storwize_svc.py
+++ b/cinder/tests/test_storwize_svc.py
@@ -21,6 +21,7 @@ Tests for the IBM Storwize family and SVC volume driver.
 import mock
 import random
 import re
+import time

 from cinder import context
 from cinder import exception
@@ -1503,7 +1504,7 @@ class StorwizeSVCDriverTestCase(test.TestCase):
         self.driver.db = self.db
         self.driver.do_setup(None)
         self.driver.check_for_setup_error()
- self.sleeppatch = mock.patch('eventlet.greenthread.sleep')
+ self.sleeppatch = mock.patch('time.sleep')
         self.sleeppatch.start()
         self.driver._helpers.check_fcmapping_interval = 0

2. run unit test
./run_tests.sh -N test_storwize_svc

3. error log
======================================================================
FAIL: cinder.tests.test_storwize_svc.StorwizeSVCDriverTestCase.test_storwize_vdisk_copy_ops
----------------------------------------------------------------------
Traceback (most recent call last):
_StringException: Empty attachments:
  stderr
  stdout

pythonlogging:'': {{{
WWPN on node 1: ['500507680220C744', '500507680210C744']
WWPN on node 2: ['500507680220C745', '500507680230C745']
}}}

pythonlogging:''-1: {{{
found extension EntryPoint.parse('qpid = oslo.messaging._drivers.impl_qpid:QpidDriver')
found extension EntryPoint.parse('zmq = oslo.messaging._drivers.impl_zmq:ZmqDriver')
found extension EntryPoint.parse('kombu = oslo.messaging._drivers.impl_rabbit:RabbitDriver')
found extension EntryPoint.parse('rabbit = oslo.messaging._drivers.impl_rabbit:RabbitDriver')
found extension EntryPoint.parse('fake = oslo.messaging._drivers.impl_fake:FakeDriver')
found extension EntryPoint.parse('cinder.openstack.common.notifier.no_op_notifier = oslo.messaging.notify._impl_noop:NoOpDriver')
found extension EntryPoint.parse('cinder.openstack.common.notifier.log_notifier = oslo.messaging.notify._impl_log:LogDriver')
found extension EntryPoint.parse('cinder.openstack.common.notifier.test_notifier = oslo.messaging.notify._impl_test:TestDriver')
found extension EntryPoint.parse('cinder.openstack.common.notifier.rpc_notifier2 = oslo.messaging.notify._impl_messaging:MessagingV2Driver')
found extension EntryPoint.parse('cinder.openstack.common.notifier.rpc_notifier = oslo.messaging.notify._impl_messaging:MessagingDriver')
found extension EntryPoint.parse('nova.openstack.common.notifier.no_op_notifier = oslo.messaging.notify._impl_noop:NoOpDriver')
found extension EntryPoint.parse('nova.openstack.common.notifier.test_notifier = oslo.messaging.notify._impl_test:TestDriver')
found extension EntryPoint.parse('nova.openstack.common.notifier.rpc_notifier = oslo.messaging.notify._impl_messaging:MessagingDriver')
found extension EntryPoint.parse('nova.openstack.common.notifier.log_notifier = oslo.messaging.notify._impl_log:LogDriver')
found extension EntryPoint.parse('nova.openstack.common.notifier.rpc_notifier2 = oslo.messaging.notify._impl_messaging:MessagingV2Driver')
found extension EntryPoint.pa...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101470

Changed in cinder:
status: Confirmed → In Progress
Revision history for this message
TaoBai (baitao2020) wrote :

Reason:
1. test_storwize_svc._cmd_lsvdiskcopy is mocked as when first called, sync value is No, when call second time, value will be Yes
2._check_volume_copy_ops will call _cmd_lsvdiskcopy(). if Sync value is yes, it will call fun: _rm_vdisk_copy_op()
3.There is loopingcall is running _check_volume_copy_ops()
4. self.sleeppatch = mock.patch('eventlet.greenthread.sleep')
    self.sleeppatch.start() is used to stop the looping call.
    If remove this, the looping call will make this test case run twice _rm_vdisk_copy_op().
    But in _rm_vdisk_copy_op(), the KeyError is not catched which cause test case fail.
5.KeyError should be catched in
the code:
self.sleeppatch = mock.patch('eventlet.greenthread.sleep')
self.sleeppatch.start()
 should be removed.

https://review.openstack.org/#/c/101470/

Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
Thierry Carrez (ttx) wrote :

Hi! The bug should stay "In progress" until the fix is actually merged.

Changed in cinder:
status: Fix Committed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/101470
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=0edfa2eaab08a0756bf3ec7818cdc9a8a379ba87
Submitter: Jenkins
Branch: master

commit 0edfa2eaab08a0756bf3ec7818cdc9a8a379ba87
Author: Tao Bai <email address hidden>
Date: Fri Jun 20 02:56:23 2014 -0700

    test_storwize_vdisk_copy_ops fails if green thread context switch

    There is loopingcall(green thread)will cause _rm_vdisk_copy_op() run
    twice.On the second time, the exception KeyError is not caught.

    Closes-Bug: #1302670

    Change-Id: I788f07c33f76693a13f0d581e687a32c6736a394

Changed in cinder:
status: In Progress → Fix Committed
Changed in cinder:
milestone: none → juno-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: juno-2 → 2014.2
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.