Comment 2 for bug 517358

Brian Granger (ellisonbg) wrote : Re: [Bug 517358] [NEW] Using multiprocessing module crashes parallel iPython

Are you using multiprocessing or just subprocess?
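
The reproduction quoted below calls subprocess rather than multiprocessing, and the failure it reports is OSError: [Errno 4] Interrupted system call (EINTR). As a possible stopgap while the cause is tracked down, here is a minimal sketch of a retry wrapper. The helper name retry_on_eintr is hypothetical and not part of IPython or Twisted, and it is only an assumption that retrying would work around the failure reported here.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying for as long as it raises EINTR.

    Hypothetical helper for illustration; not an IPython or Twisted API.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            # Only swallow "Interrupted system call"; re-raise anything else.
            if exc.errno != errno.EINTR:
                raise
```

Inside remote_test, the call would become retry_on_eintr(subprocess.check_call, 'ls'). Note that a retry re-runs the whole command, so this is only reasonable when the wrapped binary is safe to run more than once.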

Brian

On Thu, Feb 4, 2010 at 5:14 PM, Justin MacCallum
<email address hidden> wrote:
> Public bug reported:
>
> I have a parallel scientific code that runs on clusters. It currently
> uses my own (badly designed) parallel communication engine and I'm
> trying to transition to IPython's TaskClient interface. One part of my
> code uses the subprocess module to wrap a call to a different piece of
> software. Unfortunately, this seems to be causing problems for me. I'm
> running on OS X 10.6.2, with Python 2.6.4 and IPython 0.10 as supplied
> by MacPorts. The following code snippet will reproduce the problem I'm
> having.
>
> -----
> #!/usr/bin/env python
>
> from IPython.kernel import client
>
> tc = client.TaskClient()
>
> @tc.parallel()
> def remote_test(input):
>   import subprocess
>   # I'm obviously not wrapping ls, but I have the same problem with the real binary I'm trying to call
>   subprocess.check_call('ls')
>   return input
>
> work = range(100)
>
> results = remote_test(work)
> -----
>
>
> The output of this program is:
>
> -----
> /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-macosx-10.6-i386.egg/twisted/python/filepath.py:12: DeprecationWarning: the sha module is deprecated; use the hashlib module instead
>  import sha
> /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/clientconnector.py:43: DeprecationWarning: Importing class Tub directly from 'foolscap' is deprecated since Foolscap 0.4.3. Please import foolscap.api.Tub instead
>  self.tub = Tub()
> /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/taskfc.py:79: DeprecationWarning: Importing class Referenceable1 directly from 'foolscap' is deprecated since Foolscap 0.4.3. Please import foolscap.api.Referenceable instead
>  class FCTaskControllerFromTaskController(Referenceable):
> Traceback (most recent call last):
>  File "./test.py", line 15, in <module>
>   results = remote_test(work)
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/parallelfunction.py", line 104, in call_function
>   return self.mapper.map(self.func, *sequences)
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/mapper.py", line 230, in map
>   task_results = [self.task_controller.get_task_result(tid) for tid in task_ids]
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/taskclient.py", line 93, in get_task_result
>   taskid, block)
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/IPython/kernel/twistedutil.py", line 72, in blockingCallFromThread
>   return twisted.internet.threads.blockingCallFromThread(reactor, f, *a, **kw)
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-macosx-10.6-i386.egg/twisted/internet/threads.py", line 114, in blockingCallFromThread
>   result.raiseException()
>  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-macosx-10.6-i386.egg/twisted/python/failure.py", line 326, in raiseException
>   raise self.type, self.value, self.tb
> OSError: [Errno 4] Interrupted system call
> -----
>
>
> The ipcontroller log file looks like:
>
> -----
> 2010-02-04 12:42:36-0800 [-] Running task 82 on worker 1
> 2010-02-04 12:42:36-0800 [Negotiation,2,192.168.0.194] Task completed: 82
> 2010-02-04 12:42:36-0800 [Negotiation,2,192.168.0.194] distributing Tasks
> 2010-02-04 12:42:36-0800 [Negotiation,2,192.168.0.194] Running task 83 on worker 1
> 2010-02-04 12:42:36-0800 [Negotiation,2,192.168.0.194] Task 83 failed on worker 1
> 2010-02-04 12:42:36-0800 [-] distributing Tasks
> 2010-02-04 12:42:36-0800 [-] Running task 84 on worker 0
> 2010-02-04 12:42:36-0800 [Negotiation,0,192.168.0.194] Task 84 failed on worker 0
> 2010-02-04 12:42:37-0800 [-] distributing Tasks
> 2010-02-04 12:42:37-0800 [-] Running task 85 on worker 1
> 2010-02-04 12:42:37-0800 [Negotiation,2,192.168.0.194] Task 85 failed on worker 1
> 2010-02-04 12:42:37-0800 [-] distributing Tasks
> 2010-02-04 12:42:37-0800 [-] Running task 86 on worker 0
> 2010-02-04 12:42:37-0800 [Negotiation,0,192.168.0.194] Task 86 failed on worker 0
> 2010-02-04 12:42:38-0800 [-] distributing Tasks
> 2010-02-04 12:42:38-0800 [-] Running task 87 on worker 1
> -----
>
> Notice that a few of the tasks actually complete, while the majority
> fail with the strange interrupted system call error.
>
> ** Affects: ipython
>     Importance: Undecided
>         Status: New
>
> --
> Using multiprocessing module crashes parallel iPython
> https://bugs.launchpad.net/bugs/517358
> You received this bug notification because you are a member of IPython
> Developers, which is subscribed to IPython.

--
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
<email address hidden>
<email address hidden>