Comment 4 for bug 509015

Vishal Vatsa (vvatsa) wrote : Re: [Bug 509015] [NEW] ipcluster does not start all the engines

2010/1/18 Brian Granger <email address hidden>:
> I think I know what the issue is here.  We have found that sometimes
> the engines start up
> so fast that the controller is not yet up and running.  The engines
> that try to connect
> before the controller is running fail.  Twisted is fully capable of
> handling many simultaneous
> connections, so I don't think it is that.
>

Yep, this sounds about right.
It should not happen, though: there is a _delay_start method which
waits for the furl files to be created on disk, and that is meant to be
the test that ipcontroller is up and running.
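
For reference, a rough sketch of the kind of "wait until the files exist"
check described above. This is not the actual _delay_start code;
wait_for_files and its parameters are made-up names for illustration only.

```python
import os
import time


def wait_for_files(paths, timeout=30.0, poll_interval=0.5):
    """Return True once every path exists, False if the timeout elapses first.

    Hypothetical helper: the name, signature, and defaults are assumptions,
    not part of IPython.
    """
    deadline = time.monotonic() + timeout
    while True:
        # All files must be present before we treat the controller as ready.
        if all(os.path.exists(p) for p in paths):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Note that a check like this only shows that the files are visible to the
process doing the polling. On a network filesystem, the point at which a
newly created file becomes visible on another host can depend on client-side
caching, so the check may behave differently there.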

Toby, which OS are you on? Could the filesystem semantics be different?
(In theory, it should work on Windows/Cygwin, but I have never tested it.)

I have not been able to replicate this so far on a NAS-backed cluster.
If this continues to be an issue for you, I can try to give you a patch
that inserts a delay in the SSH engine startup.
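
As an illustration only, the general idea of "wait a bit and try again"
could look something like the sketch below. It is not the proposed patch;
retry_with_delay and the connect callable are hypothetical names, and the
real engine startup code may be structured quite differently.

```python
import time


def retry_with_delay(connect, attempts=5, delay=2.0, sleep=time.sleep):
    """Call connect() until it succeeds, pausing between failed attempts.

    Hypothetical helper: 'connect' stands in for whatever starts or
    connects an engine. The last exception is re-raised if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception:
            if attempt == attempts:
                raise
            # Give the controller time to come up before trying again.
            sleep(delay)
```

Retrying has the advantage over a single fixed delay that fast startups are
not slowed down, while slow ones still get several chances to connect.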

Regards,
-vishal