Comment 7 for bug 1220168

Aldrian Obaja (aldrian-math) wrote :

I conducted further tests and found that the client actually stops receiving NOOP commands from the server. Attached is a sample run (this one is shorter than the previous one).

In this attachment, I now log each time a worker receives a NOOP, NO_JOB, or JOB_ASSIGN from any server (unfortunately, I could not log which server each signal came from).

Notice that line 188 is the last time worker 01 receives a job from server 4730.
Line 7552 is the last time worker 01 receives a job from any server. Note that there is no "Receive NOOP" from worker 01 past this line, although other workers are still receiving it.
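
In case it helps with checking other runs, here is a rough sketch of how the "last line where each worker received each signal" check could be automated. The log line shape it assumes ("worker NN ... Receive SIGNAL") is only a guess based on the messages quoted above, and the names `last_seen`, `stalled_workers`, and `min_gap` are made up for this illustration, so the pattern would need adjusting to the real log.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) log line shape: "... worker 01 ... Receive NOOP ..."
EVENT_RE = re.compile(
    r"worker\s+(\d+).*?Receive\s+(NOOP|NO_JOB|JOB_ASSIGN)", re.IGNORECASE
)


def last_seen(lines):
    """Map worker id -> {signal: last 1-based line number where it was logged}."""
    seen = defaultdict(dict)
    for lineno, line in enumerate(lines, start=1):
        match = EVENT_RE.search(line)
        if match:
            worker, signal = match.group(1), match.group(2).upper()
            seen[worker][signal] = lineno  # later lines overwrite earlier ones
    return {worker: dict(signals) for worker, signals in seen.items()}


def stalled_workers(seen, signal="NOOP", min_gap=100):
    """Workers whose last `signal` is at least `min_gap` lines behind the
    most recent `signal` logged by any worker (never seen counts as line 0)."""
    latest = max((s.get(signal, 0) for s in seen.values()), default=0)
    return sorted(
        worker
        for worker, s in seen.items()
        if latest - s.get(signal, 0) >= min_gap
    )
```

Feeding it the lines of the attached log (for example `last_seen(open(path))`, with `path` pointing at the attachment) and then calling `stalled_workers` on the result should list the workers that went quiet while the others kept receiving NOOP. The `min_gap` threshold is arbitrary and only there to avoid flagging a worker that is merely a few lines behind.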

I would really appreciate it if you could also take a look at the python-distribution code, to check whether each signal is handled correctly and whether there could be a race condition.