Comment 2 for bug 1803731

Zane Bitter (zaneb) wrote : Re: py3: test_child_signal_sighup takes ~500 times longer to run on py3

The fix for bug 1788022 broke the fix for bug 1705047. We check for the presence of the select.poll() function to avoid adding EINTR exceptions on Windows, but that check fails to account for eventlet monkey-patching the select module with its own version, which also lacks poll(). So the extra handling to interrupt system calls that was supposed to be added on Python 3.5 and later was disabled - at least in the unit tests, and probably in most real-world use, though possibly depending on import order and the like.
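To illustrate why the check gives the same answer under eventlet as on Windows, here is a minimal sketch. The function name is_interrupt_handling_enabled() and the two stand-in namespaces are hypothetical, made up for this comment; this is not the actual oslo.service code, only the shape of a hasattr-style check as described above.

```python
import types


def is_interrupt_handling_enabled(select_module):
    """Hypothetical stand-in for the platform check described above.

    The check assumes "no poll()" means "Windows", so the extra
    interrupt handling is only enabled when poll() is present.
    """
    return hasattr(select_module, "poll")


# Stand-ins for illustration only (not the real modules):
# a POSIX-like select module that provides poll() ...
posix_like_select = types.SimpleNamespace(select=lambda *a: None,
                                          poll=lambda: None)
# ... and a monkey-patched one that, like Windows, has no poll().
patched_select = types.SimpleNamespace(select=lambda *a: None)

enabled_on_posix = is_interrupt_handling_enabled(posix_like_select)
enabled_when_patched = is_interrupt_handling_enabled(patched_select)
```

With these stand-ins the check cannot tell a patched select module on Linux from a Windows one, which is how the handling ends up disabled.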

This, combined with a programming error in the unit test that causes it not to wait for anything to happen after sending SIGHUP to a child worker, caused the unit test to take 60s. The cleanup code for the test sends SIGTERM to the service process immediately, so the child also receives the SIGTERM before it has had a chance to process the SIGHUP (it is still sleeping, and because of PEP 475 the sleep cannot be interrupted to schedule the handler greenthread).
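The PEP 475 behaviour can be seen in isolation with plain time.sleep(), no eventlet involved. This is only a sketch, assuming a POSIX platform (it uses SIGALRM and signal.setitimer()), Python 3.5 or later, and that it runs in the main thread; sleep_through_signal() is a name made up for this comment.

```python
import signal
import time


def sleep_through_signal(sleep_seconds=0.5, signal_after=0.1):
    """Return (handler_ran, elapsed) for a sleep that receives SIGALRM."""
    fired = []

    def handler(signum, frame):
        # The handler returns without raising, so under PEP 475
        # time.sleep() resumes for the remaining time instead of
        # returning early.
        fired.append(signum)

    previous = signal.signal(signal.SIGALRM, handler)
    try:
        # Deliver SIGALRM part-way through the sleep.
        signal.setitimer(signal.ITIMER_REAL, signal_after)
        start = time.monotonic()
        time.sleep(sleep_seconds)
        elapsed = time.monotonic() - start
    finally:
        # Cancel the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
    return bool(fired), elapsed


handler_ran, elapsed = sleep_through_signal(0.3, 0.05)
```

The handler runs, but the sleep still lasts the full duration, so nothing that depends on the sleeper waking up early gets a chance to run.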

The _reload_service() handler clears all signal handler callbacks, with the result that when the SIGTERM handler is scheduled (the next time the timer task runs, since that is the next time eventlet wakes from sleep), it doesn't do anything. The parent process waits for the child to shut down gracefully and finally exits with SIGALRM after graceful_shutdown_timeout, which is <drum roll> 60s.
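The ordering problem can be modelled with a toy dispatcher. Everything here (ToySignalHandler, simulate_sighup_then_sigterm(), the callback names) is hypothetical and heavily simplified; it is not the oslo.service implementation, just a sketch of two signals being queued while the process sleeps and then dispatched in order on wake-up.

```python
import collections


class ToySignalHandler:
    """Toy model: signals are queued and dispatched on the next wake-up."""

    def __init__(self):
        self._callbacks = collections.defaultdict(list)
        self._pending = []

    def add_handler(self, signame, callback):
        self._callbacks[signame].append(callback)

    def clear(self):
        # Drop every registered callback, for every signal.
        self._callbacks.clear()

    def receive(self, signame):
        # The process is asleep, so the signal is only recorded.
        self._pending.append(signame)

    def run_pending(self):
        # Called when the event loop next wakes up.
        while self._pending:
            signame = self._pending.pop(0)
            for callback in list(self._callbacks[signame]):
                callback(signame)


def simulate_sighup_then_sigterm():
    events = []
    handler = ToySignalHandler()

    def reload_service(signame):
        events.append("reload")
        handler.clear()  # the reload clears all callbacks

    def graceful_shutdown(signame):
        events.append("shutdown")

    handler.add_handler("SIGHUP", reload_service)
    handler.add_handler("SIGTERM", graceful_shutdown)
    # Both signals arrive before the sleeping process wakes.
    handler.receive("SIGHUP")
    handler.receive("SIGTERM")
    handler.run_pending()
    return events


events = simulate_sighup_then_sigterm()
```

In this model only the reload is recorded: by the time SIGTERM is dispatched there is no callback left to run, so the child never starts shutting down and the parent is left waiting for the timeout.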