Termination signals not handled correctly in case of several ProcessLauncher instances in one process

Bug #1432995 reported by Elena Ezhova
This bug affects 1 person
Affects         Status        Importance  Assigned to   Milestone
neutron         Invalid       Undecided   Elena Ezhova
oslo-incubator  Fix Released  Undecided   Elena Ezhova

Bug Description

Neutron server has api and rpc workers. When their number is configured to be non-zero, each worker is launched using ProcessLauncher from oslo-incubator's service.py. It is important to note that separate instances of ProcessLauncher are used for launching api and rpc workers [1], [2].

When ProcessLauncher is initialized, it sets up, among other things, handlers for termination signals (SIGHUP, SIGTERM and SIGINT) [3]. Only one handler can be installed per signal in a process, so the most recently installed handler replaces any earlier one. So, if several ProcessLauncher instances are initialized in the same process, only the handlers of the last instance will be triggered on receiving a signal.
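
The handler replacement described above can be reproduced outside neutron. The following is a minimal sketch for a POSIX system, not the oslo-incubator code: `NaiveLauncher` and `demo_last_handler_wins` are hypothetical names, and the class only mimics the one relevant behaviour, installing its own SIGHUP handler in `__init__`.

```python
import os
import signal
import time


class NaiveLauncher:
    """Hypothetical stand-in for ProcessLauncher (not the real class).

    Like the behaviour described in this report, each instance installs
    its own handler for SIGHUP when it is initialized.
    """

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls
        # signal.signal() replaces whatever handler was installed before.
        signal.signal(signal.SIGHUP, self._handle_signal)

    def _handle_signal(self, signo, frame):
        self.calls.append(self.name)


def demo_last_handler_wins():
    """Create two launchers, send SIGHUP to ourselves, report who was called."""
    calls = []
    previous = signal.getsignal(signal.SIGHUP)
    try:
        NaiveLauncher("api", calls)
        NaiveLauncher("rpc", calls)  # silently replaces the "api" handler
        os.kill(os.getpid(), signal.SIGHUP)
        # Give the interpreter a moment to run the Python-level handler.
        deadline = time.monotonic() + 1.0
        while not calls and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        # Restore the handler that was active before the demo.
        signal.signal(signal.SIGHUP,
                      previous if previous is not None else signal.SIG_DFL)
    return calls


if __name__ == "__main__":
    print(demo_last_handler_wins())
```

Run from the main thread, this should report only the "rpc" launcher, which corresponds to the reset method being called only for rpc workers.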

The consequence is that when neutron-server is running with a non-zero number of api and rpc workers, sending SIGHUP to the parent process results in the reset method being called only for rpc workers.

One possible solution is to store all handlers in a class attribute and redefine handle_signal so that it calls all handlers one by one.
An alternative is to inherit from ProcessLauncher in neutron and redefine signal handling there.
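
The first option can be sketched roughly as follows. This is only an illustration of the idea under stated assumptions, not the patch that was merged: `SignalDispatcher`, `Launcher` and `demo_all_handlers_run` are hypothetical names, and the sketch assumes a POSIX system and registration from the main thread.

```python
import collections
import os
import signal
import time


class SignalDispatcher:
    """Hypothetical helper, not the oslo-incubator API."""

    # Class attribute: one handler list per signal, shared process-wide.
    _handlers = collections.defaultdict(list)

    @classmethod
    def add_handler(cls, signo, handler):
        cls._handlers[signo].append(handler)
        # Installing the same dispatcher again is harmless.
        signal.signal(signo, cls._dispatch)

    @classmethod
    def _dispatch(cls, signo, frame):
        # Call every registered handler, in registration order.
        for handler in list(cls._handlers[signo]):
            handler(signo, frame)

    @classmethod
    def clear(cls):
        cls._handlers.clear()


class Launcher:
    """Hypothetical launcher that registers instead of overwriting."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls
        SignalDispatcher.add_handler(signal.SIGHUP, self._handle_signal)

    def _handle_signal(self, signo, frame):
        self.calls.append(self.name)


def demo_all_handlers_run():
    """Create two launchers, send SIGHUP to ourselves, report who was called."""
    calls = []
    previous = signal.getsignal(signal.SIGHUP)
    try:
        Launcher("api", calls)
        Launcher("rpc", calls)
        os.kill(os.getpid(), signal.SIGHUP)
        # Give the interpreter a moment to run the Python-level handler.
        deadline = time.monotonic() + 1.0
        while len(calls) < 2 and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        # Restore the previous handler and drop the demo registrations.
        signal.signal(signal.SIGHUP,
                      previous if previous is not None else signal.SIG_DFL)
        SignalDispatcher.clear()
    return calls


if __name__ == "__main__":
    print(demo_all_handlers_run())
```

With this arrangement both the "api" and "rpc" handlers should run for a single SIGHUP, because the process-level handler is always the shared dispatcher rather than the handler of whichever launcher was created last.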

[1] https://github.com/openstack/neutron/blob/e933891462408435c580ad42ff737f8bff428fbc/neutron/service.py#L159
[2] https://github.com/openstack/neutron/blob/e933891462408435c580ad42ff737f8bff428fbc/neutron/wsgi.py#L237
[3] https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L210

Elena Ezhova (eezhova)
Changed in oslo-incubator:
assignee: nobody → Elena Ezhova (eezhova)
Changed in neutron:
status: New → Opinion
assignee: nobody → Elena Ezhova (eezhova)
Changed in oslo-incubator:
status: New → Confirmed
Elena Ezhova (eezhova)
Changed in oslo-incubator:
status: Confirmed → In Progress
Elena Ezhova (eezhova) wrote :

Review for oslo-incubator: https://review.openstack.org/#/c/164993

Doug Hellmann (doug-hellmann) wrote :

The incubator patch linked in comment #1 has merged, but I don't think it solves the problem completely. It *looks* to me and Dims that the signal handling code in the launchers unregisters itself after the first invocation. We would like to have some verification that sending multiple signals to the same process causes the handlers to be called repeatedly before we mark the ticket as fixed.

Elena Ezhova (eezhova) wrote :

I have tested this patch with neutron, and no matter how many times I send SIGHUP to the parent process, the logs show that all registered handlers are called. Please see the attached logs [1]: "Caught SIGHUP, stopping children" is logged twice and both api and rpc workers are restarted (I added the processes' pids to the log format so that it would be easier to see who is who).

[1] http://paste.openstack.org/show/193144/

Davanum Srinivas (DIMS) (dims-v) wrote :

Thanks for confirming, Elena!

Doug Hellmann (doug-hellmann) wrote :

Great, thanks for double-checking, Elena. We'll call this done for oslo-incubator then.

Changed in oslo-incubator:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in oslo-incubator:
milestone: none → 2015.1.0
status: Fix Committed → Fix Released
Elena Ezhova (eezhova)
Changed in neutron:
status: Opinion → Invalid