oslo.service

Bug #1524907
Comment #0

Victor Stinner (vstinner) wrote on 2015-12-10: Race condition in SIGTERM signal handler

If the process launcher gets a SIGTERM signal, it calls _sigterm() to
handle it. This function calls the SignalHandler() singleton to get the
SignalHandler instance. The singleton acquires a lock to ensure
that only one instance is ever created.

The problem arises when the process launcher gets a second SIGTERM while
the singleton lock (called 'singleton_lock') is held. _sigterm() is
called again (reentrant call!) and tries to acquire the same lock, so
we enter a deadlock. If eventlet is used, eventlet fails on an
assertion error: "Cannot switch to MAINLOOP from MAINLOOP".
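
The reentrant call can be modelled without real signals. The following is only a minimal sketch, not the oslo.service code: get_singleton(), sigterm_handler(), the on_locked callback and the events list are hypothetical names, and a timeout is used on the lock so that the sketch reports the deadlock instead of hanging.

```python
import threading

# Non-reentrant lock, standing in for 'singleton_lock' from the report.
singleton_lock = threading.Lock()
events = []


def get_singleton(on_locked=None):
    """Hypothetical stand-in for the SignalHandler() singleton lookup."""
    # A real handler would block here forever; the timeout only exists
    # so that this sketch can report the deadlock and carry on.
    if not singleton_lock.acquire(timeout=0.1):
        events.append("deadlock")
        return None
    try:
        events.append("lock acquired")
        if on_locked is not None:
            # Models a second signal arriving while the lock is held.
            on_locked()
        return object()
    finally:
        singleton_lock.release()


def sigterm_handler(second_signal=None):
    """Hypothetical stand-in for _sigterm()."""
    get_singleton(on_locked=second_signal)


# First SIGTERM; a second one is "delivered" while the lock is held.
sigterm_handler(second_signal=sigterm_handler)
print(events)
```

The second call cannot take the lock because the first call, which it interrupted, still holds it and cannot resume until the second call returns.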

The bug can occur with both the SIGTERM and SIGHUP signals.

I saw this issue with OpenStack services managed by systemd with a wrong configuration: SIGTERM is sent to every process in the cgroup, instead of only to the "main" process ("Main PID" in systemd). When the process launcher gets a SIGTERM, it sends a new SIGTERM signal to each child process. If systemd has already sent a first SIGTERM to the child processes, they now receive two SIGTERM signals in quick succession.

For OpenStack services managed by systemd, the service file must contain "KillMode=process" to only send SIGTERM to the main process ("Main PID").
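
As a sketch, the relevant part of such a service file would look like the fragment below. Only the KillMode line comes from this report; the rest of the unit is omitted.

```ini
[Service]
# Send SIGTERM only to the main process ("Main PID"), not to the whole cgroup.
KillMode=process
```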