2015-12-10 16:44:20 |
Victor Stinner |
bug |
|
|
added bug |
2015-12-12 05:51:21 |
Davanum Srinivas (DIMS) |
oslo.service: status |
New |
Fix Committed |
|
2016-07-04 11:07:39 |
ChangBo Guo(gcb) |
oslo.service: status |
Fix Committed |
Fix Released |
|
2016-09-13 11:13:03 |
Edward Hope-Morley |
bug task added |
|
python-oslo.service (Ubuntu) |
|
2016-09-13 11:15:08 |
Edward Hope-Morley |
bug task added |
|
cloud-archive |
|
2016-09-13 11:15:26 |
Edward Hope-Morley |
nominated for series |
|
cloud-archive/liberty |
|
2016-09-13 11:15:34 |
Edward Hope-Morley |
cloud-archive: status |
New |
Fix Released |
|
2016-09-13 11:15:57 |
Louis Bouchard |
nominated for series |
|
Ubuntu Yakkety |
|
2016-09-13 11:15:57 |
Louis Bouchard |
bug task added |
|
python-oslo.service (Ubuntu Yakkety) |
|
2016-09-13 11:15:57 |
Louis Bouchard |
nominated for series |
|
Ubuntu Xenial |
|
2016-09-13 11:15:57 |
Louis Bouchard |
bug task added |
|
python-oslo.service (Ubuntu Xenial) |
|
2016-09-13 11:15:57 |
Louis Bouchard |
nominated for series |
|
Ubuntu Wily |
|
2016-09-13 11:15:57 |
Louis Bouchard |
bug task added |
|
python-oslo.service (Ubuntu Wily) |
|
2016-09-13 11:16:15 |
Louis Bouchard |
python-oslo.service (Ubuntu Wily): status |
New |
Won't Fix |
|
2016-09-13 11:16:19 |
Louis Bouchard |
python-oslo.service (Ubuntu Xenial): status |
New |
Fix Released |
|
2016-09-13 11:16:24 |
Louis Bouchard |
python-oslo.service (Ubuntu Yakkety): status |
New |
Fix Released |
|
2016-09-13 11:35:02 |
Edward Hope-Morley |
description |
If the process launcher gets a SIGTERM signal, it calls _sigterm() to
handle it. This function calls the SignalHandler() singleton to get the
instance of SignalHandler. This singleton acquires a lock to ensure
that the singleton is unique.
The problem arises when the process launcher gets a second SIGTERM while
the singleton lock (called 'singleton_lock') is held. _sigterm() is
called again (reentrant call!), and we enter a deadlock. If eventlet
is used, eventlet fails on an assertion error: "Cannot switch to
MAINLOOP from MAINLOOP".
The bug can occur with both SIGTERM and SIGHUP signals.
I saw this issue with OpenStack services managed by systemd with a wrong configuration: SIGTERM is sent to all processes of the cgroup, instead of only to the "main" process ("Main PID" in systemd). When the process launcher gets a SIGTERM, it sends a new SIGTERM signal to each child process. If systemd has already sent a first SIGTERM to the child processes, they now get two SIGTERM signals in quick succession.
For OpenStack services managed by systemd, the service file must contain "KillMode=process" to only send SIGTERM to the main process ("Main PID"). |
[Impact]
* See bug description. We are seeing this in a Liberty production
environment and (at least) nova-conductor services are failing to
restart properly.
* This fix just missed the version of python-oslo.service we have in the
Liberty UCA, so queueing it up for backport.
[Test Case]
* Start a service that has a high number of workers, check that all
are up, then do a service stop (or killall -s SIGTERM nova-conductor)
and check that all workers/processes are stopped.
[Regression Potential]
* None
If the process launcher gets a SIGTERM signal, it calls _sigterm() to
handle it. This function calls the SignalHandler() singleton to get the
instance of SignalHandler. This singleton acquires a lock to ensure
that the singleton is unique.
The problem arises when the process launcher gets a second SIGTERM while
the singleton lock (called 'singleton_lock') is held. _sigterm() is
called again (reentrant call!), and we enter a deadlock. If eventlet
is used, eventlet fails on an assertion error: "Cannot switch to
MAINLOOP from MAINLOOP".
The bug can occur with both SIGTERM and SIGHUP signals.
I saw this issue with OpenStack services managed by systemd with a wrong configuration: SIGTERM is sent to all processes of the cgroup, instead of only to the "main" process ("Main PID" in systemd). When the process launcher gets a SIGTERM, it sends a new SIGTERM signal to each child process. If systemd has already sent a first SIGTERM to the child processes, they now get two SIGTERM signals in quick succession.
For OpenStack services managed by systemd, the service file must contain "KillMode=process" to only send SIGTERM to the main process ("Main PID"). |
|
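The reentrancy problem in the description above can be illustrated with a minimal sketch. This is not the oslo.service code: SignalHandlerSketch, its get() method and simulate_second_signal() are hypothetical names invented for illustration, and the second signal is simulated by calling the getter while the lock is already held in the same thread (Python runs signal handlers in the main thread, so a second handler would see the same situation). A timeout replaces the blocking acquire so that the sketch reports the condition instead of hanging.

```python
import threading


class SignalHandlerSketch:
    """Hypothetical stand-in for a singleton guarded by a non-reentrant lock."""

    _instance = None
    _lock = threading.Lock()  # non-reentrant, like a plain mutex

    @classmethod
    def get(cls, timeout=0.1):
        # A plain blocking acquire() would hang forever on re-entry;
        # the timeout lets this sketch report the problem instead.
        if not cls._lock.acquire(timeout=timeout):
            raise RuntimeError("would deadlock: lock already held")
        try:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance
        finally:
            cls._lock.release()


def simulate_second_signal():
    # The first handler is inside the critical section...
    with SignalHandlerSketch._lock:
        try:
            # ...when a second handler runs in the same thread and
            # asks for the singleton again (the reentrant call).
            SignalHandlerSketch.get()
        except RuntimeError as exc:
            return str(exc)
    return "no deadlock"


if __name__ == "__main__":
    print(simulate_second_signal())
```

Outside the simulated reentrant call, get() behaves as an ordinary singleton accessor and returns the same instance every time.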
2016-09-13 11:35:15 |
Edward Hope-Morley |
summary |
Race condition in SIGTERM signal handler |
[SRU] Race condition in SIGTERM signal handler |
|
2016-09-13 11:36:02 |
James Page |
bug task added |
|
cloud-archive/liberty |
|
2016-09-13 11:37:56 |
Edward Hope-Morley |
cloud-archive/liberty: status |
New |
In Progress |
|
2016-09-13 11:37:58 |
Edward Hope-Morley |
cloud-archive/liberty: assignee |
|
Edward Hope-Morley (hopem) |
|
2016-09-13 12:11:43 |
Edward Hope-Morley |
tags |
|
sts sts-sru |
|
2016-09-13 12:12:01 |
Edward Hope-Morley |
attachment added |
|
lp1369465-trusty-liberty.debdiff https://bugs.launchpad.net/cloud-archive/+bug/1524907/+attachment/4739990/+files/lp1369465-trusty-liberty.debdiff |
|
2016-09-13 12:12:48 |
Edward Hope-Morley |
attachment removed |
lp1369465-trusty-liberty.debdiff https://bugs.launchpad.net/cloud-archive/+bug/1524907/+attachment/4739990/+files/lp1369465-trusty-liberty.debdiff |
|
|
2016-09-13 12:13:05 |
Edward Hope-Morley |
attachment added |
|
lp1524907-trusty-liberty.patch https://bugs.launchpad.net/cloud-archive/+bug/1524907/+attachment/4739991/+files/lp1524907-trusty-liberty.patch |
|
2016-09-13 12:15:20 |
Edward Hope-Morley |
attachment removed |
lp1524907-trusty-liberty.patch https://bugs.launchpad.net/cloud-archive/+bug/1524907/+attachment/4739991/+files/lp1524907-trusty-liberty.patch |
|
|
2016-09-13 12:15:46 |
Edward Hope-Morley |
attachment added |
|
lp1524907-trusty-liberty.patch https://bugs.launchpad.net/cloud-archive/+bug/1524907/+attachment/4739992/+files/lp1524907-trusty-liberty.patch |
|
2016-09-13 14:35:16 |
Edward Hope-Morley |
attachment removed |
lp1524907-trusty-liberty.patch https://bugs.launchpad.net/oslo.service/+bug/1524907/+attachment/4739992/+files/lp1524907-trusty-liberty.patch |
|
|
2016-09-13 14:35:35 |
Edward Hope-Morley |
attachment added |
|
lp1524907-trusty-liberty.patch https://bugs.launchpad.net/oslo.service/+bug/1524907/+attachment/4740096/+files/lp1524907-trusty-liberty.patch |
|
2016-09-13 14:35:57 |
Edward Hope-Morley |
cloud-archive/liberty: importance |
Undecided |
High |
|
2016-09-29 13:48:52 |
James Page |
cloud-archive/liberty: status |
In Progress |
Fix Committed |
|
2016-09-29 13:48:54 |
James Page |
tags |
sts sts-sru |
sts sts-sru verification-liberty-needed |
|
2016-10-06 10:48:14 |
Edward Hope-Morley |
tags |
sts sts-sru verification-liberty-needed |
sts sts-sru verification-liberty-done |
|
2016-10-21 13:43:26 |
Ryan Beisner |
cloud-archive/liberty: status |
Fix Committed |
Fix Released |
|
2016-11-09 12:17:09 |
Louis Bouchard |
tags |
sts sts-sru verification-liberty-done |
sts verification-liberty-done |
|