oslo.service ProcessLauncher failed to stop child processes

Bug #1793830 reported by Tao Liu
This bug affects 2 people
Affects       Status  Importance  Assigned to  Milestone
oslo.service  New     Undecided   Unassigned

Bug Description

When a child process detects that its parent process has died unexpectedly (for example, via kill -9), it performs a graceful shutdown. Depending on the state of the child process, that graceful shutdown sometimes never completes.

The _pipe_watcher should honor graceful_shutdown_timeout: it should set an alarm when the graceful shutdown starts, so that the child process is forced to exit once the timeout is exceeded.
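
A minimal sketch of the alarm-based idea, written in plain Python rather than against oslo.service internals. The names stop_with_deadline and ShutdownTimeout are hypothetical, and the stop callable stands in for whatever the child runs to shut down gracefully. It assumes a Unix process, a call from the main thread, and no eventlet monkey patching, so behavior inside a real oslo.service worker may differ.

```python
import signal
import time


class ShutdownTimeout(Exception):
    """Raised by the SIGALRM handler when the deadline passes (hypothetical name)."""


def stop_with_deadline(stop, timeout):
    """Run stop(); return True if it finished within `timeout` seconds, else False.

    A timeout of 0 arms no timer, so stop() may then run without a deadline.
    """
    def _on_alarm(signum, frame):
        # Interrupt whatever stop() is blocked on.
        raise ShutdownTimeout()

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        stop()
        return True
    except ShutdownTimeout:
        return False
    finally:
        # Cancel the timer and put the earlier handler back.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    # Stand-in for a graceful stop that hangs: sleeps far longer than the deadline.
    finished = stop_with_deadline(lambda: time.sleep(5), timeout=1)
    print("clean stop" if finished else "deadline exceeded, forcing exit")
```

In a child process, a False result would be the cue to force an exit (for example with os._exit(1)) instead of waiting indefinitely for the graceful stop.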

In the example below, the parent process is killed with kill -9 <parent pid> and one of the child processes never terminates.
ps -ef | grep "dcorch-api-proxy .*--type compute" | grep -v grep
root 25520 1 12 20:28 ? 00:00:01 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute
root 25757 25520 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute
root 25758 25520 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute

sudo kill -9 25520

2018-09-21 20:39:17.799 25757 INFO oslo_service.service [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Parent process has died unexpectedly, exiting
2018-09-21 20:39:17.800 25757 INFO oslo.service.wsgi [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Stopping WSGI server.
2018-09-21 20:39:17.800 25757 INFO eventlet.wsgi.server [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] (25757) wsgi exited, is_accepting=True
2018-09-21 20:39:17.802 25758 INFO oslo_service.service [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Parent process has died unexpectedly, exiting
2018-09-21 20:39:17.803 25758 INFO oslo.service.wsgi [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Stopping WSGI server.

ps -ef | grep "dcorch-api-proxy .*--type compute" | grep -v grep
root 25758 1 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute

This problem was observed in the following environment:
OpenStack version: Pike
oslo-service: 1.16.1-1.el7
