oslo.service ProcessLauncher failed to stop child processes

Bug #1793830 reported by Tao Liu on 2018-09-21
This bug affects 2 people
Affects: oslo.service
Importance: Undecided
Assigned to: Unassigned

Bug Description

When a child process detects that its parent process has died unexpectedly (for example, via kill -9), it performs a graceful shutdown. Depending on the state of the child process, the graceful shutdown was sometimes observed never to complete.

The _pipe_watcher should honor the graceful shutdown timeout: it should arm an alarm before starting the graceful shutdown, so that the child process is forced to exit once graceful_shutdown_timeout is exceeded.
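
To illustrate the idea, here is a minimal sketch of a pipe watcher that bounds graceful shutdown with SIGALRM. It is not the oslo.service implementation: it uses only the Python 3 standard library (no eventlet), and the names pipe_watcher, stop_with_timeout, and GracefulShutdownTimeout, as well as the stop callable, are hypothetical. It assumes a Unix system and that it runs in the main thread of the child process.

```python
import os
import signal


class GracefulShutdownTimeout(Exception):
    """Raised when graceful shutdown runs longer than the allowed time."""


def stop_with_timeout(stop, timeout):
    """Call stop(); raise GracefulShutdownTimeout after `timeout` seconds.

    `stop` is a hypothetical callable that performs the graceful shutdown.
    `timeout` must be a positive whole number of seconds.
    """
    def _on_alarm(signum, frame):
        raise GracefulShutdownTimeout()

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)  # arm the alarm before shutdown starts
    try:
        stop()
    finally:
        signal.alarm(0)  # disarm the alarm on success or failure
        signal.signal(signal.SIGALRM, previous)


def pipe_watcher(read_fd, stop, timeout):
    """Wait for the parent to die, then shut down within `timeout` seconds.

    `read_fd` is the read end of a pipe whose write end is held only by
    the parent. os.read() returns an empty result (EOF) once the parent
    exits, including after kill -9. Returns the exit status the child
    should use: 0 for a clean shutdown, 1 if the timeout was exceeded.
    """
    os.read(read_fd, 1)  # blocks until the parent's write end is closed
    try:
        stop_with_timeout(stop, timeout)
    except GracefulShutdownTimeout:
        return 1
    return 0
```

In a real child process the caller would pass the returned status to os._exit(), so that the process terminates even if the graceful shutdown is stuck. Under eventlet monkey patching, signal delivery and blocking reads behave differently, so this sketch would need adapting before it could be applied to _pipe_watcher itself.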

In the example below, the parent process is killed with kill -9 <parent pid>, and one of the child processes never terminates.
ps -ef | grep "dcorch-api-proxy .*--type compute" | grep -v grep
root 25520 1 12 20:28 ? 00:00:01 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute
root 25757 25520 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute
root 25758 25520 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute

sudo kill -9 25520

2018-09-21 20:39:17.799 25757 INFO oslo_service.service [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Parent process has died unexpectedly, exiting
2018-09-21 20:39:17.800 25757 INFO oslo.service.wsgi [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Stopping WSGI server.
2018-09-21 20:39:17.800 25757 INFO eventlet.wsgi.server [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] (25757) wsgi exited, is_accepting=True
2018-09-21 20:39:17.802 25758 INFO oslo_service.service [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Parent process has died unexpectedly, exiting
2018-09-21 20:39:17.803 25758 INFO oslo.service.wsgi [req-43c0707d-c834-4b6d-95c6-51608c9d6e84 - - - - -] Stopping WSGI server.

ps -ef | grep "dcorch-api-proxy .*--type compute" | grep -v grep
root 25758 1 0 20:29 ? 00:00:00 /usr/bin/python2 /usr/bin/dcorch-api-proxy --config-file=/etc/dcorch/dcorch.conf --type compute

This problem was observed in the following environment:
Version: OpenStack Pike
oslo-service: 1.16.1-1.el7
