On the Neutron side, I plan to work on this again, as I found that an RPC worker is not respawned once it has been killed (current devstack master).
Killing one of the RPC processes:
Jan 23 12:40:29 bionic neutron-server[28154]: INFO oslo_service.service [-] Child 28170 killed by signal 9
Jan 23 12:40:29 bionic neutron-server[28154]: WARNING oslo_service.service [-] pid 28170 not in child list
No new process is spawned.
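A quick way to check the journal for this is sketched below. It lists children that were killed by a signal and never followed by a replacement. The helper name `unreplaced_kills` is hypothetical, and the "Started child" wording for a respawn message is an assumption about what oslo.service would log; adjust the pattern to whatever your version prints.

```python
import re

# Matches the "killed by signal" line seen in the log above.
KILLED = re.compile(r"Child (\d+) killed by signal (\d+)")
# Assumption: a respawn would be logged with this wording.
STARTED = re.compile(r"Started child (\d+)")


def unreplaced_kills(lines):
    """Return PIDs of killed children with no later 'Started child' line."""
    pending = []
    for line in lines:
        killed = KILLED.search(line)
        if killed:
            pending.append(int(killed.group(1)))
            continue
        # Treat each later start message as replacing the oldest killed child.
        if STARTED.search(line) and pending:
            pending.pop(0)
    return pending


journal = [
    "Jan 23 12:40:29 bionic neutron-server[28154]: INFO oslo_service.service [-] Child 28170 killed by signal 9",
    "Jan 23 12:40:29 bionic neutron-server[28154]: WARNING oslo_service.service [-] pid 28170 not in child list",
]
print(unreplaced_kills(journal))
```

With the two lines from the log above this reports PID 28170 as never replaced.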
After that kill, when stopping the devstack@q-svc service, we see the wsgi processes exit properly:
Jan 23 12:41:37 bionic neutron-server[28154]: INFO oslo_service.service [-] Waiting on 2 children to exit
Jan 23 12:41:37 bionic neutron-server[28154]: INFO neutron.wsgi [-] (28169) wsgi exited, is_accepting=True
Jan 23 12:41:37 bionic neutron-server[28154]: INFO neutron.wsgi [-] (28168) wsgi exited, is_accepting=True
Jan 23 12:41:37 defiant-bionic neutron-server[28154]: INFO oslo_service.service [-] Child 28169 exited with status 0
Jan 23 12:41:37 defiant-bionic neutron-server[28154]: INFO oslo_service.service [-] Child 28168 exited with status 0
Jan 23 12:41:37 bionic neutron-server[28154]: INFO oslo_service.service [-] Wait called after thread killed. Cleaning up.
Jan 23 12:41:37 bionic neutron-server[28154]: DEBUG neutron.service [-] calling RpcWorker stop() {{(pid=28154) stop /opt/stack/neutron/neutron/service.py:138}}
Jan 23 12:41:37 bionic neutron-server[28154]: DEBUG oslo_service.service [-] Killing children. {{(pid=28154) stop /usr/local/lib/python3.6/dist-packages/oslo_service/service.py:704}}
Jan 23 12:41:37 defiant-bionic neutron-server[28154]: INFO oslo_service.service [-] Waiting on 2 children to exit
The other, "surviving" RPC process exits properly, but the main process still ends with an alarm status because of the previously killed worker:
Jan 23 12:41:43 bionic neutron-server[28154]: INFO oslo_service.service [-] Child 28171 exited with status 0
Jan 23 12:42:37 bionic systemd[1]: <email address hidden>: Main process exited, code=killed, status=14/ALRM
Jan 23 12:42:37 bionic systemd[1]: <email address hidden>: Failed with result 'signal'.