[tripleo] Provide a tag to the container that will be used to kill it

Bug #1991000 reported by Rodolfo Alonso
This bug affects 1 person
Affects  Status        Importance  Assigned to     Milestone
neutron  Fix Released  Wishlist    Rodolfo Alonso
tripleo  New           Undecided   Unassigned

Bug Description

TripleO uses containers to run the different processes. Some of these processes (certain Neutron agents) also spawn long-lived child processes that run in parallel to the main one. These child processes are:
* dibbler
* dnsmasq
* haproxy
* keepalived
* neutron-keepalived-state-change
* radvd

TripleO replaces those processes with a set of wrapper scripts. When Neutron calls one of these scripts, it actually starts a sidecar container running the needed process. When the agent needs to stop the process, a kill script [1] replaces the "kill" CLI call. This kill script uses the PID of the process to find the container ID and then sends the needed signal (HUP, TERM, KILL).

To find the container ID, the script reads "/proc/$PID/cgroup" and parses the output. This is a fragile method because it depends on the format of this file's contents.
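
The fragility of the "/proc/$PID/cgroup" approach can be illustrated with a minimal Python sketch. It is not the actual kill script: the function names are hypothetical, and it assumes the container ID shows up as a 64-character hex string inside a cgroup path (for example "libpod-<id>.scope"). Any other cgroup layout makes the lookup fail.

```python
import re
from typing import Optional

# Assumption: the container ID appears as 64 hex characters somewhere in a
# cgroup path, e.g. ".../libpod-<id>.scope". Other layouts will not match.
_CONTAINER_ID_RE = re.compile(r"([0-9a-f]{64})")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the first 64-hex-character ID found in cgroup text, or None."""
    for line in cgroup_text.splitlines():
        match = _CONTAINER_ID_RE.search(line)
        if match:
            return match.group(1)
    return None


def container_id_for_pid(pid: int) -> Optional[str]:
    """Read /proc/<pid>/cgroup and try to extract a container ID from it."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return container_id_from_cgroup(handle.read())
```

When the cgroup path does not embed the ID in the expected form, "container_id_from_cgroup" returns None and the caller has no way to locate the container.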

This bug proposes to spawn the containers with a label:
  $ podman run --label neutron_tag="container_UUID"

This container UUID could be the "ProcessManager.uuid" itself. This UUID is unique and identifies the container. If it is passed both when the container is created and when it is killed, the kill script can use it to find that specific container:
  $ podman ps --filter "label=neutron_tag=container_UUID"
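
As a rough sketch of how the two commands above fit together, the following Python helpers build the matching argument lists for the same tag. The helper names, the "--detach" and "--quiet" flags, and the image and command arguments are illustrative assumptions, not part of the proposal; only the "neutron_tag" label and the "podman run" and "podman ps --filter" calls come from the text above.

```python
from typing import List

LABEL_KEY = "neutron_tag"  # label name taken from the proposal in this report


def podman_run_args(tag: str, image: str, command: List[str]) -> List[str]:
    """Build a 'podman run' command line that labels the container with tag."""
    return ["podman", "run", "--detach",
            "--label", f"{LABEL_KEY}={tag}", image, *command]


def podman_find_args(tag: str) -> List[str]:
    """Build a 'podman ps' command line that prints IDs of containers with tag."""
    return ["podman", "ps", "--quiet",
            "--filter", f"label={LABEL_KEY}={tag}"]
```

Because both command lines are derived from the same tag, the lookup no longer depends on the PID or on the cgroup file format.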

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/neutron/kill-script

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
Cédric Jeanneret (cjeanner) wrote :

Small note: to pass the container_UUID to the script that runs the actual container service [1], we can use environment variables. That way we won't break the standard service startup, and the wrapper scripts can read the data directly.

This would also allow us to *name* the container using the UUID [2]. Using the UUID instead of the NETNS may be easier and may allow us to remove some logic from the script [3].

[1] https://opendev.org/openstack/puppet-tripleo/src/branch/master/templates/neutron
[2] for instance: https://opendev.org/openstack/puppet-tripleo/src/branch/master/templates/neutron/haproxy.epp#L17
[3] https://opendev.org/openstack/puppet-tripleo/src/branch/master/templates/neutron/haproxy.epp#L41

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

I talked to Brent Eagles, who told me to include a safe fallback for this new feature: if the container is not found, execute "kill -kill $PID" on the host to kill the process that is running the sidecar container.
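
A minimal Python sketch of this fallback logic, under stated assumptions: the function names and the injectable "finder", "run" and "kill" parameters are hypothetical (added so the logic can be exercised without podman), and the "neutron_tag" label follows the proposal in the bug description. It is not the TripleO kill script.

```python
import os
import signal
import subprocess
from typing import Callable, Optional


def find_container(tag: str) -> Optional[str]:
    """Return the ID of the first running container labelled with tag, if any."""
    result = subprocess.run(
        ["podman", "ps", "--quiet", "--filter", f"label=neutron_tag={tag}"],
        capture_output=True, text=True, check=False)
    ids = result.stdout.split()
    return ids[0] if ids else None


def kill_sidecar(tag: str, pid: int, sig: str = "TERM",
                 finder: Callable[[str], Optional[str]] = find_container,
                 run: Callable = subprocess.run,
                 kill: Callable = os.kill) -> str:
    """Signal the labelled container; fall back to SIGKILL on the host PID."""
    container_id = finder(tag)
    if container_id:
        # Preferred path: deliver the requested signal to the container.
        run(["podman", "kill", "--signal", sig, container_id], check=False)
        return "container"
    # Fallback path: no container found, so kill the host process directly.
    kill(pid, signal.SIGKILL)
    return "pid"
```

The return value only reports which path was taken; a real script would probably also log it.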

That will ensure that this process/container is killed.

Changed in neutron:
importance: Undecided → Low
importance: Low → Wishlist
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/865018

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/865018
Committed: https://opendev.org/openstack/neutron/commit/3d575f8bd066ce2eb46353a49a8c6850ba9e4387
Submitter: "Zuul (22348)"
Branch: master

commit 3d575f8bd066ce2eb46353a49a8c6850ba9e4387
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Nov 14 05:26:52 2022 +0100

    Add an env variable "PROCESS_TAG" in ``ProcessManager``

    Added a new environment variable "PROCESS_TAG" in ``ProcessManager``.
    This environment variable could be read by the process executed and
    is unique per process. This environment variable can be used to tag
    the running process; for example, a container manager can use this
    tag to mark a container.

    This feature will be used by TripleO to identify the running containers
    with a unique tag. This will make the "kill" process easier; it will
    be needed just to find the container running with this tag.

    Closes-Bug: #1991000
    Change-Id: I234c661720a8b1ceadb5333181890806f79dc21a
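
On the consumer side, a wrapper that starts the sidecar container could read "PROCESS_TAG" from its environment and turn it into a container label. The following Python sketch shows one way to do that; the function name is hypothetical and the "neutron_tag" label key is taken from the bug description, not from the merged patch.

```python
import os
from typing import List, Mapping, Optional


def label_args_from_env(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return podman '--label' arguments derived from PROCESS_TAG, if set."""
    env = os.environ if environ is None else environ
    tag = env.get("PROCESS_TAG")
    if not tag:
        # No tag available: start the container without the label.
        return []
    return ["--label", f"neutron_tag={tag}"]
```

Returning an empty list when the variable is missing keeps the wrapper usable with Neutron versions that do not export "PROCESS_TAG".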

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.0.0.0rc1

This issue was fixed in the openstack/neutron 22.0.0.0rc1 release candidate.
