systemd sync script for sidecar containers is unable to spawn new processes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Daniel Alvarez | | |
Bug Description
Currently, the sync script searches the list of running processes and, if the target process is not among them, starts it.
This is the logic:
IFS=$'\n'
for LINE in $(cat {{ tripleo_
NETNS=$(echo $LINE | awk '{ print $1 }')
IFS=$' ' ARGS=$(echo $LINE | sed -e "s|$NETNS ||" | xargs)
# TODO(emilien) investigate if we should rather run docker/podman ps instead of ps on the host
if ! ps -e -o pid,command | grep "$(echo $NETNS | sed 's|^[^-]*\-||')" | grep -v grep &> /dev/null; then
fi
done
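To make the parsing step concrete, here is a minimal sketch of how a single line is split by the commands above. The sample line is hypothetical (this report does not show the real file contents); the only thing taken from the script is that the first field is the namespace name and the rest is the command line. The sed expression uses a plain `-`, which is equivalent to the `\-` form in the script.

```shell
# Hypothetical input line: namespace name first, then the command to run.
LINE='qdhcp-0a1b2c3d dnsmasq   --no-hosts   --no-resolv'

# First whitespace-separated field is the namespace name.
NETNS=$(echo "$LINE" | awk '{ print $1 }')

# Everything after the namespace name; xargs collapses repeated spaces.
ARGS=$(echo "$LINE" | sed -e "s|$NETNS ||" | xargs)

# The string the script greps for: the name with its prefix up to the
# first hyphen removed.
PATTERN=$(echo "$NETNS" | sed 's|^[^-]*-||')

echo "NETNS=$NETNS"
echo "ARGS=$ARGS"
echo "PATTERN=$PATTERN"
```

Note that only the part of the namespace name after the first hyphen is used for matching, so any process whose command line contains that string counts as a match.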
However, the command invoked by the Neutron agent may itself still show up in the 'ps' output, which makes the sync script think the process is already running and skip starting it. This can delay the start of the sidecar container until the next iteration (1 minute later). For the metadata container (for both ML2/OVS and ML2/OVN) that may already be too late, because cloud-init in the instance has given up by then.
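The false positive can be reproduced offline with a small sketch. The function below mirrors the grep pipeline from the script but reads the process list from a string instead of calling ps. The function name `is_running`, the process list, and all paths in it are hypothetical and only illustrate the situation described above: the command invoked by the agent is alive, the sidecar is not.

```shell
# is_running PS_OUTPUT NETNS
# Same matching logic as the sync script, applied to a captured process list.
is_running() {
    _pattern=$(echo "$2" | sed 's|^[^-]*-||')
    echo "$1" | grep "$_pattern" | grep -v grep > /dev/null 2>&1
}

# Hypothetical process list: only the invoking command carries the
# namespace identifier; no sidecar process exists yet.
PS_SAMPLE='  101 /usr/bin/python3 /usr/bin/hypothetical-agent
  202 /bin/bash /hypothetical/wrapper haproxy -f /hypothetical/0a1b2c3d.conf'

if is_running "$PS_SAMPLE" "ovnmeta-0a1b2c3d"; then
    echo "skip: looks running"
else
    echo "start sidecar"
fi
```

With this sample the check succeeds on the invoking command's own line, so the sketch takes the "skip" branch even though nothing is serving the namespace.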
Example of the 'ps' output when the issue is being hit:
Mar 19 11:12:31 compute-0 sync[92924]: 92914 /usr/bin/python3 /usr/bin/
Changed in tripleo:
importance: Undecided → Critical
milestone: none → ussuri-3
tags: added: train-backport-potential
Fix proposed to branch: master
Review: https://review.opendev.org/713852