This issue has come up a few times. Some observations:

- Many systemd services shut down in parallel, including syslog.service.
- After syslog.service (alias: syslog-ng.service) shuts down, we lose all further shutdown logs.
- Behavior generally differs between controllers and worker nodes, since they run a different mix of services and pods. syslog.service is stopped sooner on workers than on controllers.
- The services that we need to be able to support/debug are missing a dependency on 'syslog.service'.
- To guarantee that we get containerization cleanup logs during shutdown, we need to enforce the systemd dependency "After=syslog.service" for both kubelet.service and containerd.service. Since systemd stops units in the reverse of their start order, this makes syslog.service shut down after the containers.
- The shutdown order for containerized services is not ideal. Kubelet requires containerd and etcd to function properly, and it can flood the logs with errors if etcd is not providing service. To make containerization shutdown more graceful, we should add the following dependency for kubelet.service: "After=containerd.service etcd.service" (or specify the equivalent Before=kubelet.service on those units).
- The k8s-container-cleanup script can take too long to run; we don't always see its final line, "k8s-container-cleanup(127949): info : Stopping all container completed."
- Debugging containers is very difficult, since we often have only a container ID and nothing else identifying (e.g., pod, container name, namespace). Log scraping is challenging without easier cross-referencing.

Recommendations:

* Improve K8S systemd service order and logging; append to the following 3 files:

  ./stx/config-files/containerd-config/*/containerd-stx-override.conf
  (i.e., /etc/systemd/system/containerd.service.d/containerd-stx-override.conf)
    After=syslog.service

  ./stx-puppet/puppet-manifests/src/modules/platform/templates/kube-stx-override.conf.erb
  (i.e., /etc/systemd/system/kubelet.service.d/kube-stx-override.conf)
    After=containerd.service etcd.service
    After=syslog.service

  docker-stx-override.conf:
    After=syslog.service

  Alternatively, express the equivalent of the kube-stx-override.conf dependencies in files that do not require puppet-manifest generation:

  containerd-stx-override.conf:
    After=syslog.service
    Before=kubelet.service

  etcd-override.conf:
    After=syslog.service
    Before=kubelet.service

* Change the k8s-container-cleanup script to shut down containers in parallel.
* Improve the support/debugging of Kubernetes: add one-line log details to the k8s-container-cleanup script, with more specific identifying info per container during shutdown. Here is a suggested prototype (this info is provided by 'crictl inspect'):

  2023-05-08T16:31:26.000 compute-0 k8s-container-cleanup(63687): info : pid: 27858 cgroupsPath: /k8s-infra/kubepods/besteffort/podf52c2b5d-8856-4948-b4fa-773aa3b2e568/4501d53cc43629d6a61b24bc9431e0e49565989ac69f9adee021a4aa1bfb31a8 id: 4501d53cc43629d6a61b24bc9431e0e49565989ac69f9adee021a4aa1bfb31a8 container.name: kube-proxy pod.name: kube-proxy-w7w8s pod.namespace: kube-system pod.uid: f52c2b5d-8856-4948-b4fa-773aa3b2e568 logPath: /var/log/pods/kube-system_kube-proxy-w7w8s_f52c2b5d-8856-4948-b4fa-773aa3b2e568/kube-proxy/37.log state: CONTAINER_RUNNING
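
The two script changes recommended above (parallel shutdown, plus a one-line identifying log per container) could be prototyped roughly as follows. This is a Python sketch for illustration only, not the existing cleanup script. The function names (describe_container, crictl_inspect, crictl_stop, stop_all) and the worker count are hypothetical. The JSON key paths (status.labels with the io.kubernetes.* keys, info.pid, info.runtimeSpec.linux.cgroupsPath) are assumptions about what 'crictl inspect' prints with containerd and should be checked against real output on a node.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def describe_container(inspect: dict) -> str:
    """Build a one-line summary from parsed `crictl inspect` JSON.

    Key paths are assumed (see lead-in); missing values are shown as "-".
    """
    status = inspect.get("status", {})
    info = inspect.get("info", {})
    labels = status.get("labels", {})
    linux = info.get("runtimeSpec", {}).get("linux", {})
    fields = [
        ("pid", info.get("pid")),
        ("cgroupsPath", linux.get("cgroupsPath")),
        ("id", status.get("id")),
        ("container.name", labels.get("io.kubernetes.container.name")),
        ("pod.name", labels.get("io.kubernetes.pod.name")),
        ("pod.namespace", labels.get("io.kubernetes.pod.namespace")),
        ("pod.uid", labels.get("io.kubernetes.pod.uid")),
        ("logPath", status.get("logPath")),
        ("state", status.get("state")),
    ]
    return " ".join(f"{key}: {'-' if val is None else val}" for key, val in fields)


def crictl_inspect(container_id: str) -> dict:
    """Run `crictl inspect <id>` and parse its JSON output."""
    result = subprocess.run(
        ["crictl", "inspect", container_id],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def crictl_stop(container_id: str) -> bool:
    """Run `crictl stop <id>`; return True if the command succeeded."""
    result = subprocess.run(["crictl", "stop", container_id], capture_output=True)
    return result.returncode == 0


def stop_all(container_ids, stop=crictl_stop, max_workers=8):
    """Stop containers concurrently; return {container_id: succeeded}.

    `stop` is injectable so the logic can be exercised without crictl.
    """
    ids = list(container_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(stop, ids))
    return dict(zip(ids, results))
```

In a real cleanup pass, the summary line for each container would be logged before calling stop_all, so the identifying details reach syslog even if the stop itself stalls.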