After the controller is first unlocked, the controller manifest is failing to apply. The failure happens because the docker daemon cannot be started:
2020-04-08T06:55:57.588 Debug: 2020-04-08 06:55:57 +0000 Exec[perform systemctl daemon reload for docker proxy](provider=posix): Executing 'systemctl daemon-reload'
2020-04-08T06:55:57.590 Debug: 2020-04-08 06:55:57 +0000 Executing: 'systemctl daemon-reload'
2020-04-08T06:55:57.592 Notice: 2020-04-08 06:55:57 +0000 /Stage[main]/Platform::Docker::Config/Exec[perform systemctl daemon reload for docker proxy]: Triggered 'refresh' from 1 events
2020-04-08T06:55:57.594 Info: 2020-04-08 06:55:57 +0000 /Stage[main]/Platform::Docker::Config/Exec[perform systemctl daemon reload for docker proxy]: Scheduling refresh of Service[docker]
2020-04-08T06:55:57.596 Debug: 2020-04-08 06:55:57 +0000 /Stage[main]/Platform::Docker::Config/Exec[perform systemctl daemon reload for docker proxy]: The container Class[Platform::Docker::Config] will propagate my refresh event
2020-04-08T06:55:57.598 Debug: 2020-04-08 06:55:57 +0000 Executing: '/usr/bin/systemctl is-active docker'
2020-04-08T06:55:57.600 Debug: 2020-04-08 06:55:57 +0000 Executing: '/usr/bin/systemctl is-enabled docker'
2020-04-08T06:55:57.602 Debug: 2020-04-08 06:55:57 +0000 Executing: '/usr/bin/systemctl unmask docker'
2020-04-08T06:55:57.643 Debug: 2020-04-08 06:55:57 +0000 Executing: '/usr/bin/systemctl start docker'
2020-04-08T06:55:57.816 Debug: 2020-04-08 06:55:57 +0000 Runing journalctl command to get logs for systemd start failure: journalctl -n 50 --since '5 minutes ago' -u docker --no-pager
2020-04-08T06:55:57.819 Debug: 2020-04-08 06:55:57 +0000 Executing: 'journalctl -n 50 --since '5 minutes ago' -u docker --no-pager'
2020-04-08T06:55:57.828 Error: 2020-04-08 06:55:57 +0000 Systemd start for docker failed!
daemon.log shows dockerd and containerd being restarted, but dockerd then gets stuck and logs the following errors thousands of times:
2020-04-08T06:55:57.472 controller-0 dockerd[100339]: info time="2020-04-08T06:55:57.472875804Z" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=moby
2020-04-08T06:55:57.472 controller-0 dockerd[100339]: info time="2020-04-08T06:55:57.472870255Z" level=error msg="failed to get event" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
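The "connection refused" in these errors suggests the socket file exists but nothing is listening on it. A minimal sketch for confirming that on the controller, assuming Python 3 is available there (the helper name probe_unix_socket is made up for illustration and is not part of docker or containerd):

```python
import errno
import os
import socket


def probe_unix_socket(path, timeout=1.0):
    """Classify a unix socket path as 'missing', 'refused', 'accepting', or 'error: ...'.

    Hypothetical diagnostic helper, not part of docker or containerd.
    """
    if not os.path.exists(path):
        # No socket file at all: the service never created it, or it was removed.
        return "missing"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "accepting"
    except ConnectionRefusedError:
        # The file exists but no process is listening, which matches the
        # "connect: connection refused" reported by dockerd.
        return "refused"
    except OSError as exc:
        # Anything else (permissions, timeout, ...) is reported by errno name.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()


if __name__ == "__main__":
    # Socket path taken from the dockerd log lines above.
    print(probe_unix_socket("/run/containerd/containerd.sock"))
```

A result of "refused" while the dockerd errors are scrolling would point at containerd not listening (stopped, or still restarting) rather than at a missing socket file.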
The containerd/dockerd startup was modified recently:
https://review.opendev.org/#/c/716911
https://review.opendev.org/#/c/715593
https://review.opendev.org/#/c/717044
Paul or Bob should take a look at this LP. Note that the failing puppet exec is only hit if an http_proxy/https_proxy is configured. I'm not sure whether that makes a difference.