Just saw another hit of this bug. From the logs, the problem in this particular case was likely that sshd wasn't ready by the time the SSH connectivity check began:
2021-02-10 10:19:24.111 | OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017
2021-02-10 10:19:24.111 | debug1: Reading configuration data /etc/ssh/ssh_config
2021-02-10 10:19:24.111 | debug1: /etc/ssh/ssh_config line 19: Applying options for *
2021-02-10 10:19:24.112 | debug1: Connecting to 172.24.5.232 [172.24.5.232] port 22.
2021-02-10 10:19:24.113 | debug1: connect to address 172.24.5.232 port 22: Connection refused
2021-02-10 10:19:24.113 | ssh: connect to host 172.24.5.232 port 22: Connection refused
because it says "Connection refused" [1].
Anecdotally, I've seen VMs take several minutes before they were SSH-able, but we have to put a limit on how long we'll wait for sshd to be up and running, of course.
First attempt was at:
2021-02-10 10:18:55.971 | + /opt/stack/new/grenade/projects/70_cinder/resources.sh:create:187 : timeout 30 ssh -v -o ConnectTimeout=10 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /opt/stack/save/cinder_key.pem cirros@172.24.5.232 'echo '\''I am a teapot'\'' > verify.txt'
which is about 30 seconds before the final attempt, and it looks like the code [2] reflects that as well:
local timeleft=30
while [[ $timeleft -gt 0 ]]; do
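
For illustration only, a bounded retry in the same spirit as the loop quoted above could be sketched as follows. This is not the grenade code: `retry_until` and its arguments are hypothetical names, and the budget and interval are values one would have to tune. It measures elapsed time with bash's built-in `SECONDS` variable, so time spent inside a slow attempt counts against the budget too.

```bash
#!/usr/bin/env bash

# retry_until BUDGET_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of grenade): re-run COMMAND until it
# succeeds or the time budget is used up.
# Returns 0 on success, 1 if the budget ran out.
retry_until() {
    local budget=$1 interval=$2
    shift 2
    # SECONDS is a bash built-in that counts seconds since shell start.
    local deadline=$((SECONDS + budget))
    while true; do
        # Always make at least one attempt, even with a zero budget.
        if "$@"; then
            return 0
        fi
        # Give up if another sleep would take us past the deadline.
        if (( SECONDS + interval > deadline )); then
            return 1
        fi
        sleep "$interval"
    done
}

# Example call, using the host and key path from the log above; the
# 300-second budget and 5-second interval are arbitrary assumptions:
#
#   retry_until 300 5 ssh -o ConnectTimeout=10 \
#       -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no \
#       -i /opt/stack/save/cinder_key.pem cirros@172.24.5.232 true
```

A "Connection refused" comes back almost immediately, so with a loop like this the size of the budget, rather than the number of attempts, determines how long a slow-booting guest gets before the check is declared failed.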
[1] https://zuul.opendev.org/t/openstack/build/3c3c82a666884b9c8e5a2f9d74e2e32a/log/controller/logs/grenade.sh_log.txt#1817-1822
[2] https://github.com/openstack/grenade/blob/1cdbb71ffe1f41fbe46694888aa8ba7d2d7917c7/projects/70_cinder/resources.sh#L180-L181