Comment 2 for bug 1840355

melanie witt (melwitt) wrote :

Just saw another hit of this bug. The logs for this particular case suggest that the problem was likely that sshd wasn't ready by the time the SSH connectivity check began:

2021-02-10 10:19:24.111 | OpenSSH_7.6p1 Ubuntu-4ubuntu0.3, OpenSSL 1.0.2n 7 Dec 2017
2021-02-10 10:19:24.111 | debug1: Reading configuration data /etc/ssh/ssh_config
2021-02-10 10:19:24.111 | debug1: /etc/ssh/ssh_config line 19: Applying options for *
2021-02-10 10:19:24.112 | debug1: Connecting to 172.24.5.232 [172.24.5.232] port 22.
2021-02-10 10:19:24.113 | debug1: connect to address 172.24.5.232 port 22: Connection refused
2021-02-10 10:19:24.113 | ssh: connect to host 172.24.5.232 port 22: Connection refused

I say that because the final attempt failed with "Connection refused" [1], which typically means nothing was listening on port 22 yet.

Anecdotally, I've seen VMs take several minutes before they were SSH-able, but we have to put a limit on how long we'll wait for sshd to be up and running, of course.

The first attempt was at:

2021-02-10 10:18:55.971 | + /opt/stack/new/grenade/projects/70_cinder/resources.sh:create:187 : timeout 30 ssh -v -o ConnectTimeout=10 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /opt/stack/save/cinder_key.pem cirros@172.24.5.232 'echo '\''I am a teapot'\'' > verify.txt'

That is about 30 seconds before the final attempt, and it looks like the code [2] reflects that as well:

    local timeleft=30
    while [[ $timeleft -gt 0 ]]; do
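
For illustration only, here is a minimal sketch of a deadline-based retry loop along the same lines. It is not the grenade code: the helper name retry_until and its arguments are hypothetical, and the 120 and 5 second values in the usage note are arbitrary placeholders rather than a recommendation.

```shell
# Hypothetical helper (not part of grenade): run a command repeatedly until
# it succeeds or an overall deadline passes.
# Usage: retry_until <timeout_seconds> <interval_seconds> <command> [args...]
retry_until() {
    local timeout_s interval_s deadline
    timeout_s=$1
    interval_s=$2
    shift 2
    # Absolute deadline in epoch seconds.
    deadline=$(( $(date +%s) + timeout_s ))
    while :; do
        # Run the command as given; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        # Give up once the deadline has passed.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
}
```

Assuming the same ssh options as in the log above, a longer wait could then look like `retry_until 120 5 ssh -o ConnectTimeout=10 -i /opt/stack/save/cinder_key.pem cirros@172.24.5.232 true`. The command is always tried at least once, and the loop only gives up after a failed attempt that ends past the deadline.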

[1] https://zuul.opendev.org/t/openstack/build/3c3c82a666884b9c8e5a2f9d74e2e32a/log/controller/logs/grenade.sh_log.txt#1817-1822
[2] https://github.com/openstack/grenade/blob/1cdbb71ffe1f41fbe46694888aa8ba7d2d7917c7/projects/70_cinder/resources.sh#L180-L181