Undercloud ssh check timeout might be too short

Bug #1649272 reported by Jiří Stránský
This bug affects 1 person
Affects: tripleo-quickstart
Status: Fix Released
Importance: Low
Assigned to: Jiří Stránský
Milestone: (none)

Bug Description

I just encountered an error during a quickstart run: the wait for ssh timed out. I'm not sure what went wrong, since I don't think my virt host is particularly slow, and when I checked, ssh was available and I could log in just fine via `ssh -i /var/tmp/containers/id_rsa_undercloud root@192.168.23.35`. This makes me think that the 5-minute timeout might be just a bit too short.

TASK [setup/undercloud : Wait until ssh is available on undercloud node] *******
task path: /root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:227
Monday 12 December 2016 13:44:08 +0100 (0:00:00.048) 0:31:03.003 *******
fatal: [hostname.sanitized.example.org]: FAILED! => {"changed": false, "elapsed": 300, "failed": true, "msg": "Timeout when waiting for 192.168.23.35:22"}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
hostname.sanitized.example.org : ok=127 changed=65 unreachable=0 failed=1
localhost : ok=10 changed=5 unreachable=0 failed=0

Monday 12 December 2016 13:49:09 +0100 (0:05:00.997) 0:36:04.001 *******
===============================================================================
setup/undercloud : convert image -------------------------------------- 812.01s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/convert_image.yml:17
setup/undercloud : run update ----------------------------------------- 593.84s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/update_image.yml:14
setup/undercloud : Wait until ssh is available on undercloud node ----- 301.00s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:227
setup/undercloud : Get image ------------------------------------------ 110.26s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/fetch_image.yml:81
parts/libvirt : Install packages for libvirt --------------------------- 44.46s
/root/quickstart_containers/tripleo-quickstart/roles/parts/libvirt/tasks/main.yml:30
setup/undercloud : Perform selinux relabel on undercloud image --------- 41.49s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:134
setup/undercloud : Inject additional images ---------------------------- 36.81s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:60
setup/undercloud : Upload undercloud volume to storage pool ------------ 35.54s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:176
setup/undercloud : Get image ------------------------------------------- 34.89s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/fetch_image.yml:81
setup/undercloud : Get undercloud vm ip address ------------------------ 22.62s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:214
setup/undercloud : upload repos on the images -------------------------- 14.50s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/inject_repos.yml:53
setup/undercloud : Inject undercloud ssh public key to appliance -------- 4.75s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:92
setup/undercloud : download delorean repos for master ------------------- 4.45s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/inject_repos.yml:4
setup/undercloud : Resize the undercloud image using qemu-image resize --- 3.72s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/convert_image.yml:12
setup/overcloud : Define overcloud vms ---------------------------------- 3.37s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/overcloud/tasks/main.yml:82
setup/undercloud : generate image specific update script ---------------- 2.88s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/update_image.yml:8
setup/undercloud : Get actual md5 checksum of image --------------------- 2.56s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/fetch_image.yml:92
setup/user : Generate ssh keys ------------------------------------------ 2.51s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/user/tasks/main.yml:19
parts/libvirt : If ipxe-roms-qemu is not installed, install a known good version --- 2.50s
/root/quickstart_containers/tripleo-quickstart/roles/parts/libvirt/tasks/main.yml:20
setup/undercloud : Copy instackenv.json to appliance -------------------- 2.32s
/root/quickstart_containers/tripleo-quickstart/roles/libvirt/setup/undercloud/tasks/main.yml:74

Jiří Stránský (jistr) wrote :

This happened to me twice in succession; bumping the timeout from 5 to 10 minutes seemed to fix it. I'll post a patch.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart (master)

Fix proposed to branch: master
Review: https://review.openstack.org/409813

Changed in tripleo-quickstart:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart (master)

Reviewed: https://review.openstack.org/409813
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart/commit/?id=f49bc86ffef8ad4b6985c3e3957d20f6a5d0342d
Submitter: Jenkins
Branch: master

commit f49bc86ffef8ad4b6985c3e3957d20f6a5d0342d
Author: Jiri Stransky <email address hidden>
Date: Mon Dec 12 15:30:04 2016 +0100

    Increase undercloud ssh timeout from 5 to 10 minutes

    I encountered an error during a quickstart run twice in succession --
    time out on waiting for undercloud VM ssh to become available. I managed
    to log in just fine via `ssh -i $WORKDIR/id_rsa_undercloud
    root@$UNDERCLOUD_IP`. This makes me think that the timeout of 5 minutes
    might be just a bit too short for some environments. Doubling the
    timeout to 10 minutes got quickstart past this step on my environment.

    Change-Id: I1a74d996579368cbffa2dfb3a9cab45c0f365a5e
    Closes-Bug: #1649272
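
For readers who want to picture the change, here is a minimal sketch of what a 10-minute ssh wait could look like as an Ansible task. It is an illustration only: the actual task at roles/libvirt/setup/undercloud/tasks/main.yml is not reproduced in this report, and the variable name `undercloud_ip` is an assumption made for the example. The only value taken from the report is the timeout, raised from 300 to 600 seconds.

```yaml
# Hypothetical sketch, not the task from the tripleo-quickstart repository.
- name: Wait until ssh is available on undercloud node
  wait_for:
    host: "{{ undercloud_ip }}"   # assumed variable name, for illustration
    port: 22                      # the failing run waited on 192.168.23.35:22
    state: started                # succeed once the port accepts connections
    timeout: 600                  # 10 minutes; the failing run gave up at "elapsed": 300
```

A fixed value like this is the simplest form of the fix. Exposing the timeout as a role variable would be one way to let slower hosts raise it without a code change, though this report does not say whether the merged patch does so.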

Changed in tripleo-quickstart:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-quickstart 2.0.0

This issue was fixed in the openstack/tripleo-quickstart 2.0.0 release.
