the task, "Check the dns information provided by the virthost" fails

Bug #1701292 reported by Honza Pokorny
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: wes hayutin
Milestone: pike-3

Bug Description

During a devmode run against rdo cloud:

$ bash devmode.sh --no-gate --ovb

TASK [undercloud-deploy : Check the dns information provided by the virthost] **
task path: /home/hpokorny/.quickstart/usr/local/share/ansible/roles/undercloud-deploy/tasks/create-scripts.yml:3
Thursday 29 June 2017 11:51:44 -0300 (0:00:00.110) 0:21:55.873 *********
fatal: [undercloud]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host localhost port 22: Connection refused\r\n", "unreachable": true}

PLAY RECAP *********************************************************************
localhost : ok=26 changed=18 unreachable=0 failed=0
undercloud : ok=51 changed=29 unreachable=1 failed=0

Tags: quickstart
Honza Pokorny (hpokorny)
Changed in tripleo:
importance: Undecided → High
wes hayutin (weshayutin) wrote :

It looks like the connection to rdo-cloud is not quite fast enough.
Please try w/ https://review.openstack.org/#/c/471886/ which adjusts the ansible config and the ssh settings.

Honza Pokorny (hpokorny) wrote :

Even with the retries, the issue still occurs.

Ronelle Landy (rlandy) wrote :

The dns server in the defaults is out of commission.
You'll need the dns servers specified in https://review.openstack.org/#/c/474781/

Also, ssh_args = -o ServerAliveInterval=30 is required if ssh is timing out.

wes hayutin (weshayutin) wrote :

This is an issue w/ rdo-cloud atm..

Everyone is either hitting timeouts or other ssh errors.

Examples:

ssh: connect to host localhost port 22: Connection refused\r\n", "unreachable": true

 UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.\r\nPermission denied (publickey,gssapi-keyex,gssapi-with-mic,password).\r\n", "unreachable": true}

fatal: [undercloud]: FAILED! => {"failed": true, "msg": "Timeout (12s) waiting for privilege escalation prompt: "}
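
The errors above mix two different failures: a refused connection on port 22 and a slow or stalled session. A minimal sketch of a probe that tells them apart follows. The function name `probe_tcp` and the 5 second timeout are illustrative assumptions, not part of quickstart or Ansible; it uses only the Python standard library.

```python
import socket


def probe_tcp(host, port=22, timeout=5.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout', or another error.

    Hypothetical helper for triage only; it does not authenticate or speak SSH.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on the port (matches "Connection refused").
        return "refused"
    except socket.timeout:
        # No answer within the timeout (slow network or filtered port).
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and similar.
        return "error: {}".format(exc)


if __name__ == "__main__":
    # The failing task tried to reach localhost on port 22.
    print(probe_tcp("localhost", 22))
```

A "refused" result points at the target address or sshd itself, while "timeout" is more consistent with the slow rdo-cloud network described in this thread.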

Changed in tripleo:
status: New → Invalid
Bogdan Dobrelya (bogdando) wrote :

hint: using the ansible.cfg with

[defaults]
timeout = 90

helps to mitigate those "Timeout (12s) waiting for privilege escalation prompt: "
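
For anyone applying both mitigations from this thread, here is a rough sketch of a check that an ansible.cfg carries them. It assumes the usual Ansible layout, where `timeout` lives in the `[defaults]` section and `ssh_args` in `[ssh_connection]`; the section placement of `ssh_args` is not stated in this report. The function name `check_ansible_cfg` and the sample text are hypothetical.

```python
import configparser

# Sample config combining the two settings mentioned in this bug.
SAMPLE_CFG = """
[defaults]
timeout = 90

[ssh_connection]
ssh_args = -o ServerAliveInterval=30
"""


def check_ansible_cfg(text):
    """Return a list of warnings about the timeout-related settings in an ansible.cfg."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    warnings = []
    # A singular [default] section is a plain, unknown section and is not read by Ansible.
    if parser.has_section("default"):
        warnings.append("found [default]; Ansible reads [defaults]")
    if not parser.has_option("defaults", "timeout"):
        warnings.append("no timeout set in [defaults]")
    if not parser.has_option("ssh_connection", "ssh_args"):
        warnings.append("no ssh_args set in [ssh_connection]")
    elif "ServerAliveInterval" not in parser.get("ssh_connection", "ssh_args"):
        warnings.append("ssh_args does not set ServerAliveInterval")
    return warnings


print(check_ansible_cfg(SAMPLE_CFG))  # prints []
```

Note that setting `ssh_args` replaces Ansible's default SSH arguments rather than adding to them, so any connection-sharing options that are still wanted need to be listed alongside `ServerAliveInterval`.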

wes hayutin (weshayutin) wrote :

The error you hit was related to the network; however, there is also a bug sitting behind that task. It is resolved w/ https://review.openstack.org/#/c/478273/

wes hayutin (weshayutin)
Changed in tripleo:
status: Invalid → Confirmed
assignee: nobody → wes hayutin (weshayutin)
summary: - devmode - undercloud unreachable during dns check
+ the task, "Check the dns information provided by the virthost" fails
Changed in tripleo:
milestone: none → pike-3
Ronelle Landy (rlandy) wrote :

https://review.openstack.org/#/c/478273/ has merged.

We have a periodic job running now to check devmode on OVB.
With 'ssh_args = -o ServerAliveInterval=30' set in ansible.cfg, devmode on OVB has been passing since 07/01:
https://thirdparty-logs.rdoproject.org/jenkins-tq-gate-devmode-master-ovb-rdocloud-public-bond-32/

Ronelle Landy (rlandy) wrote :

Confirmed with honza that devmode on OVB is working now:
<honza> rlandy: yes, definitely working now
closing this out

Changed in tripleo:
status: Confirmed → Fix Released