enable-ssh-admin.sh may fail on subsequent deployments

Bug #1766924 reported by Ben Nemec
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   High         Ben Nemec

Bug Description

I've been seeing a number of failures to run enable-ssh-admin.sh since config-download was turned on by default. For example:

(undercloud) [centos@undercloud-test ~]$ openstack overcloud deploy --templates --libvirt-type qemu -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/enable-designate.yaml
Waiting for messages on queue 'tripleo' with no timeout.
Removing the current plan files
Uploading new plan files
Plan updated.
Processing templates in the directory /tmp/tripleoclient-vWwnBA/tripleo-heat-templates
Deploying templates in the directory /tmp/tripleoclient-vWwnBA/tripleo-heat-templates

 Stack overcloud/91c46539-a0f1-4a81-a679-1d5d4682cd28 CREATE_COMPLETE

Deploying overcloud configuration
/usr/share/openstack-tripleo-heat-templates/deployed-server/scripts/enable-ssh-admin.sh failed.

There are two problems here: 1) the script failed, and 2) there is no indication of why.

Problem 1) is explained by ssh host keys. On a subsequent deploy, if an overcloud node is assigned an IP address that was used before, there may already be a host key for that address in known_hosts. If that stale key isn't removed first, enable-ssh-admin.sh will fail.

Problem 2) occurs because the info log level is off by default in tripleoclient, so the logged output of the script is swallowed. Because 1) is intermittent, this makes debugging very difficult: a subsequent run with --debug may not fail.
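
To illustrate problem 2), here is a minimal shell sketch of one way a caller could surface a failing script's output regardless of log level. The helper name run_and_report is hypothetical and is not part of tripleoclient or tripleo-heat-templates; it only shows the capture-and-print-on-failure pattern.

```shell
# Hypothetical helper: run a command quietly, but if it fails, print the
# captured output so the reason for the failure is not swallowed.
run_and_report() {
    local output status
    # Capture stdout and stderr together; record the exit status either way.
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    if [ "$status" -ne 0 ]; then
        echo "$1 failed (exit $status). Captured output:" >&2
        printf '%s\n' "$output" >&2
    fi
    return "$status"
}
```

Wrapping the enable-ssh-admin.sh invocation in something like this would show the "Host key verification failed." message on the first failing run, without needing to reproduce the failure under --debug.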

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/564273

Changed in tripleo:
assignee: nobody → Ben Nemec (bnemec)
status: Triaged → In Progress
Revision history for this message
Ben Nemec (bnemec) wrote :

For reference, the deploy output with the proposed patch looks like this:

(undercloud) [centos@undercloud-test ~]$ openstack overcloud deploy --templates --libvirt-type qemu -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/enable-designate.yaml
Waiting for messages on queue 'tripleo' with no timeout.
Removing the current plan files
Uploading new plan files
Plan updated.
Processing templates in the directory /tmp/tripleoclient-nOJIo8/tripleo-heat-templates
Deploying templates in the directory /tmp/tripleoclient-nOJIo8/tripleo-heat-templates

 Stack overcloud/7e385e35-ee0b-4511-846d-7a8ad07fd090 CREATE_COMPLETE

Deploying overcloud configuration
Starting workflow to create ssh admin on deployed servers.
SSH user: heat-admin
SSH key file: /home/centos/.ssh/id_rsa
Hosts: 9.1.1.5 9.1.1.16

Inserting TripleO short term key for 9.1.1.5
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:VKhdGz1XcspfpKbMy9r1q7MRe0vRDBdV36sqZfnSALE.
Please contact your system administrator.
Add correct host key in /home/centos/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/centos/.ssh/known_hosts:1
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
Inserting TripleO short term key for 9.1.1.16
Warning: Permanently added '9.1.1.16' (ECDSA) to the list of known hosts.
Starting ssh admin enablement workflow
id="4df7072b-9d09-4407-97d2-6e776d0b1321"
workflow_id="c92dbd76-2bee-427e-9ff0-728a1c8d301c"
workflow_name="tripleo.access.v1.enable_ssh_admin"
workflow_namespace=""
description="deployed server ssh admin creation"
task_execution_id="<none>"
state="RUNNING"
state_info="None"
created_at="2018-04-25 16:05:20"
updated_at="2018-04-25 16:05:20"
Waiting for the workflow execution to finish (id 4df7072b-9d09-4407-97d2-6e776d0b1321)....
Removing TripleO short term key from 9.1.1.5
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:VKhdGz1XcspfpKbMy9r1q7MRe0vRDBdV36sqZfnSALE.
Please contact your system administrator.
Add correct host key in /home/centos/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/centos/.ssh/known_hosts:1
ECDSA host key for 9.1.1.5 has changed and you have requested strict checking.
Host key verification failed.
/usr/share/openstack-tripleo-heat-temp...

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/564281

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/570932

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.openstack.org/570932
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=30c41f03ce9f701d86ce4bc05596ed089c5365e2
Submitter: Zuul
Branch: stable/queens

commit 30c41f03ce9f701d86ce4bc05596ed089c5365e2
Author: Ben Nemec <email address hidden>
Date: Wed Apr 25 11:35:06 2018 -0500

    Disable StrictHostKeyChecking when removing keys too

    Otherwise subsequent deploys can fail if an IP is reused.

    -o UserKnownHostsFile=/dev/null is also added to avoid ugly messages
    in the script output because of mismatched hostkeys that we're going
    to ignore anyway.

    Change-Id: If7eabd5df2ffc29d125c4f65f7fee1b0f4085817
    Closes-Bug: 1766924
    (cherry picked from commit 0edc4783486a16a8902a06ecdd69fc3ef3498671)
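
The change described in this commit message can be sketched roughly as follows. This is an illustration, not the actual diff to enable-ssh-admin.sh: the names SSH_BIN, SSH_OPTS and remote_run are hypothetical, and only the two -o options come from the commit message. Disabling host key checking gives up protection against man-in-the-middle attacks, so it presumably only makes sense in a case like this one, where the mismatched keys are going to be ignored anyway.

```shell
# Options named in the commit message. StrictHostKeyChecking=no stops ssh
# from refusing a host whose key has changed. UserKnownHostsFile=/dev/null
# keeps ssh from reading or updating the user's known_hosts file, so a stale
# entry there cannot trigger the "REMOTE HOST IDENTIFICATION HAS CHANGED"
# warning for a reused IP.
SSH_OPTS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"

# Hypothetical: the ssh binary is overridable so the wrapper can be dry-run.
SSH_BIN=${SSH_BIN:-ssh}

# Hypothetical wrapper: apply the same options to every ssh call, both when
# inserting and when removing the short term key.
remote_run() {
    # SSH_OPTS is intentionally unquoted so it splits into separate arguments.
    "$SSH_BIN" $SSH_OPTS "$@"
}
```

Setting SSH_BIN=echo prints the arguments instead of connecting, which is a quick way to check that the options are passed on every call.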

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/564281
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=0edc4783486a16a8902a06ecdd69fc3ef3498671
Submitter: Zuul
Branch: master

commit 0edc4783486a16a8902a06ecdd69fc3ef3498671
Author: Ben Nemec <email address hidden>
Date: Wed Apr 25 11:35:06 2018 -0500

    Disable StrictHostKeyChecking when removing keys too

    Otherwise subsequent deploys can fail if an IP is reused.

    -o UserKnownHostsFile=/dev/null is also added to avoid ugly messages
    in the script output because of mismatched hostkeys that we're going
    to ignore anyway.

    Change-Id: If7eabd5df2ffc29d125c4f65f7fee1b0f4085817
    Closes-Bug: 1766924

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.0.3

This issue was fixed in the openstack/tripleo-heat-templates 8.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 9.0.0.0b3

This issue was fixed in the openstack/tripleo-heat-templates 9.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-tripleoclient (master)

Change abandoned by Ben Nemec (<email address hidden>) on branch: master
Review: https://review.openstack.org/564273
Reason: This was superseded by the conversion to Python.
