Master TLS job failed while deploying overcloud with Could not find versioned identity endpoints when attempting to authenticate

Bug #1861782 reported by chandan kumar
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Cédric Jeanneret

Bug Description

The master TLS job (FS039) is failing at overcloud deploy, step 4, with the following error:

https://logserver.rdoproject.org/53/703953/4/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/cd6cb15/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

2020-02-02 10:33:25 | TASK [tripleo_keystone_resources : Create default domain] **********************
2020-02-02 10:33:25 | Sunday 02 February 2020 10:33:23 +0000 (0:00:00.291) 0:37:00.727 *******
2020-02-02 10:33:25 | fatal: [undercloud]: FAILED! => {
2020-02-02 10:33:25 | "changed": false,
2020-02-02 10:33:25 | "rc": 1
2020-02-02 10:33:25 | }
2020-02-02 10:33:25 |
2020-02-02 10:33:25 | MSG:
2020-02-02 10:33:25 |
2020-02-02 10:33:25 | MODULE FAILURE
2020-02-02 10:33:25 | See stdout/stderr for the exact error
2020-02-02 10:33:25 |
2020-02-02 10:33:25 |
2020-02-02 10:33:25 | MODULE_STDERR:
2020-02-02 10:33:25 |
2020-02-02 10:33:25 | No handlers could be found for logger "keystoneauth.identity.generic.base"
2020-02-02 10:33:25 | Traceback (most recent call last):
2020-02-02 10:33:25 | File "<stdin>", line 114, in <module>
2020-02-02 10:33:25 | File "<stdin>", line 106, in _ansiballz_main
2020-02-02 10:33:25 | File "<stdin>", line 49, in invoke_module
2020-02-02 10:33:25 | File "/tmp/ansible_os_keystone_domain_payload_4SSG1u/__main__.py", line 185, in <module>
2020-02-02 10:33:25 | File "/tmp/ansible_os_keystone_domain_payload_4SSG1u/__main__.py", line 145, in main
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/openstack/cloud/_identity.py", line 883, in search_domains
2020-02-02 10:33:25 | return self.list_domains(**filters)
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/openstack/cloud/_identity.py", line 856, in list_domains
2020-02-02 10:33:25 | data = self._identity_client.get(
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/openstack/cloud/_identity.py", line 32, in _identity_client
2020-02-02 10:33:25 | 'identity', min_version=2, max_version='3.latest')
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/openstack/cloud/openstackcloud.py", line 406, in _get_versioned_client
2020-02-02 10:33:25 | if adapter.get_endpoint():
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 282, in get_endpoint
2020-02-02 10:33:25 | return self.session.get_endpoint(auth or self.auth, **kwargs)
2020-02-02 10:33:25 | File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 1225, in get_endpoint
2020-02-02 10:33:29 | return auth.get_endpoint(self, **kwargs)Overcloud configuration failed.
2020-02-02 10:33:29 |
2020-02-02 10:33:29 | File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint
2020-02-02 10:33:29 | allow_version_hack=allow_version_hack, **kwargs)
2020-02-02 10:33:29 | File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 271, in get_endpoint_data
2020-02-02 10:33:29 | service_catalog = self.get_access(session).service_catalog
2020-02-02 10:33:29 | File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 134, in get_access
2020-02-02 10:33:29 | self.auth_ref = self.get_auth_ref(session)
2020-02-02 10:33:29 | File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 206, in get_auth_ref
2020-02-02 10:33:29 | self._plugin = self._do_create_plugin(session)
2020-02-02 10:33:29 | File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 161, in _do_create_plugin
2020-02-02 10:33:29 | 'auth_url is correct. %s' % e)
2020-02-02 10:33:29 | keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Unable to establish connection to https://overcloud.ooo.test:13000: HTTPSConnectionPool(host='overcloud.ooo.test', port=13000): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f53665dbc50>: Failed to establish a new connection: [Errno -2] Name or service not known',))
2020-02-02 10:33:29 |

This happens because the undercloud does not have the overcloud IP in its /etc/hosts and FreeIPA does not have a record for it, so the undercloud cannot resolve the domain name used by the overcloud public endpoint.

The /etc/hosts entry is generally added during the post overcloud deploy.sh script, which is later than the step that fails here. This needs to be fixed.
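
The resolution failure described above can be checked from the undercloud before the deploy reaches step 4. This is a minimal sketch, not TripleO code: the helper name `can_resolve` is hypothetical, and the hostname and port in the usage note are simply the ones shown in the traceback.

```python
import socket


def can_resolve(hostname, port=13000):
    """Return True if the local resolver (/etc/hosts or DNS) knows hostname."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Same class of failure as "[Errno -2] Name or service not known".
        return False
```

Calling `can_resolve("overcloud.ooo.test")` on an affected undercloud would be expected to return False until the /etc/hosts entry or a DNS record exists.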

Changed in tripleo:
assignee: nobody → Cédric Jeanneret (cjeanner)
Cédric Jeanneret (cjeanner) wrote :

Currently testing a patch that injects the hosts entries at the beginning of the deploy; it is a modification of tht/common/deploy-steps.j2.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/705634

Changed in tripleo:
status: Confirmed → In Progress
Cédric Jeanneret (cjeanner) wrote :

So it needs only one entry in /etc/hosts: the public endpoint and its VIP.

We need to rework my patch a bit, especially since at some point it will conflict with some cleanup introduced by https://review.opendev.org/#/c/704451/
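
The single entry mentioned above (public endpoint name plus its VIP) could be added idempotently along these lines. This is only a sketch under stated assumptions: `ensure_hosts_entry` is a hypothetical helper, not the code in the proposed patch, and it works on the file content as a string rather than writing /etc/hosts itself.

```python
def ensure_hosts_entry(hosts_text, ip, hostname):
    """Return hosts_text with an 'ip hostname' line, added only if missing."""
    for line in hosts_text.splitlines():
        # Ignore comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return hosts_text  # already present, nothing to do
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + "{} {}\n".format(ip, hostname)
```

For example, `ensure_hosts_entry(text, "10.0.0.5", "overcloud.ooo.test")` uses the VIP and name quoted elsewhere in this report; running it twice leaves the content unchanged.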

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/706242

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/706318

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-2 → ussuri-3
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/706242
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=4697301965351558331787a819ae60040885ba0a
Submitter: Zuul
Branch: master

commit 4697301965351558331787a819ae60040885ba0a
Author: Cédric Jeanneret <email address hidden>
Date: Thu Feb 6 09:43:46 2020 +0100

    Also use enabled_networks

    Since we need to generate /etc/hosts on the undercloud as well[1], we
    have to use the enabled_networks in addition to role_networks, else we might
    lack some entries on the overcloud nodes, namely the internalapi part.

    It also makes the generated file a bit less "full of blank", using the
    "-" flag in some specific places.

    [1] https://review.opendev.org/705634

    Change-Id: Ie942e25ed3501cd4988597ea5dd8d1a17eceec67
    Related-Bug: #1861782

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/705634
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=96c40f89beb80de611a4ea75f28b8b43007d0ac6
Submitter: Zuul
Branch: master

commit 96c40f89beb80de611a4ea75f28b8b43007d0ac6
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 4 09:40:28 2020 +0100

    Generate /etc/hosts early on both under and overcloud

    Prior to this patch, the /etc/hosts was generated only on the overcloud
    nodes, leading to some issues when it comes to TLS-Everywhere, as raised
    in associated bug.

    Depends-On: https://review.opendev.org/706242
    Change-Id: I836ab1a23c8aea35c0cea54d0765c7313a4b9038
    Closes-Bug: 1861782

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-ansible (master)

Change abandoned by Luke Short (<email address hidden>) on branch: master
Review: https://review.opendev.org/706318

Cédric Jeanneret (cjeanner) wrote :

Apparently, this issue also happens on Train; starting the backport process.

tags: added: train-backport-potential
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/712617

Abdallah Yasin (abdysn) wrote :

Hi, this bug has started to show up on Train.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/712619

Abdallah Yasin (abdysn) wrote :

Logs of the error on Train: http://paste.openstack.org/show/790574/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-ansible (stable/train)

Change abandoned by Cédric Jeanneret (Tengu) (<email address hidden>) on branch: stable/train
Review: https://review.opendev.org/712617
Reason: apparently the code added by this patch already exists, more or less, so we can't just backport it "as is".

It is possible that master is, once again, broken and that the overcloud nodes lack the internalapi parts again...

Harald Jensås (harald-jensas) wrote :
Abdallah Yasin (abdysn) wrote :

The wanted record is: "10.0.0.5 overcloud.localdomain"

Harald Jensås (harald-jensas) wrote :

>
> the wanted record is: "10.0.0.5 overcloud.localdomain"
>

Which is present as either "10.0.0.5 overcloud.localdomain" or "10.0.0.5 overcloud.ooo.test" on Compute, Controller and Undercloud in both the master and Train jobs, AFAICT?
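
A check like the one described above can be scripted against the collected /etc/hosts files from each node. This is a rough sketch: both function names are hypothetical, and the per-node file contents are assumed to have been fetched from the job logs beforehand.

```python
def hosts_lookup(hosts_text, hostname):
    """Return the first address mapped to hostname in hosts_text, or None."""
    for line in hosts_text.splitlines():
        # Ignore comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return fields[0]
    return None


def nodes_missing_record(hosts_by_node, ip, names):
    """Return the sorted node names whose hosts file maps none of names to ip."""
    missing = []
    for node, text in sorted(hosts_by_node.items()):
        if not any(hosts_lookup(text, name) == ip for name in names):
            missing.append(node)
    return missing
```

Passing `ip="10.0.0.5"` and `names=["overcloud.localdomain", "overcloud.ooo.test"]` accepts either of the two record forms quoted in this comment; an empty result means every node has one of them.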

Harald Jensås (harald-jensas) wrote :

https://review.opendev.org/703953

tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039 SUCCESS in 3h 03m 57s (non-voting)

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/712619
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=8b8194b17145cdf62b5f34ad20bfd5dd359e77bc
Submitter: Zuul
Branch: stable/train

commit 8b8194b17145cdf62b5f34ad20bfd5dd359e77bc
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 4 09:40:28 2020 +0100

    Generate /etc/hosts early on both under and overcloud

    Prior to this patch, the /etc/hosts was generated only on the overcloud
    nodes, leading to some issues when it comes to TLS-Everywhere, as raised
    in associated bug.

    Change-Id: I836ab1a23c8aea35c0cea54d0765c7313a4b9038
    Closes-Bug: 1861782
    (cherry picked from commit 96c40f89beb80de611a4ea75f28b8b43007d0ac6)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.2.0

This issue was fixed in the openstack/tripleo-heat-templates 12.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.
