Since 2022-08-18 the job has consistently failed during the "Check Keystone public endpoint status" step, after all 30 attempts are exhausted.
The keystone container logs don't contain errors. However, around the same time, the message "error: kex_exchange_identification: Connection closed by remote host" appears in the journal on the undercloud.
An analogous issue can be observed on periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-component-master-validation.
Trace:
------
2022-08-18 19:57:01.330006 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 6 retries left
2022-08-18 19:57:08.874973 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 5 retries left
2022-08-18 19:57:14.046036 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 4 retries left
2022-08-18 19:57:19.201638 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 3 retries left
2022-08-18 19:57:24.350340 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 2 retries left
2022-08-18 19:57:29.508238 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 1 retries left
2022-08-18 19:57:34.666224 | fa163e3e-f896-e3f5-81db-0000000034d7 | FATAL | Check Keystone public endpoint status | undercloud | item=neutron | error={"ansible_job_id": "345769691811.194096", "ansible_loop_var": "tripleo_keystone_resources_endpoint_async_result_item", "attempts": 30, "changed": false, "finished": 0, "results_file": "/root/.ansible_async/345769691811.194096", "started": 1, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": [], "tripleo_keystone_resources_endpoint_async_result_item": {"ansible_job_id": "345769691811.194096", "ansible_loop_var": "tripleo_keystone_resources_data", "changed": true, "failed": 0, "finished": 0, "results_file": "/root/.ansible_async/345769691811.194096", "started": 1, "tripleo_keystone_resources_data": {"key": "neutron", "value": {"endpoints": {"admin": "http://192.168.24.3:9696", "internal": "http://192.168.24.3:9696", "public": "http://192.168.24.3:9696"}, "region": "regionOne", "service": "network", "users": {"neutron": {"password": "zZWQAqqm4VQlQdSUmidoLxQvO", "roles": ["admin", "service"]}}}}}}
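The spacing between retries in the trace above can be checked with a short script. This is only a minimal sketch, not part of the CI tooling: the function names parse_retry_line and retry_intervals are hypothetical, and it assumes every task line uses the " | "-separated layout shown in the trace (timestamp, task id, status, task name, host, detail).

```python
from datetime import datetime

# Three lines copied from the trace above.
SAMPLE = [
    "2022-08-18 19:57:01.330006 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 6 retries left",
    "2022-08-18 19:57:08.874973 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 5 retries left",
    "2022-08-18 19:57:14.046036 | fa163e3e-f896-e3f5-81db-0000000034d7 | WAITING | Check Keystone public endpoint status | undercloud | 4 retries left",
]


def parse_retry_line(line):
    """Split one task line into its fields; return None if it does not match.

    Hypothetical helper: assumes the ' | '-separated layout seen in the trace.
    """
    parts = [part.strip() for part in line.split(" | ")]
    if len(parts) < 6:
        return None
    return {
        "time": datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S.%f"),
        "status": parts[2],
        "task": parts[3],
        "host": parts[4],
        "detail": parts[5],
    }


def retry_intervals(lines):
    """Return the seconds elapsed between consecutive task lines."""
    events = [event for event in map(parse_retry_line, lines) if event]
    return [
        round((later["time"] - earlier["time"]).total_seconds(), 1)
        for earlier, later in zip(events, events[1:])
    ]


if __name__ == "__main__":
    print(retry_intervals(SAMPLE))
```

Run against the full deploy log, this shows whether each retry returns after roughly the same delay or whether individual attempts hang for longer.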
Logs:
-----
https://logserver.rdoproject.org/openstack-component-validation/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-component-master-validation/25a8020/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
https://logserver.rdoproject.org/openstack-component-validation/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-validation-master/ed5ad21/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
https://logserver.rdoproject.org/openstack-component-validation/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-component-master-validation/25a8020/logs/undercloud/var/log/extra/journal_errors.txt.gz
The logs of the neutron container in the multinode job contain multiple errors raised by keystone.
Trace:
------
keystoneauth1.exceptions.discovery.DiscoveryFailure: Unable to find a version discovery document at http://192.168.24.3:6385, the service is unavailable or misconfigured. Required version range (any - any), version hack disabled.
Log:
----
https://logserver.rdoproject.org/openstack-component-validation/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-component-master-validation/25a8020/logs/undercloud/var/log/extra/errors.txt.gz