testenv-client is failing with error "Couldn't retrieve env" on rdo cloud jobs

Bug #1728033 reported by Attila Darazs
This bug affects 1 person
Affects   Status         Importance   Assigned to       Milestone
tripleo   Fix Released   Critical     Gabriele Cerami

Bug Description

Our periodic OVB jobs are failing with:

2017-10-27 12:05:15.176 | +(/opt/stack/new/tripleo-ci/toci_gate_test.sh:224): ./testenv-client -b 192.168.103.254:4730 -t 13800 --envsize 4 --ucinstance 018fb204-ab50-45fc-a716-0419e06072f1 --net-iso multi-nic -- ./toci_quickstart.sh
2017-10-27 12:09:16.988 | +(/opt/stack/new/tripleo-ci/toci_gate_test.sh:218): sleep 1200
2017-10-27 12:09:16.988 | 2017-10-27 12:05:15,175 - testenv-client - INFO - Received job : Couldn't retrieve env
2017-10-27 12:09:16.988 | 2017-10-27 12:05:15,175 - testenv-client - ERROR - Couldn't retrieve env
2017-10-27 12:09:17.038 | ERROR: the main setup script run by this job failed - exit code: 2

Example from: https://logs.rdoproject.org/openstack-periodic-4hr/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset002-master-upload/4da05f1/console.txt.gz#_2017-10-27_12_09_16_988

This might be an error with the te-broker on the tenant; Gabriele is looking into it.

Tags: ci
Gabriele Cerami (gcerami) wrote :

Problems with a nova node in rdocloud led to some nodes being shut down. One of these nodes was the copy of the rh1 mirror-server for rdocloud, which also acted as the DNS server for the te-broker server.

From the logs:

Failed to discover available identity versions when contacting https://phx2.cloud.rdoproject.org:13000/v2.0. Attempting to parse version from URL.
Unable to establish connection to https://phx2.cloud.rdoproject.org:13000/v2.0/tokens: HTTPSConnectionPool(host='phx2.cloud.rdoproject.org', port=13000): Max retries exceeded with url: /v2.0/tokens (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x1d2afd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))

I changed /etc/resolv.conf on the te-broker server in rdocloud to break this dependency.
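
To see which resolvers a host depends on, the nameserver entries can be read out of resolv.conf. This is a minimal sketch, assuming the standard resolv.conf layout; the function name `list_nameservers` is hypothetical and nothing here reflects the actual contents of the file on te-broker.

```python
def list_nameservers(resolv_conf_text):
    """Return the nameserver addresses found in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        # A resolver entry looks like: "nameserver <address>"
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers
```

It could be called as `list_nameservers(open("/etc/resolv.conf").read())`; if the only address returned belongs to a single host, name resolution stops when that host goes down.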

[root@te-broker log]# ping phx2.cloud.rdoproject.org
PING phx2.cloud.rdoproject.org (38.145.32.1) 56(84) bytes of data.
64 bytes from phx2.cloud.rdoproject.org (38.145.32.1): icmp_seq=1 ttl=63 time=0.250 ms
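
The same check can be scripted instead of using ping. The sketch below relies only on Python's standard `socket.getaddrinfo`, which raises `socket.gaierror` ("Name or service not known") when a name cannot be resolved, as in the log above. The helper name `can_resolve` is hypothetical and not part of testenv-client or te-broker.

```python
import socket


def can_resolve(hostname, port=None):
    """Return (True, addresses) if hostname resolves, else (False, error text)."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # Resolution failed, e.g. "[Errno -2] Name or service not known".
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses
```

For example, `can_resolve("phx2.cloud.rdoproject.org", 13000)` would report whether the identity endpoint's hostname resolves from the machine where it is run.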

te-broker should work correctly now.

Changed in tripleo:
milestone: none → queens-2
tags: removed: alert
tags: removed: promotion-blocker
Changed in tripleo:
assignee: nobody → Gabriele Cerami (gcerami)
Gabriele Cerami (gcerami) wrote :

te-broker is working correctly

Changed in tripleo:
status: Triaged → Fix Released