validate-tempest role erroring trying to send mail

Bug #1806495 reported by Alex Schultz on 2018-12-03
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Sorin Sbarnea

Bug Description

http://logs.openstack.org/90/614290/2/gate/tripleo-ci-centos-7-standalone/5c77eaf/job-output.txt.gz#_2018-12-03_22_17_02_055488

2018-12-03 22:17:02.080981 | primary | Monday 03 December 2018 22:17:02 +0000 (0:00:01.164) 1:07:44.046 *******
2018-12-03 22:17:03.387782 | primary | fatal: [undercloud]: FAILED! => {
2018-12-03 22:17:03.388071 | primary | "changed": true,
2018-12-03 22:17:03.388534 | primary | "cmd": "LOG_PATH='90/614290/2/gate/tripleo-ci-centos-7-standalone/5c77eaf' ./tempestmail.py -c config.yaml --job \"tripleo-ci-centos-7-standalone\" --file \"/home/zuul/tempest.log\" --log-url \"http://logs.openstack.org\" --skip-file \"/home/zuul/tempestmail/tempest_skip_master.yml\"",
2018-12-03 22:17:03.388612 | primary | "delta": "0:00:00.592809",
2018-12-03 22:17:03.388706 | primary | "end": "2018-12-03 22:17:03.363678",
2018-12-03 22:17:03.388793 | primary | "rc": 1,
2018-12-03 22:17:03.388887 | primary | "start": "2018-12-03 22:17:02.770869"
2018-12-03 22:17:03.388920 | primary | }
2018-12-03 22:17:03.388951 | primary |
2018-12-03 22:17:03.388991 | primary | STDERR:
2018-12-03 22:17:03.389022 | primary |
2018-12-03 22:17:03.389159 | primary | 2018-12-03 22:17:03,024 DEBUG tempestmail.TempestMail: Loading configuration
2018-12-03 22:17:03.389289 | primary | 2018-12-03 22:17:03,132 DEBUG Mail: There are tests with failed result
2018-12-03 22:17:03.389396 | primary | 2018-12-03 22:17:03,132 DEBUG Mail: Rendering template
2018-12-03 22:17:03.389591 | primary | 2018-12-03 22:17:03,155 DEBUG urllib3.connectionpool: Starting new HTTP connection (1): tempest-sendmail.tripleo.org
2018-12-03 22:17:03.389677 | primary | Traceback (most recent call last):
2018-12-03 22:17:03.389792 | primary | File "./tempestmail.py", line 407, in <module>
2018-12-03 22:17:03.389856 | primary | sys.exit(main())
2018-12-03 22:17:03.389950 | primary | File "./tempestmail.py", line 403, in main
2018-12-03 22:17:03.390007 | primary | tmc.checkJobs()
2018-12-03 22:17:03.390107 | primary | File "./tempestmail.py", line 354, in checkJobs
2018-12-03 22:17:03.390225 | primary | send_mail.send_mail(self.args.job, last, self.args.output)
2018-12-03 22:17:03.390333 | primary | File "./tempestmail.py", line 187, in send_mail
2018-12-03 22:17:03.390437 | primary | self._send_mail_api(addresses, message, subject)
2018-12-03 22:17:03.390543 | primary | File "./tempestmail.py", line 177, in _send_mail_api
2018-12-03 22:17:03.390647 | primary | requests.post(self.config.api_server, data=data)
2018-12-03 22:17:03.390810 | primary | File "/usr/lib/python2.7/site-packages/requests/api.py", line 112, in post
2018-12-03 22:17:03.390935 | primary | return request('post', url, data=data, json=json, **kwargs)
2018-12-03 22:17:03.391076 | primary | File "/usr/lib/python2.7/site-packages/requests/api.py", line 58, in request
2018-12-03 22:17:03.391193 | primary | return session.request(method=method, url=url, **kwargs)
2018-12-03 22:17:03.391342 | primary | File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 512, in request
2018-12-03 22:17:03.391432 | primary | resp = self.send(prep, **send_kwargs)
2018-12-03 22:17:03.391576 | primary | File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 622, in send
2018-12-03 22:17:03.391674 | primary | r = adapter.send(request, **kwargs)
2018-12-03 22:17:03.391837 | primary | File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 513, in send
2018-12-03 22:17:03.391937 | primary | raise ConnectionError(e, request=request)
2018-12-03 22:17:03.392427 | primary | requests.exceptions.ConnectionError: HTTPConnectionPool(host='tempest-sendmail.tripleo.org', port=8080): Max retries exceeded with url: /api/v1.0/sendmail (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f436e514f50>: Failed to establish a new connection: [Errno 111] Connection refused',))
2018-12-03 22:17:03.392470 | primary |
2018-12-03 22:17:03.392503 | primary |
2018-12-03 22:17:03.392539 | primary | MSG:
2018-12-03 22:17:03.392569 | primary |
2018-12-03 22:17:03.392628 | primary | non-zero return code
2018-12-03 22:17:03.392674 | primary | ...ignoring

Sorin Sbarnea (ssbarnea) wrote :

This task can fail with other errors like the one from http://logs.openstack.org/21/625621/2/check/tripleo-ci-centos-7-containers-multinode-queens/56b19bf/job-output.txt.gz

DEBUG requests.packages.urllib3.connectionpool: Starting new HTTP connection (1): tempest-sendmail.tripleo.org
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

We need to track all of them in elastic-recheck. Sadly, I am not aware of any single query that matches a specific Ansible task failure, because of the way Ansible prints failures.

See my attempt to identify them: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22requests.exceptions.ConnectionError%3A%20HTTPConnectionPool(host%3D'tempest-sendmail.tripleo.org'%5C%22%20OR%20message%3A%5C%22DEBUG%20tempestmail.TempestMail%3A%20Loading%20configuration%5C%22
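
The URL-encoded query in the link above is hard to read. The following is a small sketch, using only the Python standard library, that decodes it into its two OR-ed clauses; the variable names are illustrative and not part of any TripleO tooling.

```python
from urllib.parse import unquote, urlsplit

# The dashboard link quoted in this comment, split across lines for readability.
DASHBOARD_URL = (
    "http://logstash.openstack.org/#dashboard/file/logstash.json?query="
    "message%3A%5C%22requests.exceptions.ConnectionError%3A%20"
    "HTTPConnectionPool(host%3D'tempest-sendmail.tripleo.org'%5C%22"
    "%20OR%20message%3A%5C%22DEBUG%20tempestmail.TempestMail%3A%20"
    "Loading%20configuration%5C%22"
)

# Everything after '#' is the URL fragment, so the query lives inside it.
fragment = urlsplit(DASHBOARD_URL).fragment
encoded_query = fragment.split("?query=", 1)[1]

# Percent-decode the query and print each OR-ed clause on its own line.
query = unquote(encoded_query)
for clause in query.split(" OR "):
    print(clause)
```

The first clause targets the `HTTPConnectionPool` form of the error only, so the "Connection reset by peer" variant would not be matched by it. The second clause appears to match any run that reaches the "Loading configuration" debug line, whether or not the mail step later fails.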

Fix proposed to branch: master
Review: https://review.openstack.org/625915

Changed in tripleo:
assignee: nobody → Sorin Sbarnea (ssbarnea)
status: Triaged → In Progress

Sorin Sbarnea (ssbarnea) wrote :

Please check https://review.openstack.org/625915 which should allow us to track its failure with elastic-recheck.

Sorin Sbarnea (ssbarnea) wrote :

The same task can also fail with another error:

DEBUG requests.packages.urllib3.connectionpool: Starting new HTTP connection (1): tempest-sendmail.tripleo.org
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

Having a consistent warning message would make it possible to track it over time.
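
One way to get such a consistent message is to catch the network error around the POST and log a fixed marker, whatever the underlying cause. The following is a minimal sketch under stated assumptions: `post_with_marker`, `MARKER`, and the logger name are hypothetical and are not taken from tempestmail.py or from the proposed change. The HTTP call is passed in as a callable so the sketch runs without a network or third-party packages.

```python
import logging

LOG = logging.getLogger("tempestmail.sketch")

# Hypothetical marker text, not taken from the actual change.
MARKER = "WARNING: tempest sendmail request failed"


def post_with_marker(post, url, data):
    """Call post(url, data=data); on a network error log MARKER, return False."""
    try:
        post(url, data=data)
    except OSError as exc:
        # requests.exceptions.RequestException derives from IOError (OSError),
        # so this also covers the ConnectionError raised by requests.post for
        # both "Connection refused" and "Connection reset by peer".
        LOG.warning("%s (%s)", MARKER, exc)
        return False
    return True
```

With the requests library installed, this could be called as `post_with_marker(requests.post, api_server, data)`. Both error variants quoted in this bug would then produce a log line beginning with the same marker, which a single query can match.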

Reviewed: https://review.openstack.org/625915
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=dd975856144bb2c8bfd1a97a0e38e96d7e871763
Submitter: Zuul
Branch: master

commit dd975856144bb2c8bfd1a97a0e38e96d7e871763
Author: Sorin Sbarnea <email address hidden>
Date: Tue Dec 18 13:45:13 2018 +0000

    Makes tempest sendmail failure easier to track

    This task can fail with various errors and this change enables us
    to spot its failure with a single LogStash query.

    In order to avoid a false-positive string match when bash is using
    verbose mode, we encoded the W char in the source, so a string match
    would happen only when printed and not on the source code.

    Change-Id: Icbb7f71aba2e9d2a4cf4e4c07f953fe3613e6707
    Partial-Bug: 1806495
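
The escaping trick described in the commit message can be illustrated in isolation. This is a sketch of the idea only, written in Python rather than the shell code the change actually touches, and the message text is a hypothetical placeholder.

```python
# Text as it would be typed in the source file: the first letter is written
# as the escape sequence \x57 instead of a literal "W".
source_text = r"\x57ARNING: tempest sendmail failed"  # hypothetical message

# Text as it appears once the escape sequence is interpreted at print time.
printed_text = source_text.encode("ascii").decode("unicode_escape")

# The string a log query would search for.
needle = "WARNING: tempest sendmail failed"

print(needle in source_text)   # False: echoed source does not match
print(needle in printed_text)  # True: the printed message matches
```

When the shell echoes the script source in verbose mode, the log contains the escaped form, so a query for the literal marker only matches the line that was actually printed.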

Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
milestone: stein-3 → stein-rc1
Changed in tripleo:
milestone: stein-rc1 → train-1