Creating the default plan can fail when communicating with swift. It should be retried

Bug #1634195 reported by Dougal Matthews
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Ryan Brady
Milestone: (none)

Bug Description

I have seen this happen in a few CI runs; they typically pass when retried. This is a transient issue, and it seems common enough to be worth pinning down and handling in the code.

2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor [-] Failed to run action [action_ex_id=c22f98ba-86e8-4214-9621-b6e04d4dbcc3, action_cls='<class 'mistral.actions.action_factory.UploadTemplatesAction'>', attributes='{}', params='{u'container': u'overcloud'}']
 ''
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor Traceback (most recent call last):
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/mistral/engine/default_executor.py", line 90, in run_action
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor result = action.run()
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 57, in run
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor self.container)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/tripleo_common/utils/tarball.py", line 39, in tarball_extract_to_swift_container
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor headers={'X-Detect-Content-Type': 'true'}
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1796, in put_object
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor response_dict=response_dict)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1647, in _retry
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor service_token=self.service_token, **kwargs)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1278, in put_object
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor conn.putrequest(path, headers=headers, data=data)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 447, in putrequest
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor return self.request('PUT', full_path, data, headers, files)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 437, in request
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor files=files, **self.requests_args)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 420, in _request
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor return self.request_session.request(*arg, **kwarg)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor resp = self.send(prep, **send_kwargs)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor r = adapter.send(request, **kwargs)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 434, in send
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor r = low_conn.getresponse(buffering=True)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor response.begin()
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib64/python2.7/httplib.py", line 444, in begin
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor version, status, reason = self._read_status()
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor raise BadStatusLine(line)
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor BadStatusLine: ''
2016-10-16 22:11:56.363 9863 ERROR mistral.engine.default_executor
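
The traceback above ends in a BadStatusLine raised by the HTTP layer, which is the kind of transient failure the description suggests retrying. A minimal sketch of that idea follows, written for Python 3, where the exception lives in http.client rather than httplib. The names call_with_retry, retries, and FlakyUpload are hypothetical and only illustrate the approach; this is not the tripleo-common or swiftclient code.

```python
import http.client  # Python 3 home of BadStatusLine (httplib on Python 2.7)


def call_with_retry(func, retries=3):
    """Call func(), retrying up to `retries` times on BadStatusLine.

    Hypothetical helper for illustration only.
    """
    attempt = 0
    while True:
        try:
            return func()
        except http.client.BadStatusLine:
            attempt += 1
            if attempt > retries:
                raise  # give up: the failure no longer looks transient


class FlakyUpload:
    """Stand-in for an upload that fails a fixed number of times first."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise http.client.BadStatusLine("")
        return "uploaded"
```

With FlakyUpload(failures=2) and retries=3, the third call succeeds and the caller never sees the error. If the failures outnumber the allowed retries, the last BadStatusLine is re-raised unchanged.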

wes hayutin (weshayutin) wrote :

Seeing this in the RDO pipeline

https://thirdparty-logs.rdoproject.org/jenkins-tripleo-quickstart-periodic-newton-delorean-ha-12/undercloud/var/log/mistral/executor.log.gz#_2016-10-17_17_24_15_341

This is most likely a blocker for import, though I'm not sure because it doesn't reproduce 100% of the time.

Matt Young (halcyondude) wrote :

We have reproduced this on both minimal and HA deployments. It also seems to be correlated with memory pressure on the undercloud (UC).

John Trowbridge (trown)
tags: added: newton-backport-potential
John Trowbridge (trown)
Changed in tripleo:
importance: Medium → High
Ryan Brady (rbrady)
Changed in tripleo:
assignee: nobody → Ryan Brady (rbrady)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/388285

Changed in tripleo:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/389124

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by Ryan Brady (<email address hidden>) on branch: master
Review: https://review.openstack.org/388285
Reason: Abandoning any efforts to change this workflow to call the swift actions in mistral.actions.openstack directly. There is currently no way to pass args to the swift client connection via a workflow.

Ryan Brady (rbrady)
Changed in tripleo:
status: In Progress → Invalid
Changed in tripleo:
status: Invalid → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/389124
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=5c575970cf48152c95f780a85e6a2baae7d29f09
Submitter: Jenkins
Branch: master

commit 5c575970cf48152c95f780a85e6a2baae7d29f09
Author: Ryan Brady <email address hidden>
Date: Thu Oct 20 07:14:27 2016 -0400

    Sets defaults in swift connection related to retries

    This patch sets the following defaults in the swift connection to
    manage retries.

    retries – Number of times to retry the request before failing
    starting_backoff – initial delay between retries (seconds)
    max_backoff – maximum delay between retries (seconds)

    Change-Id: I26fcb994f91309eedb9d8ccc17939bc9f4c6a116
    Closes-Bug: #1634195
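
To make the three settings in the commit message concrete, here is a small sketch of how a retry count, a starting backoff, and a maximum backoff could combine into a sequence of delays. It assumes the delay doubles after each failed attempt and is capped at max_backoff. The doubling is an assumption for illustration and is not stated in this report, and backoff_delays is a hypothetical helper, not part of swiftclient or tripleo-common.

```python
def backoff_delays(retries, starting_backoff, max_backoff):
    """Return the delay (in seconds) to wait before each retry.

    Assumes exponential backoff: the delay doubles after every
    failed attempt and never exceeds max_backoff.
    """
    delays = []
    delay = starting_backoff
    for _ in range(retries):
        delays.append(min(delay, max_backoff))  # cap each delay
        delay = delay * 2                       # assumed doubling
    return delays
```

For example, backoff_delays(5, 1, 8) gives delays of 1, 2, 4, 8, and 8 seconds, so the cap keeps later retries from waiting arbitrarily long.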

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.5.0

This issue was fixed in the openstack/tripleo-common 5.5.0 release.
