[kolla] periodic: container build job can fail during push

Bug #1844697 reported by Sorin Sbarnea
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| kolla | Fix Released | Medium | Mark Goddard | |
| Queens | Fix Committed | Medium | Mark Goddard | |
| Rocky | Fix Committed | Medium | Mark Goddard | |
| Stein | Fix Committed | Medium | Mark Goddard | |
| Train | Fix Released | Medium | Mark Goddard | |
| tripleo | Fix Released | Medium | Unassigned | |

Bug Description

```
2019-09-19 01:13:00 | ERROR:kolla.common.utils.glance-api:Unknown error when pushing
2019-09-19 01:13:00 | Traceback (most recent call last):
2019-09-19 01:13:00 | File "/home/zuul/workspace/venv_build/lib/python2.7/site-packages/kolla/image/build.py", line 309, in run
2019-09-19 01:13:00 | self.push_image(image)
2019-09-19 01:13:00 | File "/home/zuul/workspace/venv_build/lib/python2.7/site-packages/kolla/image/build.py", line 335, in push_image
2019-09-19 01:13:00 | for response in self.dc.push(image.canonical_name, **kwargs):
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/docker/api/client.py", line 334, in _stream_helper
2019-09-19 01:13:00 | for chunk in json_stream(self._stream_helper(response, False)):
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/docker/utils/json_stream.py", line 66, in split_buffer
2019-09-19 01:13:00 | for data in stream_as_text(stream):
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/docker/utils/json_stream.py", line 22, in stream_as_text
2019-09-19 01:13:00 | for data in stream:
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/docker/api/client.py", line 340, in _stream_helper
2019-09-19 01:13:00 | data = reader.read(1)
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/urllib3/response.py", line 459, in read
2019-09-19 01:13:00 | raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
2019-09-19 01:13:00 | File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2019-09-19 01:13:00 | self.gen.throw(type, value, traceback)
2019-09-19 01:13:00 | File "/usr/lib/python2.7/site-packages/urllib3/response.py", line 365, in _error_catcher
2019-09-19 01:13:00 | raise ReadTimeoutError(self._pool, None, 'Read timed out.')
2019-09-19 01:13:00 | ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
2019-09-19 01:13:00 |INFO:kolla.common.utils:Attempt number: 2 to run task: PushTask(glance-api)
2019-09-19 01:13:00 |INFO:kolla.common.utils.glance-api:Trying to push the image
2019-09-19 01:13:40 |INFO:kolla.common.utils:Attempt number: 3 to run task: PushTask(glance-api)
2019-09-19 01:13:40 |INFO:kolla.common.utils.glance-api:Trying to push the image
2019-09-19 01:13:41 |INFO:kolla.common.utils:Attempt number: 4 to run task: PushTask(glance-api)
2019-09-19 01:13:41 |INFO:kolla.common.utils.glance-api:Trying to push the image
2019-09-19 01:13:42 |INFO:kolla.common.utils:Attempt number: 1 to run task: PushTask(nova-scheduler)
2019-09-19 01:13:42 |INFO:kolla.common.utils.nova-scheduler:Trying to push the image
```

http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/51d361e/logs/build.log.txt.gz

Apparently the code has no retry mechanism. I think it should retry at least 3 times within 10 minutes before failing, so that we avoid failing the entire job just because an external service is restarted or network connectivity is a bit flaky.

The retry should be implemented around https://github.com/openstack/kolla/blob/master/kolla/image/build.py#L305-L324
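
To make the request concrete, here is a minimal sketch of a bounded retry with an overall time budget (3 attempts within 10 minutes). It is not kolla's code: the `retry` helper, its parameters, and the 30-second delay are all hypothetical names and values chosen for illustration, and it assumes Python 3 for `time.monotonic`.

```python
import time


def retry(operation, attempts=3, deadline_seconds=600, delay_seconds=30,
          sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds, attempts run out, or time is up.

    Hypothetical helper for illustration only; not part of kolla.
    """
    start = clock()
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:  # real code should catch narrower errors
            last_error = exc
        # Give up if this was the last attempt, or if waiting again
        # would exceed the overall time budget.
        out_of_time = clock() - start + delay_seconds > deadline_seconds
        if attempt == attempts or out_of_time:
            break
        sleep(delay_seconds)
    raise last_error
```

A caller would wrap the push in a zero-argument callable, for example `retry(lambda: push(image))`, where `push` stands for whatever function performs the upload.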

Tags: alert
Sorin Sbarnea (ssbarnea)
tags: added: alert
summary: - periodic: container build job can fail during push
+ [koilla] periodic: container build job can fail during push
Mark Goddard (mgoddard)
summary: - [koilla] periodic: container build job can fail during push
+ [kolla] periodic: container build job can fail during push
Zoli Caplovic (zcaplovic) wrote :

It seems to be a glitch: all push attempts before and after this error ended with "Pushed successfully". The RDO registry logs have been checked and no issue was found.

I recommend retrying the push.

Mark Goddard (mgoddard)
Changed in kolla:
importance: Undecided → Medium
status: New → Triaged
Sorin Sbarnea (ssbarnea)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (master)

Fix proposed to branch: master
Review: https://review.opendev.org/683200

Changed in kolla:
assignee: nobody → Mark Goddard (mgoddard)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (master)

Reviewed: https://review.opendev.org/683200
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=f8ded663891f79c289156ee61d4e02197d80ce7a
Submitter: Zuul
Branch: master

commit f8ded663891f79c289156ee61d4e02197d80ce7a
Author: Mark Goddard <email address hidden>
Date: Thu Sep 19 17:20:50 2019 +0100

    Fix retries when pushing images

    Currently the retry mechanism is broken for pushing because the image
    state gets set to an error state, and is never reset to 'built'. This
    prevents the PushTask from setting success to True.

    This change sets the image state to 'built' if the push succeeds,
    ensuring it overrides any previous failures.

    Change-Id: I93fc0e383da8fec6b3ca31f8094321c2a0c3af71
    Closes-Bug: #1844697
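
The failure mode described in this commit message can be reproduced with a small model. This is only a sketch under stated assumptions: the `Image`, `PushTask`, `run_with_retries` and `make_flaky_push` names, the `"push_error"` status string, and the `reset_on_success` switch are hypothetical and do not mirror kolla's actual classes. It shows why a retry that pushes successfully still reports failure when the state is never reset to 'built'.

```python
class Image:
    """Hypothetical image record; starts in the 'built' state."""

    def __init__(self, name):
        self.name = name
        self.status = "built"


class PushTask:
    """Hypothetical task whose success is derived from the image state."""

    def __init__(self, image, push, reset_on_success):
        self.image = image
        self.push = push  # callable that raises on failure
        self.reset_on_success = reset_on_success
        self.success = False

    def run(self):
        try:
            self.push(self.image)
        except Exception:
            self.image.status = "push_error"
        else:
            if self.reset_on_success:
                # The fix: a successful push overrides earlier failures.
                self.image.status = "built"
        self.success = self.image.status == "built"


def run_with_retries(task, retries=3):
    """Run the task once, then up to `retries` more times until it succeeds."""
    for _ in range(retries + 1):
        task.run()
        if task.success:
            break
    return task.success


def make_flaky_push(failures):
    """Return a push callable that fails `failures` times, then succeeds."""
    remaining = [failures]

    def push(image):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise IOError("Read timed out.")

    return push


broken = PushTask(Image("glance-api"), make_flaky_push(1), reset_on_success=False)
fixed = PushTask(Image("glance-api"), make_flaky_push(1), reset_on_success=True)
print(run_with_retries(broken), run_with_retries(fixed))
```

In this model the task without the reset runs all four attempts and still ends unsuccessful, which is consistent with attempts 2 to 4 following each other quickly in the log above.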

Changed in kolla:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/683397

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/683399

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/683406

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/rocky)

Reviewed: https://review.opendev.org/683399
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=7d4b06f9b0fe9709385fd7af3fe39b5baa865a95
Submitter: Zuul
Branch: stable/rocky

commit 7d4b06f9b0fe9709385fd7af3fe39b5baa865a95
Author: Mark Goddard <email address hidden>
Date: Thu Sep 19 17:20:50 2019 +0100

    Fix retries when pushing images

    Currently the retry mechanism is broken for pushing because the image
    state gets set to an error state, and is never reset to 'built'. This
    prevents the PushTask from setting success to True.

    This change sets the image state to 'built' if the push succeeds,
    ensuring it overrides any previous failures.

    Change-Id: I93fc0e383da8fec6b3ca31f8094321c2a0c3af71
    Closes-Bug: #1844697
    (cherry picked from commit f8ded663891f79c289156ee61d4e02197d80ce7a)

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/stein)

Reviewed: https://review.opendev.org/683397
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=f79df0f25e096f984c1c83b7048979ca2b167555
Submitter: Zuul
Branch: stable/stein

commit f79df0f25e096f984c1c83b7048979ca2b167555
Author: Mark Goddard <email address hidden>
Date: Thu Sep 19 17:20:50 2019 +0100

    Fix retries when pushing images

    Currently the retry mechanism is broken for pushing because the image
    state gets set to an error state, and is never reset to 'built'. This
    prevents the PushTask from setting success to True.

    This change sets the image state to 'built' if the push succeeds,
    ensuring it overrides any previous failures.

    Change-Id: I93fc0e383da8fec6b3ca31f8094321c2a0c3af71
    Closes-Bug: #1844697
    (cherry picked from commit f8ded663891f79c289156ee61d4e02197d80ce7a)

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/queens)

Reviewed: https://review.opendev.org/683406
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=eeb2b1ec1f26d867583cc8dc9f8050270e124538
Submitter: Zuul
Branch: stable/queens

commit eeb2b1ec1f26d867583cc8dc9f8050270e124538
Author: Mark Goddard <email address hidden>
Date: Thu Sep 19 17:20:50 2019 +0100

    Fix retries when pushing images

    Currently the retry mechanism is broken for pushing because the image
    state gets set to an error state, and is never reset to 'built'. This
    prevents the PushTask from setting success to True.

    This change sets the image state to 'built' if the push succeeds,
    ensuring it overrides any previous failures.

    Change-Id: I93fc0e383da8fec6b3ca31f8094321c2a0c3af71
    Closes-Bug: #1844697

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 6.2.4

This issue was fixed in the openstack/kolla 6.2.4 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 9.0.0.0rc1

This issue was fixed in the openstack/kolla 9.0.0.0rc1 release candidate.

wes hayutin (weshayutin)
Changed in tripleo:
status: New → Triaged
milestone: none → ussuri-2
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 7.1.0

This issue was fixed in the openstack/kolla 7.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 8.0.2

This issue was fixed in the openstack/kolla 8.0.2 release.

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-2 → ussuri-3
wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → Fix Released