Container image prepare should adjust the number of workers and the exponential backoff interval when retrying connections

Bug #1889372 reported by Bogdan Dobrelya
This bug affects 1 person
Affects: tripleo
Status: Opinion
Importance: High
Assigned to: Unassigned

Bug Description

We should adjust the number of image prepare workers and the exponential backoff parameters. I analyzed the log [0], comparing the connection reset counts per worker with the number of times rate limiting was triggered.

[0] https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log

From 2020-07-29 03:55:31,379 to 2020-07-29 04:20:36,584:

Conn Reset Counts by a Worker PID:
     11 58412
      9 58413
     12 58415
     11 58417
Rate limit triggered (times): 32

And for an example 5-second interval, from 03:55:31,379 to 03:55:36,110:

Conn Reset Counts by a Worker PID:
      3 58412
      2 58413
      3 58415
      3 58417
which is too high!

The log snippet that corresponds to that interval:

 03:55:31,379 58412 DEBUG urllib3.connectionpool [ ] Resetting dropped connection: mirror.bhs1.ovh.opendev.org
 03:55:31,838 58415 DEBUG urllib3.connectionpool [ ] Resetting dropped connection: mirror.bhs1.ovh.opendev.org
 03:55:31,905 58417 DEBUG urllib3.connectionpool [ ] Resetting dropped connection: mirror.bhs1.ovh.opendev.org
 03:55:33,448 58412 INFO tripleo_common.image.image_uploader [ ] Non-2xx: id 729d9f98887eccbd14f78f8f645b341e2c97a95c, status 429, reason Too Many Requests, text {
 03:55:33,450 58412 ERROR tripleo_common.image.image_export [ ] [tripleotraincentos8/centos-binary-keepalived] HTTP error: 429 Client Error: Too Many Requests for url: http://mirror.bhs1.ovh.opendev.org:8082/v2/tripleotraincentos8/centos-binary-keepalived/blobs/sha256:f13ca690cf012da71a4ef64da912323290e2e757535cbef97add8f957943bc8b
 03:55:33,911 58415 INFO tripleo_common.image.image_uploader [ ] Non-2xx: id 6c5f7f870941abbaeabead046bcf6ee1a5cf6ae8, status 429, reason Too Many Requests, text {
 03:55:33,911 58415 ERROR tripleo_common.image.image_export [ ] [tripleotraincentos8/centos-binary-ironic-inspector] HTTP error: 429 Client Error: Too Many Requests for url: http://mirror.bhs1.ovh.opendev.org:8082/v2/tripleotraincentos8/centos-binary-ironic-inspector/blobs/sha256:61fd410bd0fa5c4c8d23bd237ee92bc57960f06cda813d9f15b180736a7557d6
 03:55:33,972 58417 INFO tripleo_common.image.image_uploader [ ] Non-2xx: id d16c6b481a41e3a4445402651e1a3b6e2d465482, status 429, reason Too Many Requests, text {
 03:55:33,972 58417 ERROR tripleo_common.image.image_export [ ] [tripleotraincentos8/centos-binary-swift-container] HTTP error: 429 Client Error: Too Many Requests for url: http://mirror.bhs1.ovh.opendev.org:8082/v2/tripleotraincentos8/centos-binary-swift-container/blobs/sha256:b85e997f7e3dfbaa343315a33805fd5621f72c8d7dde964161ae58bb48806047
 03:55:36,110 58413 INFO tripleo_common.image.image_uploader [ ] Non-2xx: id 1c20b076208bbf6de00900f6d7f069fd29a89eca, status 429, reason Too Many Requests, text {
 03:55:36,110 58413 ERROR tripleo_common.image.image_export [ ] [tripleotraincentos8/centos-binary-barbican-api] HTTP error: 429 Client Error: Too Many Requests for url: http://mirror.bhs1.ovh.opendev.org:8082/v2/tripleotraincentos8/centos-binary-barbican-api/blobs/sha256:4fabf9ba168e7dac0aeec9214180f0de97cffb344743759a3bb7b2e2869d650d
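
The per-worker counts above can be reproduced with a small script. The following is only a sketch of one way to do it: the regular expression assumes the line layout shown in the snippet (an optional date, then time, PID, and level), and treating each "Non-2xx ... status 429" line as one rate-limit trigger is an assumption, since the report does not say how the 32 triggers were counted. The names `summarize` and `SAMPLE` are illustrative.

```python
import re
from collections import Counter

# Assumed layout: optional date, then time, worker PID, log level, message.
LINE_RE = re.compile(
    r"^\s*(?:\d{4}-\d{2}-\d{2}\s+)?(?P<time>\d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)


def summarize(lines):
    """Return (connection resets per PID, number of 429 responses)."""
    resets = Counter()
    rate_limited = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        rest = match.group("rest")
        if "Resetting dropped connection" in rest:
            resets[match.group("pid")] += 1
        elif "Non-2xx" in rest and "status 429" in rest:
            # Assumption: count only the INFO line so that the paired
            # ERROR line for the same response is not counted twice.
            rate_limited += 1
    return resets, rate_limited


# A few lines copied from the snippet above.
SAMPLE = """\
 03:55:31,379 58412 DEBUG urllib3.connectionpool [ ] Resetting dropped connection: mirror.bhs1.ovh.opendev.org
 03:55:31,838 58415 DEBUG urllib3.connectionpool [ ] Resetting dropped connection: mirror.bhs1.ovh.opendev.org
 03:55:33,448 58412 INFO tripleo_common.image.image_uploader [ ] Non-2xx: id 729d9f98887eccbd14f78f8f645b341e2c97a95c, status 429, reason Too Many Requests, text {
 03:55:33,450 58412 ERROR tripleo_common.image.image_export [ ] [tripleotraincentos8/centos-binary-keepalived] HTTP error: 429 Client Error: Too Many Requests for url: http://mirror.bhs1.ovh.opendev.org:8082/v2/tripleotraincentos8/centos-binary-keepalived/blobs/sha256:f13ca690cf012da71a4ef64da912323290e2e757535cbef97add8f957943bc8b
"""

if __name__ == "__main__":
    per_pid, triggers = summarize(SAMPLE.splitlines())
    for pid, count in sorted(per_pid.items()):
        print(f"{count:7d} {pid}")
    print(f"Rate limit triggered (times): {triggers}")
```

To run it against the full log, pass an open file object to `summarize` instead of the sample lines.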

Changed in tripleo:
importance: Undecided → High
status: New → Triaged
milestone: none → victoria-3
tags: added: ci
Bogdan Dobrelya (bogdando) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.opendev.org/743704

Changed in tripleo:
assignee: nobody → Bogdan Dobrelya (bogdando)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/745170

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/745170
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=d4f6eb64d4a54cfa90d4c1572ef54495f151fea5
Submitter: Zuul
Branch: master

commit d4f6eb64d4a54cfa90d4c1572ef54495f151fea5
Author: Alex Schultz <email address hidden>
Date: Thu Aug 6 09:21:59 2020 -0600

    Add exponential backoff to ratelimited requests

    If we're ratelimited on a request, we should retry the request but back
    off significantly. Previously we're not really handling 429 any
    different than other requests which may lead to the requests being
    retried too quickly.

    Change-Id: I3832332abdfd7daaf373dc0924fec268f159d774
    Related-Bug: #1889372
    Related-bug: #1889122
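
The idea described in the commit message, retrying a rate-limited request with an exponentially growing delay, can be sketched as below. This is not the tripleo-common change itself: `backoff_delays`, `fetch_with_backoff`, and the parameters `base`, `factor`, `cap`, and `attempts` are hypothetical names with arbitrary defaults. The random jitter is an added assumption, intended to keep several workers from retrying at the same moment.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=5):
    """Yield capped exponential delays: base, base * factor, and so on."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def fetch_with_backoff(do_request, sleep=time.sleep,
                       jitter=random.uniform, **kwargs):
    """Call do_request() until it returns a status other than 429.

    do_request is any callable returning an HTTP status code. The last
    status is returned even if every retry was rate limited.
    """
    status = do_request()
    for delay in backoff_delays(**kwargs):
        if status != 429:
            break
        # Wait for the backoff delay plus up to 50% random jitter.
        sleep(delay + jitter(0, delay / 2))
        status = do_request()
    return status


if __name__ == "__main__":
    # Simulated server: rate limited twice, then successful.
    responses = iter([429, 429, 200])
    waits = []
    result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
    print(result, len(waits))
```

Passing `sleep` and `jitter` as arguments keeps the sketch testable without real waiting; a caller would normally leave them at their defaults.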

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/ussuri)

Related fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/745322

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/745323

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by Bogdan Dobrelya (bogdando) (<email address hidden>) on branch: master
Review: https://review.opendev.org/743704

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/ussuri)

Reviewed: https://review.opendev.org/745322
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=be81e6ee690a8a5f8bc15309d333ea8266f43af2
Submitter: Zuul
Branch: stable/ussuri

commit be81e6ee690a8a5f8bc15309d333ea8266f43af2
Author: Alex Schultz <email address hidden>
Date: Thu Aug 6 09:21:59 2020 -0600

    Add exponential backoff to ratelimited requests

    If we're ratelimited on a request, we should retry the request but back
    off significantly. Previously we're not really handling 429 any
    different than other requests which may lead to the requests being
    retried too quickly.

    Change-Id: I3832332abdfd7daaf373dc0924fec268f159d774
    Related-Bug: #1889372
    Related-bug: #1889122
    (cherry picked from commit d4f6eb64d4a54cfa90d4c1572ef54495f151fea5)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/train)

Reviewed: https://review.opendev.org/745323
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=8b379fb9f0528a1b7b034d1a3f5e1fcfe64201ef
Submitter: Zuul
Branch: stable/train

commit 8b379fb9f0528a1b7b034d1a3f5e1fcfe64201ef
Author: Alex Schultz <email address hidden>
Date: Thu Aug 6 09:21:59 2020 -0600

    Add exponential backoff to ratelimited requests

    If we're ratelimited on a request, we should retry the request but back
    off significantly. Previously we're not really handling 429 any
    different than other requests which may lead to the requests being
    retried too quickly.

    Change-Id: I3832332abdfd7daaf373dc0924fec268f159d774
    Related-Bug: #1889372
    Related-bug: #1889122
    (cherry picked from commit d4f6eb64d4a54cfa90d4c1572ef54495f151fea5)

tags: added: in-stable-train
Changed in tripleo:
status: In Progress → Opinion
assignee: Bogdan Dobrelya (bogdando) → nobody