bootstrap docker tag fails in download images script

Bug #2038923 reported by Fabricio Henrique Ramos
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Fabricio Henrique Ramos
Milestone: -

Bug Description

Brief Description
-----------------
In roughly 1 in 10 bootstrap executions, the bootstrap download_images.py script fails to tag an image before pushing it to the local registry. The image that triggers the failure differs from run to run.

The script logs that the image download succeeded, but the image has not actually been downloaded. The following docker tag command therefore returns a 404, and the bootstrap halts.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
Run bootstrap

Expected Behavior
------------------
Bootstrap runs to completion without errors

Actual Behavior
----------------
Bootstrap fails at the download images step

Reproducibility
---------------
Intermittent

Occurs about 10% of the time

System Configuration
--------------------
-

Branch/Pull Time/Commit
-----------------------
-

Last Pass
---------
-

Timestamp/Logs
--------------
```
Image admin-2.cumulus.wrs.com:30093/wrcp-staging/k8s.gcr.io/etcd:3.5.3-0 not found on local registry, attempt to download...
Image download succeeded: admin-2.cumulus.wrs.com:30093/wrcp-staging/k8s.gcr.io/etcd:3.5.3-0
HARD FAIL - Image download failed: admin-2.cumulus.wrs.com:30093/wrcp-staging/k8s.gcr.io/etcd:3.5.3-0 404 Client Error: Not Found ("No such image: admin-2.cumulus.wrs.com:30093/wrcp-staging/k8s.gcr.io/etcd:3.5.3-0")
```

Test Activity
-------------
Developer Testing

Workaround
----------
Run the bootstrap again. Alternatively, manually download, tag, and push the image that caused the failure, then run the bootstrap again.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/897827
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/09f63930788ee232ba6c84cb8f11f23a633a4eb5
Submitter: "Zuul (22348)"
Branch: master

commit 09f63930788ee232ba6c84cb8f11f23a633a4eb5
Author: Fabricio Henrique Ramos <email address hidden>
Date: Mon Oct 9 15:39:20 2023 -0300

    Fix docker tag error when image has not been downloaded

    Sometimes, the python docker client pull command will finish without
    raising an error, but also not having downloaded the image, resulting
    in the next docker tag instruction to fail because the image does not
    exist, which will also halt the bootstrap process. This commit fixes
    this issue.

    Test Plan:
    PASS: Execute bootstrap until the end without issues aio-sx/dc-subcloud
    PASS: Execute prestage_images playbook without issues

    Closes-Bug: 2038923
    Signed-off-by: Fabricio Henrique Ramos <email address hidden>
    Change-Id: I372e8a3c72034723ab429ff932974da14e27af3c
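
The commit message above describes the failure mode (a pull that returns without error but leaves no image behind) without spelling out the guard. The sketch below is one plausible way to protect the tag step, not the code from the merged change: after each pull it checks that the image is actually present locally, and retries if it is not. The function name `pull_and_verify`, its parameters, and the retry policy are all hypothetical. It assumes a client object exposing `pull(image)` and `inspect_image(image)`, which is the shape of `docker.APIClient` in the Docker SDK for Python, and it takes the "not found" exception type as an argument (with that SDK this would be `docker.errors.NotFound`).

```python
import time


def pull_and_verify(client, image, not_found_exc, retries=3, delay=1.0):
    """Pull `image` and confirm it exists locally before returning.

    Hypothetical helper, not taken from download_images.py.
    `client` is any object with pull(image) and inspect_image(image).
    `not_found_exc` is the exception type inspect_image raises when the
    image is missing.
    Returns the number of the attempt that succeeded (1-based).
    """
    for attempt in range(1, retries + 1):
        # The pull may return normally even though nothing was downloaded.
        client.pull(image)
        try:
            # Verify the image is really there before anyone tries to tag it.
            client.inspect_image(image)
        except not_found_exc:
            # Silent pull failure: wait briefly, then try again.
            if attempt < retries:
                time.sleep(delay)
            continue
        return attempt
    raise RuntimeError(
        "Image %s still missing after %d pull attempts" % (image, retries)
    )
```

With a check like this in place, a silent pull failure surfaces as an explicit error at the download step, or is absorbed by a retry, instead of showing up later as a 404 from the tag call.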

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Fabricio Henrique Ramos (fhramos)
tags: added: stx.9.0 stx.config