ansible download_images slow to error out, and not multithreaded

Bug #1933863 reported by John Kung
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: John Kung

Bug Description

Brief Description
-----------------
Ansible Replay with Invalid Docker Registries takes 4 hours to time out.

Severity
--------

Minor: System/feature is usable, but startup performance is inefficient.

Steps to Reproduce
------------------
Edit your localhost.yml file so that the docker registry locations point to "invalid.docker.registry", then run:

time ansible-playbook -v /usr/share/ansible/stx-ansible/playbooks/bootstrap.yml

Expected Behavior
------------------
An error reports "invalid_docker_registries", and the playbook fails
early (within minutes) without unnecessary delays.

Actual Behavior
----------------
The playbook takes several hours to time out; more than 4 hours was observed.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
One-node system; because this occurs during initial bootstrap/restore, it applies to multi-node systems as well.

Branch/Pull Time/Commit
-----------------------
2021-06-09_18-58-11

Last Pass
---------
No

Timestamp/Logs
--------------
2021-05-29-06-56

Test Activity
-------------
System Test, automated regression.

Workaround
----------
Wait for timeout.

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.config
tags: added: stx.6.0
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → John Kung (john-kung)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/798389
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/c0bc0d61a48747adc1df3a7667c206550f0b31c9
Submitter: "Zuul (22348)"
Branch: master

commit c0bc0d61a48747adc1df3a7667c206550f0b31c9
Author: John Kung <email address hidden>
Date: Mon Jun 28 17:06:47 2021 -0500

    Fix ansible download_images handle invalid registry

    Exit download_images on exception or when number of failed downloads
    threshold is exceeded. The case for invalid registries or other error,
    such as invalid DNS, is measured about 20 minutes vs 4 hours in the bug
    reported initial error case.

    The playbook will retry and pull the remaining required images; as
    docker will not pull images for the already downloaded images.

    Tests Performed:
    o initial installation and sanity which performs download_images
    o modify registries to point to invalid.docker.registry and
      measure timeouts
    o perform ansible-playbook replay with invalid registries
      Either APIError with Forbidden or 'no auth host' may occur
     (replay with invalid registry or invalid DNS)
    o perform ansible-playbook replay with good registries

    Closes-Bug: 1933863
    Signed-off-by: John Kung <email address hidden>
    Change-Id: Id2dec495ae6f956af7c66f0bcf88931991ece0c4
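
The fail-fast behaviour described in the commit message (exit on exception or once a failed-download threshold is exceeded) can be sketched roughly as follows. This is only an illustration, not the code from the merged change: the names `download_images`, `pull`, `max_failures` and `DownloadAborted` are hypothetical, and `pull` stands in for whatever call actually fetches an image.

```python
from typing import Callable, Iterable, List, Tuple


class DownloadAborted(Exception):
    """Raised when more pulls have failed than the threshold allows (hypothetical)."""


def download_images(
    images: Iterable[str],
    pull: Callable[[str], None],
    max_failures: int = 3,
) -> Tuple[List[str], List[str]]:
    """Pull images one by one and stop early instead of trying every image.

    `pull` is an assumed callable that raises on failure, for example
    when the registry host cannot be resolved.
    Returns (downloaded, failed) if the threshold is never exceeded.
    """
    downloaded: List[str] = []
    failed: List[str] = []
    for image in images:
        try:
            pull(image)
        except Exception as exc:
            failed.append(image)
            # Abort as soon as the failure count exceeds the threshold,
            # rather than waiting for every remaining image to time out.
            if len(failed) > max_failures:
                raise DownloadAborted(
                    f"{len(failed)} image pulls failed; last error: {exc}"
                ) from exc
        else:
            downloaded.append(image)
    return downloaded, failed
```

With an unreachable registry every pull fails, so the loop stops after `max_failures + 1` attempts. A later playbook run would then only need to fetch the images that are still missing, as the commit message notes.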

Changed in starlingx:
status: In Progress → Fix Released