tripleo-ci-centos-8-undercloud-containers failing with "Errors during downloading metadata for repository 'centos-opstools'"

Bug #1956234 reported by Jiri Podivin
This bug affects 1 person
Affects: tripleo
Status: Won't Fix
Importance: Critical
Assigned to: Unassigned
Milestone: (not set)

Bug Description

The error occurs during DLRN build of the tested package:

---
DEBUG: Errors during downloading metadata for repository 'centos-opstools':
DEBUG: - Curl error (28): Timeout was reached for http://mirror.centos.org/centos/8/opstools/x86_64/collectd-5/repodata/9fda86d5505b845fa8f03abc506932f2bac9d6a565246e93335babbe649de481-primary.xml.gz [Connection timed out after 30000 milliseconds]
DEBUG: - Curl error (28): Timeout was reached for http://mirror.centos.org/centos/8/opstools/x86_64/collectd-5/repodata/0610024676dd89cad14ed72a7e8000578bc6d2927f648b6b73f4a6132635cab1-filelists.xml.gz [Connection timed out after 30000 milliseconds]
DEBUG: Error: Failed to download metadata for repo 'centos-opstools': Yum repo downloading error: Downloading error(s): repodata/9fda86d5505b845fa8f03abc506932f2bac9d6a565246e93335babbe649de481-primary.xml.gz - Cannot download, all mirrors were already tried without success; repodata/0610024676dd89cad14ed72a7e8000578bc6d2927f648b6b73f4a6132635cab1-filelists.xml.gz - Cannot download, all mirrors were already tried without success
---

The links appear to be accessible from the developer's workstation.

Logs:

https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c83/802901/17/check/tripleo-ci-centos-8-undercloud-containers/c835eb6/logs/undercloud/home/zuul/DLRN/data/repos/component/validation/fa/f7/faf7de20df62a92c1e48737b320a77c909c326f8_dev/rpmbuild.log

Tags: ci
Jiri Podivin (jpodivin)
Changed in tripleo:
importance: High → Critical
Jiri Podivin (jpodivin) wrote :

The issue appears to be unique to openstack/validations-libs; other repositories run their... without issue.[1]

Examination of the logs indicates that the openstack/validations-libs execution of the affected job differs from that of other projects because it includes the 'build-test-packages' role[2], which does not appear in the execution logs of other projects, such as python-tripleoclient.[3]

Curiously, the logs of the passing jobs do not mention the 'Run DLRN gate role' task, which triggers the 'build-test-packages' role, not even as a skipped item.
This would seem to indicate that the task isn't present in the relevant tasks file at all.

[1] https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-undercloud-containers&branch=master
[2] https://opendev.org/openstack/tripleo-quickstart-extras/src/branch/master/roles/undercloud-setup/tasks/main.yml#L29-L35
[3] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_a9f/822835/4/check/tripleo-ci-centos-8-undercloud-containers/a9f6ade/job-output.txt

Alfredo Moralejo (amoralej) wrote :

So, from 3-jan-2022 I see two jobs that failed with mirror issues:

1st (3-jan): https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c83/802901/17/check/tripleo-ci-centos-8-undercloud-containers/c835eb6/logs/delorean_logs/component/validation/fa/f7/faf7de20df62a92c1e48737b320a77c909c326f8_dev/rpmbuild.log.txt.gz

  - Curl error (28): Timeout was reached for http://mirror.centos.org/centos/8/opstools/x86_64/collectd-5/repodata/9fda86d5505b845fa8f03abc506932f2bac9d6a565246e93335babbe649de481-primary.xml.gz [Connection timed out after 30000 milliseconds]

2nd (4-jan): https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_dc8/802901/18/check/tripleo-ci-centos-8-undercloud-containers/dc8009d/logs/delorean_logs/component/validation/86/a5/86a5de3f7fb14dd20135032e67126d0e4e2d4619_dev/rpmbuild.log.txt.gz

  - Curl error (28): Timeout was reached for http://mirror.centos.org/centos/8-stream/virt/x86_64/advancedvirt-common/repodata/repomd.xml [Connection timed out after 30000 milliseconds]

These are two timeouts pulling from mirror.centos.org, for different repos, and some jobs passed between them, so I'd say it's an infra issue, not a config or job issue.
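The comparison above (same host, different repos) can be sketched in a few lines of Python. This is only an illustration: the function name `timeouts_by_host` is made up here, and the sample text is just the two error lines quoted above, not the full rpmbuild.log files.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Matches the dnf/librepo timeout lines quoted above and captures the URL.
CURL_TIMEOUT = re.compile(r"Curl error \(28\): Timeout was reached for (\S+)")


def timeouts_by_host(log_text):
    """Map each host to the set of repo paths whose metadata timed out.

    The repo path is taken as everything before '/repodata/' in the URL.
    (Hypothetical helper, not part of DLRN or tripleo-ci.)
    """
    hits = defaultdict(set)
    for url in CURL_TIMEOUT.findall(log_text):
        parsed = urlparse(url)
        repo_path = parsed.path.split("/repodata/")[0]
        hits[parsed.netloc].add(repo_path)
    return dict(hits)


sample = (
    "  - Curl error (28): Timeout was reached for "
    "http://mirror.centos.org/centos/8/opstools/x86_64/collectd-5/repodata/"
    "9fda86d5505b845fa8f03abc506932f2bac9d6a565246e93335babbe649de481-primary.xml.gz "
    "[Connection timed out after 30000 milliseconds]\n"
    "  - Curl error (28): Timeout was reached for "
    "http://mirror.centos.org/centos/8-stream/virt/x86_64/advancedvirt-common/"
    "repodata/repomd.xml [Connection timed out after 30000 milliseconds]\n"
)

result = timeouts_by_host(sample)
for host, repos in result.items():
    print(host, sorted(repos))
```

One host with several unrelated repo paths points at the mirror rather than at any single repo definition.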

I see that in the same jobs some packages were installed from the CI infra repo mirrors, so I guess the problem was specific to mirror.centos.org. Something that could improve this is to move the CentOS repo configs from mirror.centos.org to mirrorlist, so that closer mirrors are identified. IIRC, jobs in oooq use the configs coming from the DLRN repo? I need to check. The other option would be to use the opendev mirrors, but that would be oooq specific, and some logic would need to be implemented in tripleo-ci to modify the mock config on the fly.
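A rough sketch of the first option, assuming the repo definitions are available as plain dnf-style `.repo` text. The helper name `use_mirrorlist` and the mirrorlist URL are placeholders for illustration only; the real mirrorlist endpoint for the opstools repo would have to be looked up, and the actual DLRN mock config may be structured differently.

```python
import configparser
import io

# Placeholder value, NOT a real CentOS mirrorlist endpoint.
MIRRORLIST = "http://mirrorlist.example.org/?repo=opstools-collectd-5"

REPO = """\
[centos-opstools]
name=centos-opstools
baseurl=http://mirror.centos.org/centos/8/opstools/x86_64/collectd-5/
enabled=1
"""


def use_mirrorlist(repo_text, mirrorlist_for):
    """Replace 'baseurl' with 'mirrorlist' for the sections listed in mirrorlist_for.

    mirrorlist_for maps a repo section name to its mirrorlist URL.
    (Hypothetical helper, not part of DLRN or tripleo-ci.)
    """
    # interpolation=None keeps dnf variables such as $basearch untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve option-name case
    parser.read_string(repo_text)
    for section in parser.sections():
        url = mirrorlist_for.get(section)
        if url and parser.has_option(section, "baseurl"):
            parser.remove_option(section, "baseurl")
            parser.set(section, "mirrorlist", url)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(use_mirrorlist(REPO, {"centos-opstools": MIRRORLIST}))
```

Sections that are not listed in the mapping are written back unchanged, so the switch can be tried on one repo at a time.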

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on validations-libs (master)

Change abandoned by "Jiri Podivin <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/validations-libs/+/823495
Reason: It doesn't appear that this patch would actually help things. The affected jobs appear to have recovered.

Jiri Podivin (jpodivin) wrote :

The issue seems to have disappeared.

Changed in tripleo:
status: In Progress → Won't Fix