Content provider and other tripleo jobs are often timeouting on upstream vexxhost ca-ymq-1 region
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Unassigned |
Bug Description
Content provider jobs build containers for all other jobs. On the upstream vexxhost cloud (ca-ymq-1 region) these jobs often time out because the builds run very slowly there.
The overall pass rate in this region is about 65%, while the overall pass rate excluding this region is about 95%.
We have already increased the timeout for this job [1], but it is still so slow that it can't finish the container build in time.
Example of a job that timed out:
It built only 20 containers in 2h 15min: https:/
Example of a regular job from the RAX cloud DFW region:
https:/
It built all ~120 containers in 35 min.
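To put the two examples side by side, here is a rough sketch of the throughput arithmetic, using only the figures quoted above. The function and variable names are illustrative, and the container counts are approximate, so treat the result as an estimate rather than a measurement.

```python
# Figures are taken from the two example jobs in this report.
# All names below are hypothetical and used only for illustration.

def containers_per_minute(containers: int, minutes: float) -> float:
    """Average build throughput in containers per minute."""
    return containers / minutes

TOTAL_CONTAINERS = 120  # approximate number of containers in a full build

vexxhost_rate = containers_per_minute(20, 2 * 60 + 15)   # 20 containers in 2h 15min
rax_rate = containers_per_minute(TOTAL_CONTAINERS, 35)   # ~120 containers in 35 min

# How many times slower the vexxhost job was than the RAX job.
slowdown = rax_rate / vexxhost_rate

# Time a full build would need at the vexxhost rate, in hours.
projected_hours = TOTAL_CONTAINERS / vexxhost_rate / 60

print(f"vexxhost ca-ymq-1: {vexxhost_rate:.2f} containers/min")
print(f"RAX DFW:           {rax_rate:.2f} containers/min")
print(f"slowdown factor:   {slowdown:.1f}x")
print(f"projected full build on vexxhost: {projected_hours:.1f} h")
```

Under these numbers the vexxhost job is roughly 23 times slower, and a full build at that rate would take about 13.5 hours, which suggests that raising the job timeout alone is unlikely to help.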
[1] https:/
summary:
- Content provider jobs are often timeouting on upstream vexxhost ca-ymq-1
- region
+ Content provider and other tripleo jobs are often timeouting on upstream
+ vexxhost ca-ymq-1 region
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
status: Triaged → Fix Released
The problem also exists in scenario jobs, for example in these failed gate jobs:
https://92ed9dc19b3bf94f48d5-598e1d61c0aab85aa3b67b337ca2c556.ssl.cf5.rackcdn.com/772598/2/gate/tripleo-ci-centos-8-undercloud-containers/9d272cd/logs/undercloud/home/zuul/undercloud_install.log
[ERROR]: Container(s) which failed to be created by podman_container module: ['neutron_db_sync']
db sync took 23 min instead of the usual 1 min.
https://92ed9dc19b3bf94f48d5-598e1d61c0aab85aa3b67b337ca2c556.ssl.cf5.rackcdn.com/772598/2/gate/tripleo-ci-centos-8-undercloud-containers/9d272cd/logs/undercloud/var/log/containers/stdouts/neutron_db_sync.log
Every operation takes an unusually long time, and jobs fail on the vexxhost upstream cloud.