tripleo-ci-centos-7-containers-multinode job is timing out more often than it passes & blocking gate
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Marios Andreou |
Bug Description
The tripleo-
example of timeout at [1] but there are many examples. Some more notes from [2]
17:02 tripleo-
found: install-undercloud and deploy-overcloud each take ~1h 30m, so build duration is very likely to exceed the 3h timeout.
inside install-undercloud the **tripleo-
this is *not* caused by flaky infrastructure; it is caused by our jobs being too long.
(mwhahaha) This is likely caused by container pulls running long. If you look at the history it can run in 2:16 total so it's not the code itself that's causing the excessive run length.
[1] http://
[2] https:/
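To make the timing notes above concrete, here is a minimal sketch of the budget arithmetic. It uses only the figures quoted in the description (two phases of roughly 1h 30m each, a 3h timeout, and a historical good run of 2:16 total). The names `JOB_TIMEOUT`, `phases` and `headroom` are made up for illustration and are not part of any TripleO or Zuul tooling.

```python
from datetime import timedelta

# Figures taken from the bug description; the names are illustrative only.
JOB_TIMEOUT = timedelta(hours=3)
phases = {
    "install-undercloud": timedelta(hours=1, minutes=30),
    "deploy-overcloud": timedelta(hours=1, minutes=30),
}


def headroom(total: timedelta, timeout: timedelta = JOB_TIMEOUT) -> timedelta:
    """Return how much time is left before the job hits its timeout."""
    return timeout - total


# The two slow phases alone already consume the whole budget.
slow_run_total = sum(phases.values(), timedelta())
slow_run_headroom = headroom(slow_run_total)

# A historical good run finished in 2:16 total.
good_run_headroom = headroom(timedelta(hours=2, minutes=16))

print(f"slow run headroom: {slow_run_headroom}")
print(f"good run headroom: {good_run_headroom}")
```

With the slow-run figures there is no headroom left for any other step in the job, whereas the 2:16 run leaves 44 minutes. That is consistent with both observations in the description: the job can fit inside the timeout, but not when the two main phases each run for about 1h 30m.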
Changed in tripleo: | |
assignee: | nobody → Marios Andreou (marios-b) |
Changed in tripleo: | |
milestone: | none → stein-2 |
tags: | added: alert |
Changed in tripleo: | |
importance: | Undecided → Critical |
tags: | added: deployment-time |
Changed in tripleo: | |
milestone: | stein-2 → stein-3 |
Not sure if accurate as we haven't had a TIMEOUT in over a day http://zuul.openstack.org/builds?pipeline=gate&result=post_failure&result=timed_out&result=failure&job_name=tripleo-ci-centos-7-containers-multinode
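For anyone who wants to re-run the same check with a different job name or result filter, the query string in the link above can be assembled programmatically. This is a small sketch that only rebuilds the URL from the comment; it assumes the builds page keeps accepting repeated `result` parameters as that link does, and it makes no network request.

```python
from urllib.parse import urlencode

# Base URL and parameter names are copied from the link in the comment above.
BASE_URL = "http://zuul.openstack.org/builds"

# A list of pairs (rather than a dict) so that "result" can repeat.
params = [
    ("pipeline", "gate"),
    ("result", "post_failure"),
    ("result", "timed_out"),
    ("result", "failure"),
    ("job_name", "tripleo-ci-centos-7-containers-multinode"),
]

query_url = f"{BASE_URL}?{urlencode(params)}"
print(query_url)
```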
I'm wondering if this was related to all the scenario jobs we had on stable branches in the gate causing extra transit load.