periodic integration/component jobs failing "[Zuul] Log Stream did not terminate"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Unassigned |
Bug Description
At [1] for the periodic *train* integration pipeline standalone job, [2] the fs1 master baremetal component, [3] scen1 and [4] scen4 of the cinder component, and in many other places, the jobs are failing with 'strange' logs. The main failure is simply (e.g. [4]):
* 2020-08-06 01:56:50.801636 | primary | TASK [undercloud-setup : Run the package installation script] ******************
* 2020-08-06 02:59:45.524791 | [Zuul] Log Stream did not terminate
In this particular case it is strange because the deployment is successful [5], including a green tempest run [6]. In each case there is something slightly different. In the baremetal component fs1 job [2], the output in the undercloud install logs [7] includes blocks of "����".
At [8] the master cloudops component job fails with:
* 2020-08-06 06:33:25.512079 | primary | TASK [build-
* 2020-08-06 07:12:30.702593 | [Zuul] Log Stream did not terminate
And there are no more logs available [9]. Finally at [10] ussuri scen2 standalone cloudops component:
* 2020-08-06 04:58:15.798502 | primary | TASK [undercloud-setup : Run the package installation script] ******************
* 2020-08-06 05:58:40.793649 | [Zuul] Log Stream did not terminate
But the overcloud deployment is successful [11] with a green tempest run [12]. I don't think this is a cloud-specific problem. Many of the examples here are running in vexx, but this last one [12] is "cloud: rdo-cloud-tripleo" [13]. The same goes for the scen1/4 cinder component jobs referenced above [3][4].
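The pattern in the excerpts above (a task line followed, roughly an hour later, by the Zuul marker) can be checked mechanically across many job outputs. The following is only a minimal sketch, assuming the console output is available as plain text lines in the timestamp format quoted above; the function name `find_stalls` and the `sample` list are hypothetical helpers for illustration and are not part of Zuul or the tripleo tooling.

```python
import re
from datetime import datetime

# Marker string copied from the failing job output quoted in this report.
MARKER = "[Zuul] Log Stream did not terminate"

# Leading timestamp of a console line, e.g. "2020-08-06 01:56:50.801636 | ..."
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) \|")


def find_stalls(lines):
    """Return (previous_line, gap_in_seconds) for every marker line found.

    Hypothetical helper: the gap is the time between the marker line and
    the last timestamped line seen before it.
    """
    stalls = []
    prev_ts, prev_line = None, None
    for line in lines:
        match = TS_RE.search(line)
        if not match:
            continue  # skip lines without a recognizable timestamp
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if MARKER in line and prev_ts is not None:
            stalls.append((prev_line.strip(), (ts - prev_ts).total_seconds()))
        prev_ts, prev_line = ts, line
    return stalls


# Lines taken from the excerpts in this report (task banner shortened).
sample = [
    "2020-08-06 01:56:50.801636 | primary | TASK [undercloud-setup : Run the package installation script]",
    "2020-08-06 02:59:45.524791 | [Zuul] Log Stream did not terminate",
]

for task_line, gap in find_stalls(sample):
    print(f"{gap / 60:.1f} min of silence after: {task_line}")
```

Running this over saved console output from several failing jobs would show whether the silent gap before the marker is consistently about an hour, as it is in the examples quoted here.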
[1] https:/
[2] https:/
[3] https:/
[4] https:/
[5] https:/
[6] https:/
[7] https:/
[8] https:/
[9] https:/
[10] https:/
[11] https:/
[12] https:/
[13] https:/
13:24 < jpena> chkumar|rover, marios|ruck: anything running on vexxhost is going to have trouble until they make networking stable again
13:24 < chkumar|rover> jpena: registry is also on vexxhost na?
13:24 < jpena> yep
13:24 < chkumar|rover> oh
13:25 < jpena> we moved all our infra to vexxhost (except lists.rdo and www.rdo, which are still WIP)
13:26 < jpena> it's been quite stable so far, until we hit today's issues
13:27 < marios|ruck> jpena: ack thanks but i don't think this one is restricted to vexx i have examples from rdo cloud too
13:27 < marios|ruck> chkumar|rover: jpena: there https://bugs.launchpad.net/tripleo/+bug/1890571
13:27 < jpena> marios|ruck: right, but they are interacting with zuul components running on vexxhost
13:27 < marios|ruck> jpena: i see
13:28 < marios|ruck> jpena: cos this is really random the logs are weird nothing makes sense and they are all different
13:28 < jpena> marios|ruck: yes. Every error message you could find is happening today :-/.