Intermittently some jobs are timing out while gathering facts on different tasks : [tripleo-inventory : Ensure gather_facts has been run against localhost] or [validate-undercloud : gather facts used by role]

Bug #1903961 reported by Sandeep Yadav
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Ronelle Landy
Milestone: (none listed)

Bug Description

Description:-

Intermittently some jobs are timing out on TASK [tripleo-inventory : Ensure gather_facts has been run against localhost]

We noticed this on the CentOS 7 train branch in multiple jobs :- https://review.rdoproject.org/zuul/buildset/c8e285a731c04d78bfdf1f69303be237

Logs:-

https://logserver.rdoproject.org/openstack-periodic-integration-stable3-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/92fd4d9/job-output.txt
~~~
2020-11-11 14:34:47.681248 | primary | PLAY [Inventory the overcloud] *************************************************
2020-11-11 14:34:47.722653 | primary |
2020-11-11 14:34:47.722873 | primary | TASK [tripleo-inventory : Ensure gather_facts has been run against localhost] ***
2020-11-11 14:34:47.722947 | primary | Wednesday 11 November 2020 14:34:47 +0000 (0:00:00.141) 1:04:09.138 ****
2020-11-11 16:47:09.627915 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master]
2020-11-11 16:47:09.628793 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base/post.yaml@master]
~~~

https://logserver.rdoproject.org/openstack-periodic-integration-stable3-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-train-upload/ef5306c/job-output.txt

~~~
2020-11-11 14:31:41.708800 | primary | TASK [tripleo-inventory : Ensure gather_facts has been run against localhost] ***
2020-11-11 14:31:41.708908 | primary | Wednesday 11 November 2020 14:31:41 +0000 (0:00:00.104) 1:04:25.919 ****
2020-11-11 15:42:44.233506 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master]
2020-11-11 15:42:44.235288 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base-upload/post.yaml@master]
~~~
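
The excerpts above all follow the same pattern: a TASK line is logged, nothing else is logged for that task, and the next entry is RUN END RESULT_TIMED_OUT. As a rough triage aid, the sketch below scans a job-output.txt and reports the last task started before the timeout and how long the run sat there. It is only an illustration and not part of tripleo-ci; the function name `find_hung_task` is hypothetical, and the script assumes every line of interest starts with the `YYYY-MM-DD HH:MM:SS.ffffff | ` prefix seen in these logs.

```python
import re
from datetime import datetime

# Console line prefix as seen in the logs above,
# e.g. "2020-11-11 14:34:47.722873 | primary | TASK [...] ***"
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \| (.*)$")
TASK = re.compile(r"TASK \[(.+?)\]")


def find_hung_task(lines):
    """Return (task_name, stall) for the last task started before a
    RESULT_TIMED_OUT line, or None if no timeout line is found.

    Hypothetical helper: `stall` is the timedelta between the TASK line
    and the RESULT_TIMED_OUT line.
    """
    last_task = None
    for line in lines:
        m = STAMP.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines without the timestamp prefix
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        rest = m.group(2)
        if "RESULT_TIMED_OUT" in rest:
            if last_task is None:
                return None
            name, started = last_task
            return name, when - started
        task = TASK.search(rest)
        if task:
            last_task = (task.group(1), when)
    return None
```

Run against the first excerpt, this reports the tripleo-inventory task with a gap of a little over two hours between the TASK line and the timeout. Usage would be along the lines of `find_hung_task(open("job-output.txt"))`.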

Sandeep Yadav (sandeepyadav93) wrote :

Timeouts were also seen for stein/rocky in multiple jobs, on a different TASK [validate-undercloud : gather facts used by role] :-

https://review.rdoproject.org/zuul/buildset/38652381645b4a00810ba8b734ee4696

https://logserver.rdoproject.org/openstack-periodic-integration-stable4-5/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-rocky/039b94b/job-output.txt
~~~
2020-11-11 14:32:40.386466 | primary | TASK [validate-undercloud : gather facts used by role] *************************
2020-11-11 14:32:40.386573 | primary | Wednesday 11 November 2020 14:32:40 +0000 (0:00:00.102) 0:40:48.508 ****
2020-11-11 16:57:24.263280 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master]
2020-11-11 16:57:24.264020 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base/post.yaml@master]
~~~

https://logserver.rdoproject.org/openstack-periodic-integration-stable4-5/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-stein/8cd8435/job-output.txt

~~~
2020-11-11 14:54:26.844284 | primary | TASK [validate-undercloud : gather facts used by role] *************************
2020-11-11 14:54:26.847499 | primary | Wednesday 11 November 2020 14:54:26 +0000 (0:00:00.091) 0:37:35.857 ****
2020-11-11 18:01:37.449407 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master]
2020-11-11 18:01:37.451166 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base/post.yaml@master]
~~~

We also noticed the timeout below:-

https://logserver.rdoproject.org/openstack-periodic-integration-stable4-5/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset038-stein/a162f24/job-output.txt

~~~
2020-11-11 15:39:36.949346 | primary | TASK [ensure the deployment result has been read into memory] ******************
2020-11-11 15:39:36.949731 | primary | Wednesday 11 November 2020 15:39:36 +0000 (0:00:15.832) 0:46:20.109 ****
2020-11-11 17:02:44.992298 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master]
2020-11-11 17:02:44.993332 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base/post.yaml@master]
~~~

^^ this was earlier reported in https://bugs.launchpad.net/tripleo/+bug/1883843

summary: - Intermittently some jobs are timing out on TASK [tripleo-inventory :
- Ensure gather_facts has been run against localhost]
+ Intermittently some jobs are timing out while gathering facts on
+ different tasks : [tripleo-inventory : Ensure gather_facts has been run
+ against localhost] or [validate-undercloud : gather facts used by role]
tags: added: promotion-blocker
Ronelle Landy (rlandy) wrote :

A lot of these stein jobs have been removed. The line is rerunning - we will see if we get the same timeout with the jobs that are left.

wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → Won't Fix
Marios Andreou (marios-b) wrote :

Just commented at https://review.rdoproject.org/r/#/c/31114/1//COMMIT_MSG - we may still want this?

Ronelle Landy (rlandy) wrote :
Changed in tripleo:
assignee: nobody → Ronelle Landy (rlandy)
status: Won't Fix → In Progress
Ronelle Landy (rlandy)
Changed in tripleo:
status: In Progress → Fix Released