[CI] gate py3{6,7,8} and tox-cover jobs timeout

Bug #1860332 reported by Rodolfo Alonso
Affects   Status         Importance   Assigned to      Milestone
neutron   Fix Released   Medium       Rodolfo Alonso

Bug Description

On heavily loaded systems, the neutron py36 and py37 jobs tend to time out (the limit is 2400 seconds).

Example:
https://700a7e3ec90813bffe3e-a140c73e010d1e58296165dc52d0a2c3.ssl.cf2.rackcdn.com/702248/5/gate/openstack-tox-py36/c9c1924/job-output.txt

Review 1: new timeouts detected in py38 and tox-cover. Commented in c#6 and c#8.

Lajos Katona (lajos-katona) wrote :

I am not that familiar with job templates, but as I see here:
https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/jobs.yaml#L159

The timeout is 2400 seconds, so as a workaround (?) it should be increased to something bigger, like 3600.

I can imagine that the longer run times are due to the gradual growth of the code base and the number of tests. Of course, it is only my optimism that makes me say this.

Perhaps we could add experimental jobs that run the unit tests with profiling?
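
As a rough illustration of the profiling idea, the sketch below runs a unittest suite under cProfile and reports the most expensive calls by cumulative time. It is only a minimal sketch: `SlowExampleTest` and `profile_suite` are hypothetical names invented for this example, not Neutron code or an existing CI job, and it assumes the tests can be loaded with the standard `unittest` loader.

```python
import cProfile
import io
import pstats
import time
import unittest


class SlowExampleTest(unittest.TestCase):
    """Hypothetical stand-in for a slow unit test; not taken from Neutron."""

    def test_sleepy(self):
        time.sleep(0.05)  # simulate a test that spends time waiting
        self.assertTrue(True)


def profile_suite(suite, top=10):
    """Run `suite` under cProfile; return the test result and a text report."""
    profiler = cProfile.Profile()
    # Send the runner's own output to a buffer so only the profile is shown.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    profiler.enable()
    try:
        result = runner.run(suite)
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and keep only the first `top` entries.
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlowExampleTest)
    result, report = profile_suite(suite)
    print(report)
```

Sorting by cumulative time puts the test methods and helpers that dominate the run near the top of the report, which would show whether the time goes to a few slow tests or is spread across the whole suite.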

tags: added: gate-failure unittest
Changed in neutron:
status: New → Confirmed
Changed in neutron:
importance: Undecided → Medium
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :
Lajos Katona (lajos-katona) wrote :

Some fancy graphs from openstack-health (though for me only the job run time graph loaded successfully):
http://status.openstack.org/openstack-health/#/job/openstack-tox-py37-neutron?duration=P14D
http://status.openstack.org/openstack-health/#/job/openstack-tox-lower-constraints-neutron?duration=P14D

It seems that not all executions are loaded, but I am putting the links here for reference anyway.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/703751

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/703751
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4f94c55ed3c8e68721e7089c9fbae496c852f9e4
Submitter: Zuul
Branch: master

commit 4f94c55ed3c8e68721e7089c9fbae496c852f9e4
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Jan 22 10:05:34 2020 +0000

    Increase tox-py3{6,7} and lower-constraints timeout to 3600 seconds

    As seen in the reported bug, in heavy loaded CI systems, the Neutron
    openstack-tox-py3{6,7} and openstack-tox-lower-constraints jobs tend
    to timeout.

    Change-Id: I3f7aff297e8c1797eb23adac7b29dc65619c1e8a
    Closes-Bug: #1860332

Changed in neutron:
status: In Progress → Fix Released
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : Re: [CI] gate py3{6,7} jobs timeout
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/704319

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : Re: [CI] gate py3{6,7} jobs timeout
summary: - [CI] gate py3{6,7} jobs timeout
+ [CI] gate py3{6,7,8} and tox-cover jobs timeout
description: updated
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/704851

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/704319
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=938581d9bd2ff46165c247d7c42d9492459522a8
Submitter: Zuul
Branch: master

commit 938581d9bd2ff46165c247d7c42d9492459522a8
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Jan 27 13:31:00 2020 +0000

    Increate tox-py38 timeout to 3600 seconds

    Days after [1] was merged, [2] was merged too, adding a new CI job:
    py38. This patch amends [1] increasing the timeout for this new job.

    [1] https://review.opendev.org/#/c/703751
    [2] https://review.opendev.org/#/c/693401

    Change-Id: I128f3c4621d2354ac6b0d07ccf34b84862efe7de
    Closes-Bug: #1860332

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/704851
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=b7dc0ac63a215047080a8ea057cb56d8e5b4fba6
Submitter: Zuul
Branch: master

commit b7dc0ac63a215047080a8ea057cb56d8e5b4fba6
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Wed Jan 29 17:23:24 2020 +0000

    Increase tox-cover timeout to 4800 seconds

    Change-Id: I2200329c9c5402562b84794f669177728794a5cd
    Related-Bug: #1860332

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.0.0.0b1

This issue was fixed in the openstack/neutron 16.0.0.0b1 development milestone.
