tripleo-ci-centos-8-scenario010-standalone timing out frequently

Bug #1881087 reported by Rabi Mishra
This bug affects 1 person
Affects: tripleo
Status: Incomplete
Importance: Critical
Assigned to: Unassigned
Milestone: none

Bug Description

I've seen this failure many times in the last few days.

The job times out while running the Octavia configuration.

TASK [Configure octavia on overcloud] ******************************************
Thursday 28 May 2020 06:36:39 +0000 (0:00:00.105) 0:38:48.968 **********

Noticed at:

https://563e6eae89541637c716-e0b3841327e693df7529796034dab315.ssl.cf1.rackcdn.com/731035/2/check/tripleo-ci-centos-8-scenario010-standalone/58eb06f/logs/undercloud/home/zuul/standalone_deploy.log

The Octavia playbook runs for only about 2 minutes and then gets stuck:

https://563e6eae89541637c716-e0b3841327e693df7529796034dab315.ssl.cf1.rackcdn.com/731035/2/check/tripleo-ci-centos-8-scenario010-standalone/58eb06f/logs/undercloud/home/zuul/standalone-ansible-eybdjmbe/octavia-ansible/octavia-ansible.log

Rabi Mishra (rabi)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
description: updated
Rabi Mishra (rabi) wrote :
Carlos Goncalves (cgoncalves) wrote :

tripleo deploy start: 5:52:10
octavia playbook start: 6:36:39
octavia playbook last log line: 06:38:53
tripleo deploy timeout: 7:31:53

It looks to me like Ansible itself hung for unknown reasons. There are many reports of people experiencing similar issues: https://github.com/ansible/ansible/issues/30411
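To make the gaps in the timeline above explicit, here is a minimal sketch that computes the elapsed time between consecutive events. It assumes all four timestamps are on the same day and in the same timezone; the `EVENTS` list and the `gaps` helper are illustrative names, not part of any TripleO tooling.

```python
from datetime import datetime

# Timestamps copied from the timeline above (assumed same day, same timezone).
EVENTS = [
    ("tripleo deploy start", "5:52:10"),
    ("octavia playbook start", "6:36:39"),
    ("octavia playbook last log line", "06:38:53"),
    ("tripleo deploy timeout", "7:31:53"),
]


def parse(ts):
    """Parse an H:MM:SS or HH:MM:SS string into a datetime (date part is a dummy)."""
    return datetime.strptime(ts, "%H:%M:%S")


def gaps(events):
    """Return (from_label, to_label, timedelta) for each consecutive pair of events."""
    parsed = [(label, parse(ts)) for label, ts in events]
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(parsed, parsed[1:])]


for start, end, delta in gaps(EVENTS):
    print(f"{start} -> {end}: {delta}")
```

By this arithmetic the playbook logged for 2 minutes 14 seconds and then produced no output for the 53 minutes leading up to the deploy timeout, which is consistent with a hang rather than a slow task.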

wes hayutin (weshayutin)
tags: added: alert
Changed in tripleo:
milestone: none → victoria-3
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Ananya Banerjee (frenzyfriday) wrote :
tags: added: promotion-blocker
wes hayutin (weshayutin)
Changed in tripleo:
importance: High → Critical
Ananya Banerjee (frenzyfriday) wrote :
Ananya Banerjee (frenzyfriday) wrote :

tripleo-ci-centos-8-scenario010-standalone, tripleo-ci-centos-8-scenario010-ovn-provider-standalone-train and periodic-tripleo-ci-centos-8-scenario010-standalone-train are still failing with a timeout in Octavia.

https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario010-ovn-provider-standalone-train/d97f3fd/logs/undercloud/home/zuul/standalone-ansible-ocifx5zb/octavia-ansible/octavia-ansible.log.txt.gz

Marios Andreou (marios-b) wrote :

There is some confusion around the status of this bug.

The patch at https://review.opendev.org/c/openstack/networking-ovn/+/796517 is NOT related to this fix - that is train only and it also points to a different bug @ https://bugs.launchpad.net/neutron/+bug/1932093

We haven't had any resolution to this, and it is ongoing per Ananya's comment 5 above.

Changed in tripleo:
milestone: xena-1 → xena-2
Marios Andreou (marios-b) wrote :

This is an automated action. Bug status has been set to 'Incomplete' and target milestone has been removed due to inactivity. If you disagree please re-set these values and reach out to us on freenode #tripleo

Changed in tripleo:
milestone: xena-2 → none
status: Triaged → Incomplete
Brent Eagles (beagles) wrote :

A couple of thoughts about this bug:

- While both are scenario 10, the OVN provider and non-OVN provider jobs are similar only in the deployment code; the running of tempest is quite different. If we are going to track scenario 10 timeouts, I feel it makes sense to track timeouts due to scenario 10 deployment and timeouts due to tempest separately.

- This was filed in 2020. It seems likely that the root cause of the original issue is no longer present. Is the idea here to keep a log-like bug for recurring timeout issues that might be unrelated?

wes hayutin (weshayutin) wrote :

The bug is now marked Incomplete, which is essentially closed. I agree that it's good to reference old LP bugs with similar characteristics, but it makes more sense to open something new and point to the old bug.

There is no clear guidance or policy on that, and mileage may vary depending on who is involved.
