Tempest tests fail in featureset020 periodic job

Bug #1732174 reported by Alfredo Moralejo
This bug affects 1 person
Affects: tripleo
Status: Triaged
Importance: Critical
Assigned to: Unassigned

Bug Description

The latest runs of the periodic fs20 (featureset020) job for master are failing during tempest validation with a similar issue [1].

Apparently, the jobs cannot reach the instances through their floating IPs, although the instances were created properly.

[1] https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/21a7e9a/undercloud/home/jenkins/tempest_output.log.txt.gz

tags: added: ci promotion-blocker
Changed in tripleo:
importance: Undecided → Critical
milestone: none → queens-2
Attila Darazs (adarazs) wrote :

Some of these failures might be related to the wrong MTU size set in the RDO Cloud configs: https://bugs.launchpad.net/tripleo/+bug/1731988

Ronelle Landy (rlandy) wrote :

Checking a full tempest run with https://review.openstack.org/#/c/519486/ to see if it fixes the tempest failures.

Ronelle Landy (rlandy) wrote :

The last run was down to 1 failure, the same as in Pike:
https://bugs.launchpad.net/tripleo/+bug/1731988

Alfredo Moralejo (amoralej) wrote :

Yes, the same in subsequent runs. I'm closing this; however, do we know what fixed the previous issue?

Attila Darazs (adarazs) wrote :

We have these 29 failed tests in both the master and the pike promotion jobs in featureset020.

This is a very recent failure in pike promotion: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-pike/1c38af4/stackviz/#/stdin

Putting this to triaged to have it escalated.

Changed in tripleo:
status: New → Triaged
Ronelle Landy (rlandy) wrote :

The 29-failure result is sporadic, which makes it harder to debug.
The same job can hit one failure on one run and then 29 on a rerun.

Attila Darazs (adarazs) wrote :

Looking at the timeline, it looks like tests start to fail in the second half of the tempest run:

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-pike/1c38af4/stackviz/#/stdin/timeline

Also, all the tests I checked fail with various connection timeouts to VMs. Either the networking goes down in the later stages of the run, or it never works at all and these are simply the tests that require a network connection to instances.

Alfredo Moralejo (amoralej) wrote :

Note that we only hit this in featureset020. What is different from the other OVB jobs running in periodic (featureset002 and 024)?
