NetworkSecGroupTest failing on fs020 stein/train

Bug #1878248 reported by Rafael Folco
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   Critical     Rafael Folco

Bug Description

https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-stein&job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train#

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-stein/68f40d9/logs/tempest.html.gz

neutron_tempest_plugin.scenario.test_security_groups.NetworkSecGroupTest
test_multiple_ports_portrange_remote[id-f07d0159-8f9e-4faa-87f5-a869ab0ad489] fail

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron_tempest_plugin/scenario/test_security_groups.py", line 398, in test_multiple_ports_portrange_remote
    test_ip, port)
  File "/usr/lib/python2.7/site-packages/neutron_tempest_plugin/scenario/test_security_groups.py", line 60, in _verify_http_connection
    raise e
neutron_tempest_plugin.common.utils.SSHExecCommandFailed: Command u'curl http://10.100.0.10:80 --retry 3 --connect-timeout 2' failed, exit status: 28, stderr:
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed

  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
  0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0Warning: Transient problem: timeout Will retry in 1 seconds. 3 retries left.

  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
  0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0Warning: Transient problem: timeout Will retry in 2 seconds. 2 retries left.

  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
  0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0Warning: Transient problem: timeout Will retry in 4 seconds. 1 retries left.

  0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
  0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0curl: (28) connect() timed out!

stdout:
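
For anyone reproducing the timeout outside tempest, the following is a minimal Python sketch of what the failing probe amounts to: a TCP connect with a 2 second timeout, retried 3 times with the doubling delays seen in the log above (1, 2, 4 seconds), ending in curl's exit status 28 (operation timed out). The names `probe` and `backoff_delays`, and the injectable `connect` and `sleep` parameters, are hypothetical and are not part of neutron_tempest_plugin.

```python
import socket
import time

CURL_TIMEOUT_EXIT = 28  # curl's exit status for an operation timeout


def backoff_delays(retries, first_delay=1):
    """Delays between attempts, doubling each time as in the log (1, 2, 4)."""
    return [first_delay * 2 ** i for i in range(retries)]


def probe(host, port, retries=3, connect_timeout=2,
          connect=socket.create_connection, sleep=time.sleep):
    """Return 0 if a TCP connection succeeds, else 28 once every attempt timed out.

    Only timeouts are retried; other socket errors propagate to the caller.
    """
    delays = backoff_delays(retries)
    for attempt in range(retries + 1):
        try:
            conn = connect((host, port), timeout=connect_timeout)
            conn.close()
            return 0
        except socket.timeout:
            # Wait before the next attempt, but not after the last one.
            if attempt < retries:
                sleep(delays[attempt])
    return CURL_TIMEOUT_EXIT
```

Run from the test VM's network namespace, `probe("10.100.0.10", 80)` returning 28 would point at the same connectivity problem the test hit, independent of curl itself.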

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)

Fix proposed to branch: master
Review: https://review.opendev.org/727287

Changed in tripleo:
status: Triaged → In Progress
wes hayutin (weshayutin)
tags: added: promotion-blocker
removed: alert ci
Changed in tripleo:
assignee: Rafael Folco (rafaelfolco) → wes hayutin (weshayutin)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/727287
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=790fd4a81ab083ac7e3307351bd1a8a38a7e7dfe
Submitter: Zuul
Branch: master

commit 790fd4a81ab083ac7e3307351bd1a8a38a7e7dfe
Author: Rafael Folco <email address hidden>
Date: Tue May 12 14:44:31 2020 -0300

    [skiplist] NetworkSecGroupTest timeout

    NetworkSecGroupTest timeout failures on stein.

    Partial-Bug: #1878248

    https://bugs.launchpad.net/tripleo/+bug/1878248

    Change-Id: I37592ca0f836d725570272c9a9020122bf83c9a7

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)

Fix proposed to branch: master
Review: https://review.opendev.org/729594

Changed in tripleo:
assignee: wes hayutin (weshayutin) → Rafael Folco (rafaelfolco)
Maciej Jozefczyk (maciejjozefczyk) wrote :

We have blacklisted this test in networking-ovn stable/stein:

https://review.opendev.org/#/c/710710/

The reason is that the fix for this bug is in core OVN [1].
It was released in OVN 2.12.

It looks like the stein/train TripleO gates don't include it.

[1] https://patchwork.ozlabs.org/patch/1185708/

Maciej Jozefczyk (maciejjozefczyk) wrote :

Since stable/stein uses OVN 2.11, we need to disable that test for Stein.
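
The rule above (skip the test wherever OVN is older than 2.12) can be sketched as a version check. This is only an illustration: `parse_version` and `should_skip` are hypothetical helpers, not anything from the skiplist tooling, and the sketch assumes plain dotted numeric version strings.

```python
def parse_version(text):
    """Turn a dotted version such as '2.11' into a tuple of ints: (2, 11)."""
    return tuple(int(part) for part in text.split("."))


# The fix is reported in this bug as released in OVN 2.12.
FIXED_IN = parse_version("2.12")


def should_skip(ovn_version):
    """True when the deployed OVN predates the release carrying the fix."""
    return parse_version(ovn_version) < FIXED_IN
```

Comparing integer tuples rather than raw strings matters here, because as strings "2.9" would sort after "2.12".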

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/729594
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=5a7f69b1a3f2501c74c6530c4f4d2bb5a60e3d43
Submitter: Zuul
Branch: master

commit 5a7f69b1a3f2501c74c6530c4f4d2bb5a60e3d43
Author: Rafael Folco <email address hidden>
Date: Wed May 20 10:13:41 2020 -0300

    [skiplist] NetworkSecGroupTest timeout (train)

    NetworkSecGroupTest timeout failures on train, producing
    intermittent results.

    To be investigated. Blocking train promotions on fs020.

    Partial-Bug: #1878248

    https://bugs.launchpad.net/tripleo/+bug/1878248

    Change-Id: I107f169880c5bebeeeb61def0c4aadba4f8fa51a

Maciej Jozefczyk (maciejjozefczyk) wrote :

For stable/train, the random failures might be related to this bug:
https://github.com/cirros-dev/cirros/issues/8

Because the test spawns several VMs, we found that Cirros sometimes fails to fetch the public SSH keys and gives up without retrying.
The test then fails with an authentication error.

I added a retry to the Cirros ec2metadata script, and it has recently been released. We should switch to Cirros 0.5.1 in devstack and in the TripleO tests to prevent this from happening again.

https://github.com/cirros-dev/cirros/tree/0.5.1

Taken from a stable/train failure:
2020-05-20 09:25:23,053 346350 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@10.0.0.115 (Authentication failed.). Number attempts: 21. Retry after 22 seconds.
2020-05-20 09:25:45,598 346350 INFO [paramiko.transport] Connected (version 2.0, client dropbear_2015.67)
2020-05-20 09:25:45,841 346350 INFO [paramiko.transport] Authentication (publickey) failed.
2020-05-20 09:25:45,973 346350 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@10.0.0.115 (Authentication failed.). Number attempts: 22. Retry after 23 seconds.
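
The idea behind the ec2metadata retry mentioned above can be sketched in Python. This is not the actual Cirros change (that script is shell), only an illustration under stated assumptions: `fetch_with_retry` is a hypothetical name, `fetch` stands in for whatever call retrieves the public keys from the metadata service, and the attempt count and delay are arbitrary.

```python
import time


def fetch_with_retry(fetch, attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out.

    fetch is a hypothetical zero-argument callable that returns the key data
    or raises OSError when the metadata service is unreachable.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = fetch()
            if result:
                return result
            last_error = ValueError("empty response")
        except OSError as exc:
            last_error = exc
        # Pause between attempts, but not after the final one.
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError(
        "metadata not available after %d attempts" % attempts
    ) from last_error
```

Without a loop of this kind, a single failed request at boot leaves the instance without its authorized keys, which is consistent with the repeated "Authentication failed." lines in the log above.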

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)

Fix proposed to branch: master
Review: https://review.opendev.org/733170

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart (master)

Fix proposed to branch: master
Review: https://review.opendev.org/733634

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart (master)

Change abandoned by Rafael Folco (<email address hidden>) on branch: master
Review: https://review.opendev.org/733634
Reason: https://review.opendev.org/733676

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart-extras (master)

Change abandoned by Rafael Folco (<email address hidden>) on branch: master
Review: https://review.opendev.org/733170

Emilien Macchi (emilienm) wrote :

https://review.opendev.org/#/c/733676/ seems to be the current fix

Ronelle Landy (rlandy) wrote :

Closing this out - stein has been running green.

https://review.opendev.org/#/c/733676/3 should also address this.

Changed in tripleo:
status: In Progress → Fix Released
Changed in tripleo:
status: Fix Released → Triaged
Changed in tripleo:
status: Triaged → Fix Released