Investigate tempest failures when using security groups

Bug #1784800 reported by Bernard Cafarelli
Affects: networking-sfc
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

To fix tempest gates for Rocky, tests were updated to have port security disabled in https://bugs.launchpad.net/networking-sfc/+bug/1783997

This is only a workaround, though; we should investigate why tests with a wildcard security group stopped working during that cycle.

To reproduce, revert the networking_sfc/tests/tempest_plugin/tests/scenario changes from https://github.com/openstack/networking-sfc/commit/2000d47b57e093f7de844b22ff67555d7933bc55

Possible root causes: changes in the security group defaults, or the switch to the Open vSwitch firewall driver (ovsfw)?

Pasting information from the previous bug:
Since around the Rocky mid-cycle, the tempest gates have failed on every test. Sample failure:
http://logs.openstack.org/05/575705/4/check/networking-sfc-tempest-dsvm/fefcd56/

VM creation looks OK, but the test fails when it connects to one VM and runs traceroute to the other:
2018-07-23 17:56:56.323 6755 INFO tempest.lib.common.ssh [-] Creating ssh connection to '172.24.5.20:22' as 'cirros' with public key authentication
2018-07-23 17:56:56.333 6755 INFO paramiko.transport [-] Connected (version 2.0, client dropbear_2012.55)
2018-07-23 17:56:56.607 6755 INFO paramiko.transport [-] Authentication (publickey) successful!
2018-07-23 17:56:56.608 6755 INFO tempest.lib.common.ssh [-] ssh connection to cirros@172.24.5.20 successfully created
2018-07-23 18:00:13.667 6755 ERROR tempest.lib.common.utils.linux.remote_client [-] (TestSfc:test_create_port_chain) Executing command on 172.24.5.20 failed. Error: Request timed out
Details: Command: 'set -eu -o pipefail; PATH=$PATH:/sbin; traceroute -n -I 10.1.0.13' executed on host '172.24.5.20'.: TimeoutException: Request timed out

After some digging, I suspect a security group issue: I deployed a master devstack and tested SFC manually, and it still works fine, but I disable port security in my manual tests.
While the tempest test was running, I did a quick test and ran "openstack port set --disable-port-security --no-security-group" on all ports related to the test.

This allowed traceroute to finally report in:
    traceroute to 10.0.0.5 (10.0.0.5), 30 hops max, 46 byte packets
     1 * * *
     2 * * *
     3 * * *
     4 * * *
     5 * * *
     6 * 10.0.0.5 2.316 ms 1.935 ms

    2018-07-27 15:07:36,557 16774 ERROR [networking_sfc.tests.tempest_plugin.tests.scenario.base] length mismatch:
    [u' 1 * * *', u' 2 * * *', u' 3 * * *', u' 4 * * *', u' 5 * * *']
    vs
    [[u'10.0.0.8']]
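
To make the traceroute output above easier to read, here is a minimal sketch that lists which addresses answered at each hop. The function name `responding_hops` is hypothetical, and this is not the check implemented in networking_sfc.tests.tempest_plugin.tests.scenario.base, which may work differently.

```python
# Traceroute output copied from this report.
SAMPLE = """traceroute to 10.0.0.5 (10.0.0.5), 30 hops max, 46 byte packets
 1 * * *
 2 * * *
 3 * * *
 4 * * *
 5 * * *
 6 * 10.0.0.5 2.316 ms 1.935 ms
"""


def responding_hops(traceroute_output):
    """Return one list per hop with the IPv4 addresses that answered.

    A hop where every probe timed out ('* * *') yields an empty list.
    """
    hops = []
    for line in traceroute_output.splitlines():
        fields = line.split()
        # Hop lines start with the hop number; this skips the header line.
        if not fields or not fields[0].isdigit():
            continue
        # Crude IPv4 match: three dots (timings such as '2.316' have one).
        hops.append([f for f in fields[1:] if f.count(".") == 3])
    return hops


if __name__ == "__main__":
    hops = responding_hops(SAMPLE)
    silent = sum(1 for hop in hops if not hop)
    print("hops:", len(hops), "fully timed out:", silent)
    # prints "hops: 6 fully timed out: 5"
```

On the sample, only the sixth hop (the destination, 10.0.0.5) answers. No intermediate hop address was recorded in this run.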

The leading '* * *' lines in the traceroute output were timeouts from before I disabled port security.
Also, after tweaking the code to run with port security disabled, all tests pass.
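
The manual workaround described above (running "openstack port set --disable-port-security --no-security-group" on every port of the test) could be scripted roughly as follows. This is a sketch only: the helper names and the placeholder port IDs are hypothetical, and it assumes the `openstack` CLI is installed and credentials are already loaded in the environment.

```python
import subprocess


def build_disable_cmd(port_id):
    """Build the CLI call quoted in this report for a single port."""
    return [
        "openstack", "port", "set",
        "--disable-port-security", "--no-security-group",
        port_id,
    ]


def disable_port_security(port_ids, run=subprocess.check_call):
    """Apply the command to each port.

    `run` is injectable so the commands can be printed (dry run)
    instead of executed.
    """
    for port_id in port_ids:
        run(build_disable_cmd(port_id))


if __name__ == "__main__":
    # Placeholder IDs (hypothetical); substitute the ports created by the test.
    disable_port_security(["PORT_ID_1", "PORT_ID_2"], run=print)
```

Passing `run=print` gives a dry run that only shows the commands. Leaving the default executes them and raises if any call fails.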

Tags: tempest