tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_port_security_macspoofing_port fails with a timeout in upstream infra, not in rdo infra

Bug #1936420 reported by wes hayutin
Affects | Status  | Importance | Assigned to  | Milestone
tripleo | Triaged | Critical   | Kamil Sambor |

Bug Description

https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_444/800859/1/gate/tripleo-ci-centos-8-containers-multinode/4445c5b/logs/undercloud/var/log/tempest/stestr_results.html

https://416cdf69510d61e0d92e-41ddd4f23837a418873568ca0bb76101.ssl.cf1.rackcdn.com/800859/1/gate/tripleo-ci-centos-7-containers-multinode/b28fdd3/logs/undercloud/var/log/tempest/stestr_results.html

1 packets transmitted, 0 packets received, 100% packet loss

2021-07-15 17:02:31,609 136917 WARNING [tempest.scenario.manager] Failed to check icmp connectivity for IP 10.100.0.18 via a ssh connection from: 192.168.24.154.
2021-07-15 17:02:32,610 136917 DEBUG [tempest.lib.common.utils.linux.remote_client] Remote command: set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; sudo ping -I eth1 -c1 -w1 -s56 10.100.0.18
2021-07-15 17:02:32,611 136917 INFO [tempest.lib.common.ssh] Creating ssh connection to '192.168.24.154:22' as 'cirros' with public key authentication
2021-07-15 17:02:32,668 136917 INFO [paramiko.transport] Connected (version 2.0, client dropbear_2018.76)
2021-07-15 17:02:32,911 136917 INFO [paramiko.transport] Authentication (publickey) successful!
2021-07-15 17:02:32,917 136917 INFO [tempest.lib.common.ssh] ssh connection to cirros@192.168.24.154 successfully created
2021-07-15 17:02:43,098 136917 ERROR [tempest.lib.common.utils.linux.remote_client] (TestNetworkBasicOps:test_port_security_macspoofing_port) Executing command on 192.168.24.154 failed. Error: Command 'set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; sudo ping -I eth1 -c1 -w1 -s56 10.100.0.18' failed, exit status: 1, stderr:

stdout:
PING 10.100.0.18 (10.100.0.18): 56 data bytes

AssertionError: Timed out waiting for 10.100.0.18 to become reachable from 192.168.24.154
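
The log above shows a single-packet ping being retried over ssh until the check gives up. A minimal sketch of that kind of poll-until-deadline loop follows. The names `wait_until_reachable` and `ping_once` are hypothetical and chosen for illustration; this is not tempest's actual `check_remote_connectivity` code.

```python
import time
from typing import Callable


def wait_until_reachable(
    ping_once: Callable[[], bool],
    timeout: float,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call ping_once() until it succeeds or `timeout` seconds have passed.

    ping_once is a hypothetical callable that returns True when a single
    ping got a reply (for example, by running `ping -c1 -w1` remotely).
    """
    deadline = time.monotonic() + timeout
    while True:
        if ping_once():
            return True
        if time.monotonic() >= deadline:
            # Caller would turn this into "Timed out waiting for ... to
            # become reachable".
            return False
        sleep(interval)
```

With a `ping_once` that always returns False, the loop returns False once the deadline passes, which corresponds to the AssertionError shown above.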

Possibly a reason to permanently skip this test.

Tags: alert
Bhagyashri Shewale (bhagyashri-shewale) wrote :
chandan kumar (chkumar246) wrote :

Seen against this patch: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/799981
Log: https://b18b498c87a70760171a-55be5aa52ff9f3c19ea7817c824a7dd8.ssl.cf5.rackcdn.com/799981/1/check/tripleo-ci-centos-8-standalone/d4c3fa4/logs/undercloud/var/log/tempest/tempest_run.log

```
{0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_port_security_macspoofing_port [255.406850s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 109, in wrapper
        return func(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 90, in wrapper
        return f(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 903, in test_port_security_macspoofing_port
        nic=spoof_nic, should_succeed=True)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 1387, in check_remote_connectivity
        self.fail(msg)
      File "/usr/lib/python3.6/site-packages/unittest2/case.py", line 693, in fail
        raise self.failureException(msg)
    AssertionError: Timed out waiting for 10.100.0.21 to become reachable from 192.168.24.173
```

We are seeing the same issue; moving this job to the skip list.

Changed in tripleo:
milestone: xena-rc1 → none
milestone: none → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3
tags: added: promotion-blocker
Marios Andreou (marios-b) wrote :

Did some digging here. On the multinode jobs there is a pretty clear error in the dhcp-agent logs [1][2]: "Device tapc0c2f8e7-d3 cannot be used as it has no MAC address"

 2021-07-26 13:20:24.116 126380 DEBUG neutron.agent.linux.utils [req-ac28e47a-1e4e-4ef3-b310-9f17f93b9c95 - - - - -] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qdhcp-2dd0cdd7-df67-4ee5-910b-bfa09ee0fc55', 'sysctl', '-w', 'net.ipv4.ip_nonlocal_bind=1'] execute_rootwrap_daemon /usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py:104
 2021-07-26 13:20:24.149 126380 DEBUG neutron.agent.linux.utils [req-ac28e47a-1e4e-4ef3-b310-9f17f93b9c95 - - - - -] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qdhcp-2dd0cdd7-df67-4ee5-910b-bfa09ee0fc55', 'sysctl', '-w', 'net.ipv6.conf.default.accept_ra=0'] execute_rootwrap_daemon /usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py:104
 2021-07-26 13:20:24.198 139823 DEBUG neutron.privileged.agent.linux.ip_lib [-] Interface tapc0c2f8e7-d3 not found in namespace qdhcp-2dd0cdd7-df67-4ee5-910b-bfa09ee0fc55 get_link_id /usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py:290
 2021-07-26 13:20:24.198 139823 DEBUG oslo.privsep.daemon [-] privsep: reply[139750036169080]: (4, False) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
 2021-07-26 13:20:24.200 126380 ERROR neutron.agent.linux.ip_lib [req-ac28e47a-1e4e-4ef3-b310-9f17f93b9c95 - - - - -] Device tapc0c2f8e7-d3 cannot be used as it has no MAC address
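
To pull lines like the ERROR above out of a long agent log, something along these lines can help. It is a rough sketch that assumes the `timestamp pid LEVEL logger [context] message` layout seen in the excerpt; `LOG_LINE`, `LogRecord` and `iter_records` are hypothetical names, and other log formats (the tempest log, for example) would need a different pattern.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Pattern for the layout seen in the dhcp-agent excerpt above (an assumption,
# not an official format definition).
LOG_LINE = re.compile(
    r"^\s*(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[.,]\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) "
    r"\[(?P<context>[^\]]*)\] (?P<message>.*)$"
)


class LogRecord(NamedTuple):
    timestamp: str
    pid: int
    level: str
    logger: str
    message: str


def iter_records(lines: Iterable[str], level: str = "ERROR") -> Iterator[LogRecord]:
    """Yield parsed records whose level matches; unparseable lines are skipped."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == level:
            yield LogRecord(
                timestamp=match.group("timestamp"),
                pid=int(match.group("pid")),
                level=match.group("level"),
                logger=match.group("logger"),
                message=match.group("message"),
            )


# Two lines copied from the excerpt above.
SAMPLE = """\
 2021-07-26 13:20:24.198 139823 DEBUG neutron.privileged.agent.linux.ip_lib [-] Interface tapc0c2f8e7-d3 not found in namespace qdhcp-2dd0cdd7-df67-4ee5-910b-bfa09ee0fc55 get_link_id /usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py:290
 2021-07-26 13:20:24.200 126380 ERROR neutron.agent.linux.ip_lib [req-ac28e47a-1e4e-4ef3-b310-9f17f93b9c95 - - - - -] Device tapc0c2f8e7-d3 cannot be used as it has no MAC address
"""

for record in iter_records(SAMPLE.splitlines()):
    print(record.timestamp, record.logger, record.message)
```

Only the ERROR line is reported here; passing `level="DEBUG"` would select the "not found in namespace" line instead.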

On the standalone jobs there is no dhcp-agent container, so we don't see that same error; however, there are different errors in the neutron server.log [3]

        2021-07-19 02:08:49.050 15 ERROR networking_ovn.ovsdb.ovsdb_monitor [-] HashRing is empty, error: Hash Ring returned empty when hashing "b'908a8d0b-3423-4817-bc0b-384f3f8941fc'". This should never happen in a normal situation, please check the status of your cluster: networking_ovn.common.exceptions.HashRingIsEmpty: Hash Ring returned empty when hashing "b'908a8d0b-3423-4817-bc0b-384f3f8941fc'". This should never happen in a normal situation, please check the status of your cluster
...
       2021-07-19 02:16:14.759 15 INFO neutron.services.segments.plugin [-] Segment 4c047c5c-c230-4c8f-98ed-b5b15a5c2e90 resource provider not found; error: Placement Client Error (4xx): {"errors": [{"status": 404, "title": "Not Found", "detail": "The resource could not be found.\n\n No resource provider with uuid 4c047c5c-c230-4c8f-98ed-b5b15a5c2e90 found for delete ", "request_id": "req-ac607e4b-5e09-4ea0-90cf-eebcc2dcbfd3"}]}

[1] https://logserver.rdoproject.org/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-containers-multinode-network-train/794b86d/logs/undercloud/var/log/extra/errors.txt.gz
[2] https://logserver.rdoproject.org/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-containers-multinode-network-train/794b86d/logs/undercloud/var/log/containers/neutron/dhcp-agent.log.txt.gz
[3] https://b18b498c87a70760171a-55be5aa52ff9f3c19ea7817c824a7dd8.ssl.cf5.rackcdn.com/799981/1/check/tripleo-ci-centos-8-standal...

Kamil Sambor (ksambor)
Changed in tripleo:
assignee: nobody → Kamil Sambor (ksambor)
Marios Andreou (marios-b) wrote :

@Kamil is working on this

        * Test reverts for train https://review.rdoproject.org/r/c/testproject/+/34849/
        * Depends-On: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/803434
        * Depends-On: https://review.opendev.org/c/openstack/networking-ovn/+/803433

Slawek Kaplonski (slaweq) wrote :

I think I found why this test is failing in Train jobs. In Train we are using ovn-2.12.0-10.el8, and there was a bug in OVN that caused the macspoofing test to fail. The bug was fixed by https://patchwork.ozlabs<email address hidden>/ which is in ovn 20.03 IIRC. It is certainly not in ovn 2.12, which we have in Train jobs.
Thus I think this test should remain skipped, as is done by https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/802527

I also think that this was skipped before, and patch https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/800913, which was merged on the 15th of July, unskipped it :) That's why it started failing that day :)

Based on all of the above, I think this bug can be closed now, as the test is skipped.
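
As a rough illustration of the version reasoning above, a job could decide whether to skip the test from the installed OVN package name. This is only a sketch: `ovn_version` and `has_macspoofing_fix` are hypothetical helpers, the 20.03 threshold is the "IIRC" value from this comment rather than a confirmed release, and the plain numeric comparison assumes 2.12 sorts before 20.03 across the two numbering schemes.

```python
import re
from typing import Tuple

# Threshold taken from the comment above ("ovn 20.03 IIRC"); unconfirmed.
FIX_VERSION: Tuple[int, ...] = (20, 3)


def ovn_version(package: str) -> Tuple[int, ...]:
    """Extract the numeric version from a name like 'ovn-2.12.0-10.el8'."""
    match = re.match(r"^ovn-(\d+(?:\.\d+)*)", package)
    if not match:
        raise ValueError(f"not an ovn package name: {package!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def has_macspoofing_fix(package: str) -> bool:
    """True if the package version is at or above the assumed fix version."""
    return ovn_version(package) >= FIX_VERSION


# Package seen in the Train jobs, per the comment above.
print(has_macspoofing_fix("ovn-2.12.0-10.el8"))
```

For the Train package this evaluates to False, which matches the conclusion that the test should stay skipped there.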

Ronelle Landy (rlandy)
tags: removed: promotion-blocker