TestNetworkBasicOps.test_connectivity_between_vms_on_different_networks in master fails on auth

Bug #1921400 reported by wes hayutin
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master/15dae5e/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz

Failed 1 tests - output below:
==============================

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_connectivity_between_vms_on_different_networks[compute,id-1546850e-fbaa-42f5-8b5f-03d8a6a95f15,network,slow]
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 113, in _get_ssh_connection
        sock=proxy_chan)
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 446, in connect
        passphrase,
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 764, in _auth
        raise saved_exception
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 664, in _auth
        self._transport.auth_publickey(username, pkey)
      File "/usr/lib/python3.6/site-packages/paramiko/transport.py", line 1580, in auth_publickey
        return self.auth_handler.wait_for_response(my_event)
      File "/usr/lib/python3.6/site-packages/paramiko/auth_handler.py", line 250, in wait_for_response
        raise e
    paramiko.ssh_exception.AuthenticationException: Authentication failed.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
        return f(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 490, in test_connectivity_between_vms_on_different_networks
        self._check_public_network_connectivity(should_connect=True)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 213, in _check_public_network_connectivity
        message, server, mtu=mtu)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 868, in check_vm_connectivity
        server=server)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 638, in get_remote_client
        linux_client.validate_authentication()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 59, in wrapper
        six.reraise(*original_exception)
      File "/usr/local/lib/python3.6/site-packages/six.py", line 703, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 32, in wrapper
        return function(self, *args, **kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 115, in validate_authentication
        self.ssh_client.test_connection_auth()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 217, in test_connection_auth
        connection = self._get_ssh_connection()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 129, in _get_ssh_connection
        password=self.password)
    tempest.lib.exceptions.SSHTimeout: Connection to the 192.168.24.125 via SSH timed out.
    User: cirros, Password: None
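
Reading the two chained tracebacks together: public-key authentication was rejected (AuthenticationException), and the SSHTimeout was then raised while that error was being handled, which is why an auth failure is reported as a timeout. The Python sketch below only illustrates the general retry-until-deadline pattern that produces such a chain. `connect_with_retries`, `AuthFailed` and `ConnectTimeout` are hypothetical names for illustration, not tempest or paramiko APIs.

```python
import time


class AuthFailed(Exception):
    """Hypothetical stand-in for an authentication error."""


class ConnectTimeout(Exception):
    """Hypothetical stand-in for the final timeout error."""


def connect_with_retries(connect, timeout=1.0, backoff=0.1,
                         clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect()
        except AuthFailed:
            if clock() >= deadline:
                # Raised inside the except block, so Python chains both
                # errors ("During handling of the above exception ...").
                raise ConnectTimeout(
                    "timed out after %d attempts" % attempts)
            sleep(backoff)


def always_rejected():
    """Simulates a guest that never accepts the offered key."""
    raise AuthFailed("Authentication failed.")
```

With a callable such as `always_rejected`, the caller only sees `ConnectTimeout`; the original auth error is still available on its `__context__` attribute.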

Detailed neutron trace here:
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master/15dae5e/logs/undercloud/var/log/extra/errors.txt.txt.gz

Slawek Kaplonski (slaweq) wrote :

This seems to be an error in the ovn-metadata agent:

2021-03-25 10:10:29.671 166886 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: PortBindingChassisCreatedEvent(events=('update',), table='Port_Binding', conditions=None, old_conditions=None) to row=Port_Binding(parent_port=[], chassis=[<ovs.db.idl.Row object at 0x7f64bb6545f8>], mac=['fa:16:3e:10:ba:5a 10.100.0.6'], options={'requested-chassis': 'standalone.localdomain'}, ha_chassis_group=[], type=, tag=[], tunnel_key=3, up=[False], logical_port=ca3ab19f-a727-43c3-b221-b3b044c41a39, gateway_chassis=[], encap=[], external_ids={'neutron:cidrs': '10.100.0.6/28', 'neutron:device_id': 'b71c3364-8c5c-43d5-b964-e00dc0fc55d0', 'neutron:device_owner': 'compute:nova', 'neutron:network_name': 'neutron-35b32837-4e1b-440f-828b-5b4adc6e4e3e', 'neutron:port_name': '', 'neutron:project_id': 'dc4b3a8ed09a49cda03b4d63440afcc2', 'neutron:revision_number': '2', 'neutron:security_group_ids': 'acf07f3a-170f-492f-b2ce-e45055db1593'}, virtual_parent=[], nat_addresses=[], datapath=da7baee1-4bbb-4673-82f4-c2b6085efe68) old=Port_Binding(chassis=[]) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2021-03-25 10:10:29.673 166886 INFO neutron.agent.ovn.metadata.agent [-] Port ca3ab19f-a727-43c3-b221-b3b044c41a39 in datapath 35b32837-4e1b-440f-828b-5b4adc6e4e3e bound to our chassis
2021-03-25 10:10:29.678 166886 DEBUG neutron.agent.ovn.metadata.agent [-] Provisioning metadata for network 35b32837-4e1b-440f-828b-5b4adc6e4e3e provision_datapath /usr/lib/python3.6/site-packages/neutron/agent/ovn/metadata/agent.py:411
2021-03-25 10:10:29.706 167606 DEBUG oslo.privsep.daemon [-] privsep: reply[140070617461112]: (4, False) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-03-25 10:10:29.707 166886 DEBUG neutron.agent.ovn.metadata.agent [-] Creating VETH tap35b32837-41 in ovnmeta-35b32837-4e1b-440f-828b-5b4adc6e4e3e namespace provision_datapath /usr/lib/python3.6/site-packages/neutron/agent/ovn/metadata/agent.py:452
2021-03-25 10:10:29.711 167606 DEBUG neutron.privileged.agent.linux.ip_lib [-] Interface tap35b32837-40 not found in namespace None get_link_id /usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py:251
2021-03-25 10:10:29.711 167606 DEBUG oslo.privsep.daemon [-] privsep: reply[140070617461112]: (4, False) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-03-25 10:10:29.715 167606 DEBUG oslo.privsep.daemon [-] privsep: reply[140070617461112]: (4, ['ovnmeta-0aac5bc0-5bd1-473f-b387-96ad368de639', 'ovnmeta-413a30b5-1611-4783-a10e-266169276ee8', 'ovnmeta-9412af3f-dd39-4651-b208-f22de81d7904']) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-03-25 10:10:29.736 167606 DEBUG oslo.privsep.daemon [-] privsep: reply[140070617461112]: (4, None) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-03-25 10:10:29.775 167606 DEBUG oslo.privsep.daemon [-] privsep: reply[140070617461112]: (4, ('net.ipv4.conf.all.promote_secondaries = 1\n', '', 0)) _call_back /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-03-25 10:10:29.810 167606 DEBUG oslo.privsep.daemon ...
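
To narrow a long agent log like the excerpt above to one component, the lines can be split into fields first. The Python sketch below assumes the layout visible in the excerpt (timestamp, PID, level, logger name, bracketed request context, message). `LINE_RE`, `parse_line` and `SAMPLE` are hypothetical names, and the two sample lines are copied from the excerpt.

```python
import re

# Field layout assumed from the excerpt above, not from a format spec.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) "
    r"\[(?P<context>[^\]]*)\] (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if no match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


SAMPLE = [
    "2021-03-25 10:10:29.673 166886 INFO neutron.agent.ovn.metadata.agent "
    "[-] Port ca3ab19f-a727-43c3-b221-b3b044c41a39 in datapath "
    "35b32837-4e1b-440f-828b-5b4adc6e4e3e bound to our chassis",
    "2021-03-25 10:10:29.711 167606 DEBUG "
    "neutron.privileged.agent.linux.ip_lib [-] Interface tap35b32837-40 "
    "not found in namespace None get_link_id /usr/lib/python3.6/"
    "site-packages/neutron/privileged/agent/linux/ip_lib.py:251",
]

# Keep only records emitted by the OVN metadata agent itself.
records = [parse_line(line) for line in SAMPLE]
metadata_records = [
    rec for rec in records
    if rec and rec["logger"].startswith("neutron.agent.ovn.metadata")
]
```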


Slawek Kaplonski (slaweq) wrote :

Looking at Elasticsearch (https://review.rdoproject.org/analytics/goto/54254388c69c23ec988de63d8a0acd31), it seems that the same issue happens from time to time in nova-compute (libvirt) when it tries to create an interface. It seems to have started around 25.03.2021.
Maybe this is a netlink/SELinux issue?

Cédric Jeanneret (cjeanner) wrote :

I don't think it's linked to SELinux, since CI runs in permissive mode... Though there are a lot of denials:
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-full-tempest-scenario-master/15dae5e/logs/undercloud/var/log/extra/denials.txt.gz

but they are linked to lsof (probably the healthchecks... meh).

David Vallee Delisle (valleedelisle) wrote :

While most of the lsof denials were logged in permissive mode, there is one denial that was not (permissive=0), raised during the install process:

----
time->Wed Apr 7 12:01:17 2021
type=PROCTITLE msg=audit(1617811277.064:1315): proctitle=696E7374616C6C002D64002D6D00373535002D6F006F70656E76737769746368002D670068756765746C626673002F7661722F72756E2F6F70656E76737769746368
type=SYSCALL msg=audit(1617811277.064:1315): arch=c000003e syscall=91 success=yes exit=0 a0=3 a1=1ed a2=0 a3=0 items=0 ppid=6561 pid=6752 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="install" exe="/usr/bin/install" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=AVC msg=audit(1617811277.064:1315): avc: denied { fsetid } for pid=6752 comm="install" capability=4 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=capability permissive=0
type=AVC msg=audit(1617811277.064:1315): avc: denied { fsetid } for pid=6752 comm="install" capability=4 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=capability permissive=0

I'm still looking but I think this is a lead.
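
The PROCTITLE field in the audit record above is hex-encoded, with NUL bytes separating the arguments, so the command that triggered the denial can be recovered from it. The Python sketch below decodes that field and pulls the main fields out of the AVC line. `decode_proctitle` and `parse_avc` are hypothetical helper names, and the two constants are copied from the record above.

```python
import re

# Hex payload of the PROCTITLE line quoted above.
PROCTITLE = "696E7374616C6C002D64002D6D00373535002D6F006F70656E76737769746368002D670068756765746C626673002F7661722F72756E2F6F70656E76737769746368"

# The AVC line quoted above.
AVC = ('type=AVC msg=audit(1617811277.064:1315): avc: denied { fsetid } '
       'for pid=6752 comm="install" capability=4 '
       'scontext=system_u:system_r:openvswitch_t:s0 '
       'tcontext=system_u:system_r:openvswitch_t:s0 '
       'tclass=capability permissive=0')


def decode_proctitle(hex_value):
    """Turn a hex-encoded PROCTITLE into an argv list (NUL-separated)."""
    raw = bytes.fromhex(hex_value)
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\x00")]


def parse_avc(line):
    """Extract the denied permissions and key=value fields of an AVC line."""
    perms = re.search(r"\{([^}]*)\}", line)
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


argv = decode_proctitle(PROCTITLE)
avc = parse_avc(AVC)
print(" ".join(argv))
print(avc["comm"], avc["permissions"], "permissive=" + avc["permissive"])
```

For this record the decoded command is `install -d -m 755 -o openvswitch -g hugetlbfs /var/run/openvswitch`, so the denied `fsetid` capability was requested by `install` running in the `openvswitch_t` domain.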

chandan kumar (chkumar246) wrote :

We are seeing it in the upgrade job:
https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2da/785195/1/check/tripleo-ci-centos-8-standalone-upgrade-victoria/2da7930/logs/undercloud/var/log/tempest/tempest_run.log
```
{0} tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario [367.974139s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 113, in _get_ssh_connection
        sock=proxy_chan)
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 446, in connect
        passphrase,
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 764, in _auth
        raise saved_exception
      File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 664, in _auth
        self._transport.auth_publickey(username, pkey)
      File "/usr/lib/python3.6/site-packages/paramiko/transport.py", line 1580, in auth_publickey
        return self.auth_handler.wait_for_response(my_event)
      File "/usr/lib/python3.6/site-packages/paramiko/auth_handler.py", line 250, in wait_for_response
        raise e
    paramiko.ssh_exception.AuthenticationException: Authentication failed.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 90, in wrapper
        return f(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_minimum_basic.py", line 151, in test_minimum_basic_scenario
        server=server)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 638, in get_remote_client
        linux_client.validate_authentication()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 59, in wrapper
        six.reraise(*original_exception)
      File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 32, in wrapper
        return function(self, *args, **kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 115, in validate_authentication
        self.ssh_client.test_connection_auth()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 217, in test_connection_auth
        connection = self._get_ssh_connection()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 129, in _get_ssh_connection
        password=self.password)
    tempest.lib.exceptions.SSHTimeout: Connection to the 192.168.24.117 via SSH timed out.
    User: cirros, Password: None
```

Changed in tripleo:
milestone: wallaby-rc1 → xena-1
wes hayutin (weshayutin) wrote :

test fails in master:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 458, in test_mtu_sized_frames
    should_connect=True, mtu=self.network['mtu'])
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 214, in _check_public_network_connectivity
    message, server, mtu=mtu)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 950, in check_vm_connectivity
    mtu=mtu, server=server),
  File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 913, in ping_ip_address
    self.log_console_output([server])
  File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 789, in log_console_output
    server['id'], **kwargs)['output']
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/compute/servers_client.py", line 644, in get_console_output
    schema.get_console_output, **kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/compute/servers_client.py", line 216, in action
    post_body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 299, in post
    return self.request('POST', url, extra_headers, headers, body, chunked)
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/compute/base_compute_client.py", line 48, in request
    method, url, extra_headers, headers, body, chunked)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 703, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 880, in _error_checker
    message=message)
tempest.lib.exceptions.ServerFault: Got server fault
Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.NovaException'>

https://e50a3f77a7aa536f5fb8-4891f56aa8930b73f436c9ec6753dd80.ssl.cf1.rackcdn.com/795005/1/gate/tripleo-ci-centos-8-containers-multinode/3a93dee/logs/undercloud/var/log/tempest/stestr_results.html

Changed in tripleo:
milestone: xena-1 → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3
Ronelle Landy (rlandy) wrote :

https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-containers-multinode is passing - no trace of bug or test in skiplist - closing this out

Changed in tripleo:
status: Triaged → Fix Released