paramiko.ssh_exception.NoValidConnectionsError error cause dvr scenario jobs failing

Bug #1858642 reported by Slawek Kaplonski on 2020-01-07
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
neutron
High
Unassigned

Bug Description

Examples of failures:

https://01151d403671623b870a-df3a6d916dbcc54368929fa122458369.ssl.cf5.rackcdn.com/611605/16/check/neutron-tempest-plugin-dvr-multinode-scenario/3593a8b/testr_results.html.gz

https://8048a026d234f26326fc-ba67ad6475e26f6ef2d3d3155c5088fd.ssl.cf5.rackcdn.com/699465/12/check/neutron-tempest-plugin-dvr-multinode-scenario/b9ec275/testr_results.html.gz

https://5901a6854a362ccbd57a-e3afd52b12849b1e5612595129f4842a.ssl.cf5.rackcdn.com/699465/12/check/neutron-tempest-dvr-ha-multinode-full/b4c8823/testr_results.html.gz

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b70/700740/1/check/neutron-tempest-dvr-ha-multinode-full/b70792e/testr_results.html.gz

Error is like:

2019-12-24 02:08:49,727 2368 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'date': 'Tue, 24 Dec 2019 02:08:49 GMT', 'server': 'Apache/2.4.29 (Ubuntu)', 'content-type': 'application/json', 'x-compute-request-id': 'req-945aff07-a641-41e9-a295-d852ab529d70', 'content-length': '1242', 'openstack-api-version': 'volume 3.0', 'vary': 'OpenStack-API-Version', 'x-openstack-request-id': 'req-945aff07-a641-41e9-a295-d852ab529d70', 'connection': 'close', 'status': '200', 'content-location': 'https://10.208.227.14/volume/v3/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560'}
        Body: b'{"volume": {"id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "status": "in-use", "size": 1, "availability_zone": "nova", "created_at": "2019-12-24T02:08:23.000000", "updated_at": "2019-12-24T02:08:49.000000", "attachments": [{"id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "attachment_id": "e05c79a1-fe93-4bfc-b898-d79b9313d72c", "volume_id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "server_id": "e22b7120-235f-4874-bdb7-ff2560d77285", "host_name": "ubuntu-bionic-rax-dfw-0013565078", "device": "/dev/vdb", "attached_at": "2019-12-24T02:08:48.000000"}], "name": "tempest-TaggedAttachmentsTest-volume-1436834790", "description": null, "volume_type": "lvmdriver-1", "snapshot_id": null, "source_volid": null, "metadata": {}, "links": [{"rel": "self", "href": "https://10.208.227.14/volume/v3/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560"}, {"rel": "bookmark", "href": "https://10.208.227.14/volume/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560"}], "user_id": "9db60dda849449948cd5a8a96bd3cdde", "bootable": "false", "encrypted": false, "replication_status": null, "consistencygroup_id": null, "multiattach": false, "os-vol-tenant-attr:tenant_id": "9e0edee667054f7daed0a622812a3ea5"}}'
2019-12-24 02:08:49,727 2368 INFO [tempest.common.waiters] volume aff9994e-7ff6-4677-9367-2c72f6b81560 reached in-use after waiting for 3.727491 seconds
2019-12-24 02:08:49,728 2368 INFO [tempest.api.compute.servers.test_device_tagging] Attempting to verify tagged devices in server e22b7120-235f-4874-bdb7-ff2560d77285 via the metadata service: http://169.254.169.254/openstack/latest/meta_data.json
2019-12-24 02:08:49,728 2368 DEBUG [tempest.lib.common.utils.linux.remote_client] Remote command: set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; curl -V
2019-12-24 02:08:49,728 2368 INFO [tempest.lib.common.ssh] Creating ssh connection to '172.24.5.72:22' as 'cirros' with public key authentication
2019-12-24 02:08:52,783 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 1. Retry after 2 seconds.
2019-12-24 02:08:55,851 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 2. Retry after 3 seconds.
2019-12-24 02:09:02,411 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 3. Retry after 4 seconds.
2019-12-24 02:09:09,995 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 4. Retry after 5 seconds.
2019-12-24 02:09:18,572 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 5. Retry after 6 seconds.
2019-12-24 02:09:28,139 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 6. Retry after 7 seconds.
2019-12-24 02:09:38,703 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 7. Retry after 8 seconds.
2019-12-24 02:09:50,283 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 8. Retry after 9 seconds.
2019-12-24 02:10:02,859 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 9. Retry after 10 seconds.
2019-12-24 02:10:16,428 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 10. Retry after 11 seconds.
2019-12-24 02:10:31,019 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 11. Retry after 12 seconds.
2019-12-24 02:10:46,603 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 12. Retry after 13 seconds.
2019-12-24 02:11:03,183 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 13. Retry after 14 seconds.
2019-12-24 02:11:20,747 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 14. Retry after 15 seconds.
2019-12-24 02:11:39,339 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 15. Retry after 16 seconds.
2019-12-24 02:11:58,927 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 16. Retry after 17 seconds.
2019-12-24 02:12:19,499 2368 ERROR [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 after 16 attempts
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh Traceback (most recent call last):
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 107, in _get_ssh_connection
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh sock=proxy_chan)
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 368, in connect
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh raise NoValidConnectionsError(errors)
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.24.5.72
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh
2019-12-24 02:12:19,500 2368 DEBUG [tempest.lib.common.utils.linux.remote_client] Caller: TaggedAttachmentsTest:test_tagged_attachment. Timeout trying to ssh to server {'id': 'e22b7120-235f-4874-bdb7-ff2560d77285', 'name': 'tempest-device-tagging-server-1147932506', 'status': 'ACTIVE', 'tenant_id': '9e0edee667054f7daed0a622812a3ea5', 'user_id': '9db60dda849449948cd5a8a96bd3cdde', 'metadata': {}, 'hostId': '00401a3ec186753ee044fe97bb7628ba5d3525f193b1885994e671ce', 'image': {'id': '5ed09d98-e57a-4808-850c-8f8b34e986e4', 'links': [{'rel': 'bookmark', 'href': 'https://10.208.227.14/compute/images/5ed09d98-e57a-4808-850c-8f8b34e986e4'}]}, 'flavor': {'vcpus': 1, 'ram': 64, 'disk': 1, 'ephemeral': 0, 'swap': 0, 'original_name': 'm1.nano', 'extra_specs': {'hw_rng:allowed': 'True'}}, 'created': '2019-12-24T02:08:28Z', 'updated': '2019-12-24T02:08:36Z', 'addresses': {'tempest-TaggedAttachmentsTest-1568302134-network': [{'version': 4, 'addr': '10.1.0.8', 'OS-EXT-IPS:type': 'fixed', 'OS-EXT-IPS-MAC:mac_addr': 'fa:16:3e:6c:86:4a'}]}, 'accessIPv4': '', 'accessIPv6': '', 'links': [{'rel': 'self', 'href': 'https://10.208.227.14/compute/v2.1/servers/e22b7120-235f-4874-bdb7-ff2560d77285'}, {'rel': 'bookmark', 'href': 'https://10.208.227.14/compute/servers/e22b7120-235f-4874-bdb7-ff2560d77285'}], 'OS-DCF:diskConfig': 'MANUAL', 'progress': 0, 'OS-EXT-AZ:availability_zone': 'nova', 'config_drive': 'True', 'key_name': 'tempest-keypair-1994582027', 'OS-SRV-USG:launched_at': '2019-12-24T02:08:36.000000', 'OS-SRV-USG:terminated_at': None, 'security_groups': [{'name': 'tempest-securitygroup--1301842345'}], 'OS-EXT-STS:task_state': None, 'OS-EXT-STS:vm_state': 'active', 'OS-EXT-STS:power_state': 1, 'os-extended-volumes:volumes_attached': [], 'locked': False, 'description': None, 'tags': []}
2019-12-24 02:12:19,606 2368 INFO [tempest.lib.common.rest_client] Request (TaggedAttachmentsTest:test_tagged_attachment): 200 POST https://10.208.227.14/compute/v2.1/servers/e22b7120-235f-4874-bdb7-ff2560d77285/action 0.105s

And in many cases it seems that metadata worked fine for instance but SSH is not possible for some reason.

Pawel Kaminski (pawelkax) wrote :

I started observing this error for tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume.
This is random failure both on master and train release.
I started observing this error from 18 Januar

Pawel Kaminski (pawelkax) wrote :

Hi,

Reason of my failure was that during devstack installation wget is not properly downloading cirros-0.4.0. In downloaded file I get http message instead of cirros image.

Reviewed: https://review.opendev.org/702903
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=4849f00e7eb7f0da8a62f7f697e27770d9b89ec3
Submitter: Zuul
Branch: master

commit 4849f00e7eb7f0da8a62f7f697e27770d9b89ec3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Thu Jan 16 16:01:10 2020 +0000

    Add debugging information in case of SSH connection error

    In case of SSH error, new debugging information is added to the log:
    - The local routing table
    - The local ARP table

    Change-Id: Ibe83390402675bb7e7d3cc4909a8097de4b0f836
    Related-Bug: #1858642

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers