paramiko.ssh_exception.NoValidConnectionsError error cause dvr scenario jobs failing

Bug #1858642 reported by Slawek Kaplonski
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Critical
Unassigned

Bug Description

Examples of failures:

https://01151d403671623b870a-df3a6d916dbcc54368929fa122458369.ssl.cf5.rackcdn.com/611605/16/check/neutron-tempest-plugin-dvr-multinode-scenario/3593a8b/testr_results.html.gz

https://8048a026d234f26326fc-ba67ad6475e26f6ef2d3d3155c5088fd.ssl.cf5.rackcdn.com/699465/12/check/neutron-tempest-plugin-dvr-multinode-scenario/b9ec275/testr_results.html.gz

https://5901a6854a362ccbd57a-e3afd52b12849b1e5612595129f4842a.ssl.cf5.rackcdn.com/699465/12/check/neutron-tempest-dvr-ha-multinode-full/b4c8823/testr_results.html.gz

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b70/700740/1/check/neutron-tempest-dvr-ha-multinode-full/b70792e/testr_results.html.gz

Error is like:

2019-12-24 02:08:49,727 2368 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'date': 'Tue, 24 Dec 2019 02:08:49 GMT', 'server': 'Apache/2.4.29 (Ubuntu)', 'content-type': 'application/json', 'x-compute-request-id': 'req-945aff07-a641-41e9-a295-d852ab529d70', 'content-length': '1242', 'openstack-api-version': 'volume 3.0', 'vary': 'OpenStack-API-Version', 'x-openstack-request-id': 'req-945aff07-a641-41e9-a295-d852ab529d70', 'connection': 'close', 'status': '200', 'content-location': 'https://10.208.227.14/volume/v3/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560'}
        Body: b'{"volume": {"id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "status": "in-use", "size": 1, "availability_zone": "nova", "created_at": "2019-12-24T02:08:23.000000", "updated_at": "2019-12-24T02:08:49.000000", "attachments": [{"id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "attachment_id": "e05c79a1-fe93-4bfc-b898-d79b9313d72c", "volume_id": "aff9994e-7ff6-4677-9367-2c72f6b81560", "server_id": "e22b7120-235f-4874-bdb7-ff2560d77285", "host_name": "ubuntu-bionic-rax-dfw-0013565078", "device": "/dev/vdb", "attached_at": "2019-12-24T02:08:48.000000"}], "name": "tempest-TaggedAttachmentsTest-volume-1436834790", "description": null, "volume_type": "lvmdriver-1", "snapshot_id": null, "source_volid": null, "metadata": {}, "links": [{"rel": "self", "href": "https://10.208.227.14/volume/v3/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560"}, {"rel": "bookmark", "href": "https://10.208.227.14/volume/9e0edee667054f7daed0a622812a3ea5/volumes/aff9994e-7ff6-4677-9367-2c72f6b81560"}], "user_id": "9db60dda849449948cd5a8a96bd3cdde", "bootable": "false", "encrypted": false, "replication_status": null, "consistencygroup_id": null, "multiattach": false, "os-vol-tenant-attr:tenant_id": "9e0edee667054f7daed0a622812a3ea5"}}'
2019-12-24 02:08:49,727 2368 INFO [tempest.common.waiters] volume aff9994e-7ff6-4677-9367-2c72f6b81560 reached in-use after waiting for 3.727491 seconds
2019-12-24 02:08:49,728 2368 INFO [tempest.api.compute.servers.test_device_tagging] Attempting to verify tagged devices in server e22b7120-235f-4874-bdb7-ff2560d77285 via the metadata service: http://169.254.169.254/openstack/latest/meta_data.json
2019-12-24 02:08:49,728 2368 DEBUG [tempest.lib.common.utils.linux.remote_client] Remote command: set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; curl -V
2019-12-24 02:08:49,728 2368 INFO [tempest.lib.common.ssh] Creating ssh connection to '172.24.5.72:22' as 'cirros' with public key authentication
2019-12-24 02:08:52,783 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 1. Retry after 2 seconds.
2019-12-24 02:08:55,851 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 2. Retry after 3 seconds.
2019-12-24 02:09:02,411 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 3. Retry after 4 seconds.
2019-12-24 02:09:09,995 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 4. Retry after 5 seconds.
2019-12-24 02:09:18,572 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 5. Retry after 6 seconds.
2019-12-24 02:09:28,139 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 6. Retry after 7 seconds.
2019-12-24 02:09:38,703 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 7. Retry after 8 seconds.
2019-12-24 02:09:50,283 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 8. Retry after 9 seconds.
2019-12-24 02:10:02,859 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 9. Retry after 10 seconds.
2019-12-24 02:10:16,428 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 10. Retry after 11 seconds.
2019-12-24 02:10:31,019 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 11. Retry after 12 seconds.
2019-12-24 02:10:46,603 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 12. Retry after 13 seconds.
2019-12-24 02:11:03,183 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 13. Retry after 14 seconds.
2019-12-24 02:11:20,747 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 14. Retry after 15 seconds.
2019-12-24 02:11:39,339 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 15. Retry after 16 seconds.
2019-12-24 02:11:58,927 2368 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 ([Errno None] Unable to connect to port 22 on 172.24.5.72). Number attempts: 16. Retry after 17 seconds.
2019-12-24 02:12:19,499 2368 ERROR [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.72 after 16 attempts
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh Traceback (most recent call last):
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 107, in _get_ssh_connection
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh sock=proxy_chan)
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 368, in connect
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh raise NoValidConnectionsError(errors)
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.24.5.72
2019-12-24 02:12:19.499 2368 ERROR tempest.lib.common.ssh
2019-12-24 02:12:19,500 2368 DEBUG [tempest.lib.common.utils.linux.remote_client] Caller: TaggedAttachmentsTest:test_tagged_attachment. Timeout trying to ssh to server {'id': 'e22b7120-235f-4874-bdb7-ff2560d77285', 'name': 'tempest-device-tagging-server-1147932506', 'status': 'ACTIVE', 'tenant_id': '9e0edee667054f7daed0a622812a3ea5', 'user_id': '9db60dda849449948cd5a8a96bd3cdde', 'metadata': {}, 'hostId': '00401a3ec186753ee044fe97bb7628ba5d3525f193b1885994e671ce', 'image': {'id': '5ed09d98-e57a-4808-850c-8f8b34e986e4', 'links': [{'rel': 'bookmark', 'href': 'https://10.208.227.14/compute/images/5ed09d98-e57a-4808-850c-8f8b34e986e4'}]}, 'flavor': {'vcpus': 1, 'ram': 64, 'disk': 1, 'ephemeral': 0, 'swap': 0, 'original_name': 'm1.nano', 'extra_specs': {'hw_rng:allowed': 'True'}}, 'created': '2019-12-24T02:08:28Z', 'updated': '2019-12-24T02:08:36Z', 'addresses': {'tempest-TaggedAttachmentsTest-1568302134-network': [{'version': 4, 'addr': '10.1.0.8', 'OS-EXT-IPS:type': 'fixed', 'OS-EXT-IPS-MAC:mac_addr': 'fa:16:3e:6c:86:4a'}]}, 'accessIPv4': '', 'accessIPv6': '', 'links': [{'rel': 'self', 'href': 'https://10.208.227.14/compute/v2.1/servers/e22b7120-235f-4874-bdb7-ff2560d77285'}, {'rel': 'bookmark', 'href': 'https://10.208.227.14/compute/servers/e22b7120-235f-4874-bdb7-ff2560d77285'}], 'OS-DCF:diskConfig': 'MANUAL', 'progress': 0, 'OS-EXT-AZ:availability_zone': 'nova', 'config_drive': 'True', 'key_name': 'tempest-keypair-1994582027', 'OS-SRV-USG:launched_at': '2019-12-24T02:08:36.000000', 'OS-SRV-USG:terminated_at': None, 'security_groups': [{'name': 'tempest-securitygroup--1301842345'}], 'OS-EXT-STS:task_state': None, 'OS-EXT-STS:vm_state': 'active', 'OS-EXT-STS:power_state': 1, 'os-extended-volumes:volumes_attached': [], 'locked': False, 'description': None, 'tags': []}
2019-12-24 02:12:19,606 2368 INFO [tempest.lib.common.rest_client] Request (TaggedAttachmentsTest:test_tagged_attachment): 200 POST https://10.208.227.14/compute/v2.1/servers/e22b7120-235f-4874-bdb7-ff2560d77285/action 0.105s

And in many cases it seems that metadata worked fine for instance but SSH is not possible for some reason.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-tempest-plugin (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/702903

Revision history for this message
Pawel Kaminski (pawelkax) wrote :

I started observing this error for tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume.
This is random failure both on master and train release.
I started observing this error from 18 Januar

Revision history for this message
Pawel Kaminski (pawelkax) wrote :

Hi,

Reason of my failure was that during devstack installation wget is not properly downloading cirros-0.4.0. In downloaded file I get http message instead of cirros image.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-tempest-plugin (master)

Reviewed: https://review.opendev.org/702903
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=4849f00e7eb7f0da8a62f7f697e27770d9b89ec3
Submitter: Zuul
Branch: master

commit 4849f00e7eb7f0da8a62f7f697e27770d9b89ec3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Thu Jan 16 16:01:10 2020 +0000

    Add debugging information in case of SSH connection error

    In case of SSH error, new debugging information is added to the log:
    - The local routing table
    - The local ARP table

    Change-Id: Ibe83390402675bb7e7d3cc4909a8097de4b0f836
    Related-Bug: #1858642

Changed in neutron:
importance: High → Critical
tags: added: tempest
Revision history for this message
Slawek Kaplonski (slaweq) wrote :
Changed in neutron:
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.