SSH timed out in three tempest tests

Bug #1645709 reported by Guy Rozendorn
This bug affects 1 person
Affects   Status   Importance  Assigned to  Milestone
tempest   Expired  Undecided   Unassigned

Bug Description

Hi,

We recently set up a CI environment for our INFINIDAT volume driver for cinder.
We have two slaves running the test suite, and on each one the following tests fail consistently:

tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario
tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_volume_boot_pattern
tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_volume_boot_pattern

As you can see in the logs (see links below), they all fail because the SSH connection timed out:

06:17:14 Captured traceback:
06:17:14 ~~~~~~~~~~~~~~~~~~~
06:17:14 Traceback (most recent call last):
06:17:14 File "tempest/test.py", line 100, in wrapper
06:17:14 return f(self, *func_args, **func_kwargs)
06:17:14 File "tempest/scenario/test_minimum_basic.py", line 139, in test_minimum_basic_scenario
06:17:14 floating_ip['ip'], private_key=keypair['private_key'])
06:17:14 File "tempest/scenario/manager.py", line 331, in get_remote_client
06:17:14 linux_client.validate_authentication()
06:17:14 File "tempest/common/utils/linux/remote_client.py", line 54, in wrapper
06:17:14 six.reraise(*original_exception)
06:17:14 File "tempest/common/utils/linux/remote_client.py", line 35, in wrapper
06:17:14 return function(self, *args, **kwargs)
06:17:14 File "tempest/common/utils/linux/remote_client.py", line 99, in validate_authentication
06:17:14 self.ssh_client.test_connection_auth()
06:17:14 File "tempest/lib/common/ssh.py", line 176, in test_connection_auth
06:17:14 connection = self._get_ssh_connection()
06:17:14 File "tempest/lib/common/ssh.py", line 90, in _get_ssh_connection
06:17:14 password=self.password)
06:17:14 tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.10 via SSH timed out.
06:17:15 User: cirros, Password: None
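
One way to tell whether the timeout is specific to tempest's SSH client is to probe the floating IP directly from the test node. The following is a minimal sketch, assuming Python 3 is available on the node; `ssh_port_reachable` is a hypothetical helper written for illustration and is not part of tempest.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


# Example (address taken from the traceback above):
# print(ssh_port_reachable("172.24.5.10"))
```

If this also returns False, the problem probably lies in the network path to the floating IP rather than in SSH authentication.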

Further down the log you can see that the virtual machine is up and running, but networking doesn't work (the ping to the gateway 10.1.0.1 fails):
06:17:15 === network info ===
06:17:15 if-info: lo,up,127.0.0.1,8,::1
06:17:15 if-info: eth0,up,10.1.0.6,28,fe80::f816:3eff:fea6:d5a0
06:17:15 ip-route:default via 10.1.0.1 dev eth0
06:17:15 ip-route:10.1.0.0/28 dev eth0 src 10.1.0.6
06:17:15 ip-route:169.254.169.254 via 10.1.0.1 dev eth0
06:17:15 === datasource: None None ===
06:17:15 === cirros: current=0.3.4 uptime=101.89 ===
06:17:15 === pinging gateway failed, debugging connection ===
06:17:15 ############ debug start ##############
06:17:15 ### /etc/init.d/sshd start
06:17:15 Starting dropbear sshd: OK
06:17:15 ### ifconfig -a
06:17:15 eth0 Link encap:Ethernet HWaddr FA:16:3E:A6:D5:A0
06:17:15 inet addr:10.1.0.6 Bcast:10.1.0.15 Mask:255.255.255.240
06:17:15 inet6 addr: fe80::f816:3eff:fea6:d5a0/64 Scope:Link
06:17:15 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
06:17:15 RX packets:22 errors:0 dropped:0 overruns:0 frame:0
06:17:15 TX packets:72 errors:0 dropped:0 overruns:0 carrier:0
06:17:15 collisions:0 txqueuelen:1000
06:17:15 RX bytes:2587 (2.5 KiB) TX bytes:3875 (3.7 KiB)
06:17:15
06:17:15 lo Link encap:Local Loopback
06:17:15 inet addr:127.0.0.1 Mask:255.0.0.0
06:17:15 inet6 addr: ::1/128 Scope:Host
06:17:15 UP LOOPBACK RUNNING MTU:16436 Metric:1
06:17:15 RX packets:41 errors:0 dropped:0 overruns:0 frame:0
06:17:15 TX packets:41 errors:0 dropped:0 overruns:0 carrier:0
06:17:15 collisions:0 txqueuelen:0
06:17:15 RX bytes:3632 (3.5 KiB) TX bytes:3632 (3.5 KiB)
06:17:15
06:17:15 ### route -n
06:17:15 Kernel IP routing table
06:17:15 Destination Gateway Genmask Flags Metric Ref Use Iface
06:17:15 0.0.0.0 10.1.0.1 0.0.0.0 UG 0 0 0 eth0
06:17:15 10.1.0.0 0.0.0.0 255.255.255.240 U 0 0 0 eth0
06:17:15 169.254.169.254 10.1.0.1 255.255.255.255 UGH 0 0 0 eth0
06:17:15 ### cat /etc/resolv.conf
06:17:15 search openstacklocal
06:17:15 nameserver 10.1.0.2
06:17:15 ### ping -c 5 10.1.0.1
06:17:15 PING 10.1.0.1 (10.1.0.1): 56 data bytes
06:17:15
06:17:15 --- 10.1.0.1 ping statistics ---
06:17:15 5 packets transmitted, 0 packets received, 100% packet loss
06:17:15 ### pinging nameservers
06:17:15 #### ping -c 5 10.1.0.2
06:17:15 PING 10.1.0.2 (10.1.0.2): 56 data bytes
06:17:15 64 bytes from 10.1.0.2: seq=0 ttl=64 time=6.330 ms
06:17:15 64 bytes from 10.1.0.2: seq=1 ttl=64 time=1.443 ms
06:17:15 64 bytes from 10.1.0.2: seq=2 ttl=64 time=0.930 ms
06:17:15 64 bytes from 10.1.0.2: seq=3 ttl=64 time=1.107 ms
06:17:15 64 bytes from 10.1.0.2: seq=4 ttl=64 time=1.034 ms
06:17:15
06:17:15 --- 10.1.0.2 ping statistics ---
06:17:15 5 packets transmitted, 5 packets received, 0% packet loss
06:17:15 round-trip min/avg/max = 0.930/2.168/6.330 ms
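
The ping results in the debug output above can be summarised mechanically, which helps when comparing several failed runs. This is a minimal sketch, assuming the console log keeps the cirros format shown above; `summarize_pings` is a hypothetical helper, and the sample text is copied from this log.

```python
import re

# Matches the command that starts each ping run, e.g. "### ping -c 5 10.1.0.1".
PING_CMD = re.compile(r"ping -c \d+ (\d+\.\d+\.\d+\.\d+)")
# Matches the summary line, e.g. "5 packets transmitted, 0 packets received".
PING_STATS = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def summarize_pings(console_text):
    """Return {target_ip: (transmitted, received)} for each ping run found."""
    results = {}
    target = None
    for line in console_text.splitlines():
        cmd = PING_CMD.search(line)
        if cmd:
            target = cmd.group(1)
            continue
        stats = PING_STATS.search(line)
        if stats and target is not None:
            results[target] = (int(stats.group(1)), int(stats.group(2)))
            target = None
    return results


sample = """\
06:17:15 ### ping -c 5 10.1.0.1
06:17:15 PING 10.1.0.1 (10.1.0.1): 56 data bytes
06:17:15 --- 10.1.0.1 ping statistics ---
06:17:15 5 packets transmitted, 0 packets received, 100% packet loss
06:17:15 #### ping -c 5 10.1.0.2
06:17:15 PING 10.1.0.2 (10.1.0.2): 56 data bytes
06:17:15 --- 10.1.0.2 ping statistics ---
06:17:15 5 packets transmitted, 5 packets received, 0% packet loss
"""

for ip, (sent, received) in summarize_pings(sample).items():
    state = "reachable" if received else "UNREACHABLE"
    print(f"{ip}: {received}/{sent} replies, {state}")
```

For this log the summary shows that the guest reaches the nameserver 10.1.0.2 on its own subnet but gets no reply from the default gateway 10.1.0.1.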

Any ideas on how to troubleshoot this? What am I missing?
Any help would be appreciated.

http://openstack-ci-logs.aws.infinidat.com/77/396477/7/check/dsvm-tempest-infinibox-fc/665df78/console.html
http://openstack-ci-logs.aws.infinidat.com/77/396477/7/check/dsvm-tempest-infinibox-fc/665df78/logs/

Andrea Frittoli (andrea-frittoli) wrote :

From the tempest log I can see that tempest is not able to connect to the IP of your VM at all:
http://openstack-ci-logs.aws.infinidat.com/77/396477/7/check/dsvm-tempest-infinibox-fc/665df78/logs/tempest.txt.gz#_2016-11-29_06_17_14_767

From the neutron logs it looks as if something is wrong with your floating IP setup: http://openstack-ci-logs.aws.infinidat.com/77/396477/7/check/dsvm-tempest-infinibox-fc/665df78/logs/screen-q-l3.txt.gz?level=ERROR#_2016-11-29_05_12_26_073

I would ask the neutron folks for more advice. Marking this as incomplete for now.

Changed in tempest:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tempest because there has been no activity for 60 days.]

Changed in tempest:
status: Incomplete → Expired