VMs cannot have ssh key injected because they can't access the metadata service
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Unassigned | | |
Bug Description
The periodic tempest job is failing in the following tests:
tempest.
tempest.
tempest.
tempest.
tempest.
tempest.
tempest.
The reason is that the VMs cannot reach the metadata service at 169.254.169.254, so the ssh key used to access the VM is never injected, as you can see in the nova console-log below:
$ nova console-log edd925b0-
info: initramfs: up at 3.26
NOCHANGE: partition 1 is size 64260. it cannot be grown
info: initramfs loading root from /dev/vda1
info: /etc/init.
info: container: none
Starting logging: OK
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
WARN: /etc/rc3.
Initializing random number generator... done.
Starting acpid: OK
cirros-ds 'local' up at 6.78
no results found for mode=local. up 7.35. searched: nocloud configdrive ec2
Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending select for 10.100.0.5...
Lease of 10.100.0.5 obtained, lease time 86400
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "10.100.0.1"
cirros-ds 'net' up at 8.87
checking http://
failed 1/20: up 9.12. request failed
failed 2/20: up 12.53. request failed
failed 3/20: up 15.38. request failed
failed 4/20: up 18.20. request failed
failed 5/20: up 20.95. request failed
failed 6/20: up 24.38. request failed
failed 7/20: up 27.11. request failed
failed 8/20: up 29.96. request failed
failed 9/20: up 32.69. request failed
failed 10/20: up 35.98. request failed
failed 11/20: up 38.53. request failed
failed 12/20: up 41.74. request failed
failed 13/20: up 44.32. request failed
failed 14/20: up 47.37. request failed
failed 15/20: up 49.91. request failed
failed 16/20: up 53.04. request failed
failed 17/20: up 55.78. request failed
failed 18/20: up 58.82. request failed
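The retry pattern in a console log like the one above can be picked out mechanically. The following is a minimal sketch, not part of tempest or cirros: the function name `summarize_metadata_failures` and the returned keys are hypothetical, and the line formats are assumed to match the cirros output quoted in this report.

```python
import re

# Matches cirros lines such as "failed 3/20: up 15.38. request failed".
# The format is taken from the console log in this report (assumption:
# other cirros versions may word it differently).
FAILED_RE = re.compile(r"^failed (\d+)/(\d+): up ([\d.]+)\. request failed$")


def summarize_metadata_failures(console_log):
    """Summarize metadata request failures found in a cirros console log.

    Returns a dict with the highest failed attempt number, the retry
    limit reported by cirros, and whether the 'local' datasource search
    (nocloud, configdrive, ec2) came back empty.
    """
    failed_attempts = 0
    max_attempts = None
    no_local_datasource = False
    for raw_line in console_log.splitlines():
        line = raw_line.strip()
        match = FAILED_RE.match(line)
        if match:
            failed_attempts = max(failed_attempts, int(match.group(1)))
            max_attempts = int(match.group(2))
        elif line.startswith("no results found for mode=local"):
            no_local_datasource = True
    return {
        "failed_attempts": failed_attempts,
        "max_attempts": max_attempts,
        "no_local_datasource": no_local_datasource,
    }


if __name__ == "__main__":
    import sys

    # Example: nova console-log <server-id> | python this_script.py
    print(summarize_metadata_failures(sys.stdin.read()))
```

A non-zero `failed_attempts` together with `no_local_datasource` set to true is consistent with an instance that could not fetch its keypair from any datasource, which is the situation described here.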
The tests create a VM with a keypair and then try to access the VM with that keypair. The test traceback shows that the test can reach the VM, but the login fails with an authentication error:
[stack@undercloud tempest]$ python -m testtools.run tempest.
Tests running...
=======
ERROR: tempest.
-------
...
2017-07-04 07:56:55,117 10972 INFO [tempest.
2017-07-04 07:57:01,134 10972 WARNING [tempest.
2017-07-04 07:57:04,140 10972 WARNING [tempest.
2017-07-04 07:57:07,670 10972 WARNING [tempest.
2017-07-04 07:57:12,183 10972 WARNING [tempest.
2017-07-04 07:57:17,696 10972 WARNING [tempest.
2017-07-04 07:57:24,210 10972 WARNING [tempest.
2017-07-04 07:57:31,721 10972 WARNING [tempest.
2017-07-04 07:57:40,238 10972 WARNING [tempest.
2017-07-04 07:57:49,752 10972 WARNING [tempest.
2017-07-04 07:58:00,267 10972 WARNING [tempest.
2017-07-04 07:58:11,839 10972 INFO [paramiko.
2017-07-04 07:58:12,019 10972 INFO [paramiko.
2017-07-04 07:58:12,146 10972 WARNING [tempest.
2017-07-04 07:58:24,670 10972 INFO [paramiko.
2017-07-04 07:58:24,815 10972 INFO [paramiko.
2017-07-04 07:58:24,938 10972 WARNING [tempest.
2017-07-04 07:58:38,463 10972 INFO [paramiko.
2017-07-04 07:58:38,598 10972 INFO [paramiko.
2017-07-04 07:58:38,719 10972 WARNING [tempest.
2017-07-04 07:58:53,251 10972 INFO [paramiko.
2017-07-04 07:58:53,381 10972 INFO [paramiko.
2017-07-04 07:58:53,503 10972 WARNING [tempest.
2017-07-04 07:59:09,032 10972 INFO [paramiko.
2017-07-04 07:59:09,134 10972 INFO [paramiko.
2017-07-04 07:59:09,256 10972 WARNING [tempest.
2017-07-04 07:59:25,784 10972 INFO [paramiko.
2017-07-04 07:59:25,914 10972 INFO [paramiko.
2017-07-04 07:59:26,036 10972 WARNING [tempest.
2017-07-04 07:59:43,571 10972 INFO [paramiko.
2017-07-04 07:59:43,700 10972 INFO [paramiko.
2017-07-04 07:59:43,821 10972 WARNING [tempest.
2017-07-04 08:00:02,358 10972 INFO [paramiko.
2017-07-04 08:00:02,489 10972 INFO [paramiko.
2017-07-04 08:00:02,612 10972 WARNING [tempest.
2017-07-04 08:00:22,151 10972 INFO [paramiko.
2017-07-04 08:00:22,283 10972 INFO [paramiko.
2017-07-04 08:00:22,404 10972 WARNING [tempest.
2017-07-04 08:00:42,959 10972 INFO [paramiko.
2017-07-04 08:00:43,103 10972 INFO [paramiko.
2017-07-04 08:00:43,220 10972 WARNING [tempest.
2017-07-04 08:01:04,753 10972 INFO [paramiko.
2017-07-04 08:01:04,894 10972 INFO [paramiko.
2017-07-04 08:01:05,016 10972 WARNING [tempest.
2017-07-04 08:01:27,555 10972 INFO [paramiko.
2017-07-04 08:01:27,695 10972 INFO [paramiko.
2017-07-04 08:01:27,815 10972 WARNING [tempest.
2017-07-04 08:01:51,354 10972 INFO [paramiko.
2017-07-04 08:01:51,484 10972 INFO [paramiko.
2017-07-04 08:01:51,605 10972 WARNING [tempest.
2017-07-04 08:02:16,149 10972 INFO [paramiko.
2017-07-04 08:02:16,278 10972 INFO [paramiko.
2017-07-04 08:02:16,401 10972 ERROR [tempest.
...
Traceback (most recent call last):
File "/usr/lib/
return f(self, *func_args, **func_kwargs)
File "/usr/lib/
self.
File "/usr/lib/
sshv4_1, ips_from_api_1, sid1 = self.prepare_
File "/usr/lib/
username=
File "/usr/lib/
linux_
File "/usr/lib/
six.
File "/usr/lib/
return function(self, *args, **kwargs)
File "/usr/lib/
self.
File "/usr/lib/
connection = self._get_
File "/usr/lib/
password=
tempest.
User: cirros, Password: None
It's possible to access the VM with username cirros and password cubswin:), but the test was designed to access the VM through ssh keys.
(overcloud) [stack@undercloud ~]$ nova list --all-tenant
+------
| ID | Name | Tenant ID | Status | Task State | Power State | Networks |
+------
| edd925b0-
+------
(overcloud) [stack@undercloud ~]$ ssh cirros@
The authenticity of host '192.168.24.153 (192.168.24.153)' can't be established.
RSA key fingerprint is 3f:1c:28:
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.24.153' (RSA) to the list of known hosts.
cirros@
Permission denied, please try again.
cirros@
$ ls
$ ls -lah
total 5
drwxr-xr-x 2 cirros cirros 1.0K Jul 4 09:00 .
drwxrwxr-x 4 root root 1.0K May 7 2015 ..
-rw------- 1 cirros cirros 11 Jul 4 09:00 .ash_history
-rwxr-xr-x 1 cirros cirros 43 May 7 2015 .profile
-rwxr-xr-x 1 cirros cirros 66 May 7 2015 .shrc
Notice that one of the tests failing is the test_minimum_
Here are the routing tables from the undercloud and from the controller in the overcloud:
[stack@undercloud tempest]$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default gateway 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.23.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.24.0 0.0.0.0 255.255.255.0 U 0 0 0 br-ctlplane
[heat-admin@
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default gateway 0.0.0.0 UG 0 0 0 br-ex
169.254.169.254 gateway 255.255.255.255 UGH 0 0 0 br-ex
192.168.24.0 0.0.0.0 255.255.255.0 U 0 0 0 br-ex
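To compare routing tables like the two above across nodes, the `route` output can be scanned for a host route to the metadata address. This is a minimal sketch under stated assumptions: the helper name `find_host_routes` is hypothetical, and the parsing assumes the eight-column layout shown in this report. It only reports what the table contains; it does not decide whether such a route is the cause of the failure.

```python
METADATA_IP = "169.254.169.254"


def find_host_routes(route_output, destination=METADATA_IP):
    """Return host routes (flag 'H') to `destination` found in `route` output.

    Assumes the column layout shown in this report:
    Destination Gateway Genmask Flags Metric Ref Use Iface
    """
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        # Skip the "Kernel IP routing table" title, the header row,
        # and anything that does not have the expected eight columns.
        if len(fields) != 8 or fields[0] == "Destination":
            continue
        dest, gateway, genmask, flags = fields[:4]
        if dest == destination and "H" in flags:
            routes.append({"gateway": gateway, "genmask": genmask, "iface": fields[7]})
    return routes


if __name__ == "__main__":
    import sys

    # Example: route | python this_script.py
    for entry in find_host_routes(sys.stdin.read()):
        print(entry)
```

Run against the two tables in this report, the controller output yields one entry on br-ex, while the undercloud output yields none.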
description: | updated |
Changed in tripleo: | |
importance: | Medium → High |
milestone: | none → pike-3 |
status: | New → Triaged |
tags: | added: alert ci promotion-blocker |
Changed in tripleo: | |
importance: | High → Critical |
I'm no longer seeing the tests failing in today's periodic job results:
http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-nonha-tempest-oooq-master/154fe22/logs/oooq/undercloud/home/jenkins/tempest/tempest.html.gz