2014-09-16 19:17:40 |
Jaroslav Henner |
description |
Quite often, the tempest ssh tests fail, probably because of some problem with loading the user-data served over the network. My theory is that ssh is started after cloud-init in cirros, and this causes the connection reset.
I have found that many logs in the gate exhibit a message similar to ours, but many of them are from runs that PASSED. I think not many tests rely on ssh, but many start VMs, so we have a high SUCCESS ratio. The query:
http://logstash.openstack.org/index.html#eyJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIGdldCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0LzIwMDktMDQtMDQvdXNlci1kYXRhXCIiLCJmaWVsZHMiOlsiYnVpbGRfc3RhdHVzIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIiwic3RhbXAiOjE0MTA4OTQzOTQxMDB9
Log from the test attached. |
Quite often, the tempest ssh tests fail, probably because of some problem with loading the user-data served over the network. My theory is that ssh is started after cloud-init in cirros, and this causes the connection reset.
I have found that many logs in the gate exhibit a message similar to ours, but many of them are from runs that PASSED. I think not many tests rely on ssh, but many start VMs, so we have a high SUCCESS ratio. The query:
http://logstash.openstack.org/index.html#eyJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIGdldCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0LzIwMDktMDQtMDQvdXNlci1kYXRhXCIiLCJmaWVsZHMiOlsiYnVpbGRfc3RhdHVzIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIiwic3RhbXAiOjE0MTA4OTQzOTQxMDB9
From the logs, the instance-id seems to have been available, which makes me think this is a race in neutron and/or nova, but I can't really tell.
The most interesting part of the logs (console-log):
----------8<----------8<----------
udhcpc (v1.20.1) started
Sending discover...
Sending select for 10.0.0.2...
Lease of 10.0.0.2 obtained, lease time 86400
deleting routers
adding dns 10.0.0.3
cirros-ds 'net' up at 7.19
checking http://169.254.169.254/2009-04-04/instance-id
successful after 1/20 tries: up 7.47. iid=i-00000095
failed to get http://169.254.169.254/2009-04-04/user-data
warning: no ec2 metadata for user-data
found datasource (ec2, net)
cirros-apply-net already run per instance
check-version already run per instance
Starting dropbear sshd: generating dsa key... OK
/run/cirros/datasource/data/user-data was not '#!' or executable
=== system information ===
Platform: Red Hat OpenStack Compute
Container: none
Arch: x86_64
CPU(s): 1 @ 2399.950 MHz
Cores/Sockets/Threads: 1/1/1
Virt-type: AMD-V
RAM Size: 49MB
Disks:
NAME MAJ:MIN SIZE LABEL MOUNTPOINT
vda 253:0 25165824 cirros-rootfs /
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
Exited:
----------8<----------8<---------- |
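To check other gate console logs for the same pattern, something like the following might help. It is only a rough sketch: `summarize_console_log` is a hypothetical helper, not part of tempest or cirros, and the two regular expressions are assumptions based solely on the messages in the excerpt above. It accepts console output with either real newlines or literal `\n` escapes.

```python
import re

# Patterns assumed from the console-log excerpt; other cirros versions
# may word these messages differently.
METADATA_FAILURE = re.compile(r"failed to get (http://169\.254\.169\.254/\S+)")
SSHD_START = re.compile(r"Starting dropbear sshd")


def summarize_console_log(console_log):
    """Report failed metadata fetches and whether sshd was started.

    Hypothetical helper for triage only.
    """
    # Some captures store the console as one string with literal "\n"
    # escapes; turn those into real newlines before splitting.
    lines = console_log.replace("\\n", "\n").splitlines()

    failed_urls = []
    sshd_line_index = None
    for index, line in enumerate(lines):
        match = METADATA_FAILURE.search(line)
        if match:
            failed_urls.append(match.group(1))
        if sshd_line_index is None and SSHD_START.search(line):
            sshd_line_index = index

    return {
        "failed_urls": failed_urls,
        "sshd_started": sshd_line_index is not None,
        "sshd_line_index": sshd_line_index,
    }
```

A result with a non-empty `failed_urls` list and `sshd_started` set to true matches the situation described here: the user-data fetch failed, yet dropbear still came up afterwards.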
|