Activity log for bug #1370209

Date Who What changed Old value New value Message
2014-09-16 19:10:31 Jaroslav Henner bug added bug
2014-09-16 19:10:31 Jaroslav Henner attachment added log https://bugs.launchpad.net/bugs/1370209/+attachment/4205774/+files/log
2014-09-16 19:17:40 Jaroslav Henner description
Old value: Quite often, the tempest ssh tests fail, probably because of some problem with loading user-data served on the network. My theory is that ssh is started after cloud-init in cirros, and this causes a connection reset. I have found that many logs in the gate exhibit a message similar to ours, but many of them are in PASSED runs. I think not many tests rely on ssh, but many start VMs, so we have a high SUCCESS ratio. The query: http://logstash.openstack.org/index.html#eyJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIGdldCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0LzIwMDktMDQtMDQvdXNlci1kYXRhXCIiLCJmaWVsZHMiOlsiYnVpbGRfc3RhdHVzIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIiwic3RhbXAiOjE0MTA4OTQzOTQxMDB9 Log from the test attached.
New value: Quite often, the tempest ssh tests fail, probably because of some problem with loading user-data served on the network. My theory is that ssh is started after cloud-init in cirros, and this causes a connection reset. I have found that many logs in the gate exhibit a message similar to ours, but many of them are in PASSED runs. I think not many tests rely on ssh, but many start VMs, so we have a high SUCCESS ratio.
The query: http://logstash.openstack.org/index.html#eyJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIGdldCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0LzIwMDktMDQtMDQvdXNlci1kYXRhXCIiLCJmaWVsZHMiOlsiYnVpbGRfc3RhdHVzIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIiwic3RhbXAiOjE0MTA4OTQzOTQxMDB9
From the logs, the instance-id seems to have been available, which makes me think this is some race in neutron and/or nova, but I can't really tell.
The most interesting part of the logs (console-log):
----------8<----------8<----------
udhcpc (v1.20.1) started
Sending discover...
Sending select for 10.0.0.2...
Lease of 10.0.0.2 obtained, lease time 86400
deleting routers
adding dns 10.0.0.3
cirros-ds 'net' up at 7.19
checking http://169.254.169.254/2009-04-04/instance-id
successful after 1/20 tries: up 7.47. iid=i-00000095
failed to get http://169.254.169.254/2009-04-04/user-data
warning: no ec2 metadata for user-data
found datasource (ec2, net)
cirros-apply-net already run per instance
check-version already run per instance
Starting dropbear sshd: generating dsa key... OK
/run/cirros/datasource/data/user-data was not '#!' or executable
=== system information ===
Platform: Red Hat OpenStack Compute
Container: none
Arch: x86_64
CPU(s): 1 @ 2399.950 MHz
Cores/Sockets/Threads: 1/1/1
Virt-type: AMD-V
RAM Size: 49MB
Disks:
NAME MAJ:MIN SIZE LABEL MOUNTPOINT
vda 253:0 25165824 cirros-rootfs /
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
Exited:
----------8<----------8<----------
2014-09-17 16:38:36 Jaroslav Henner summary Old value: failed to load user-data New value: ssh connection refused after vm stop/start
2014-10-03 00:52:33 Eugene Nikanorov marked as duplicate 1323658
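
The logstash link in the description keeps its search terms in the URL fragment, which is not readable as written. The fragment appears to be base64-encoded JSON. A minimal Python sketch for reading it, assuming that encoding holds (the helper name decode_logstash_fragment is hypothetical and not part of any logstash tooling):

```python
import base64
import json

# URL fragment copied verbatim from the logstash link in the description.
FRAGMENT = (
    "eyJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIGdldCBodHRwOi8vMTY5LjI1NC4x"
    "NjkuMjU0LzIwMDktMDQtMDQvdXNlci1kYXRhXCIiLCJmaWVsZHMiOlsiYnVpbGRfc3Rh"
    "dHVzIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJj"
    "b3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiIiLCJhbmFseXpl"
    "X2ZpZWxkIjoiIiwic3RhbXAiOjE0MTA4OTQzOTQxMDB9"
)


def decode_logstash_fragment(fragment: str) -> dict:
    """Decode a base64-encoded JSON URL fragment into a dict (hypothetical helper)."""
    # Pad to a multiple of four in case the fragment was stored without '='.
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(padded).decode("utf-8"))


query = decode_logstash_fragment(FRAGMENT)
print(query["search"])  # the search expression behind the link
print(query["fields"])  # the fields the results are broken down by
```

Decoded this way, the search expression is message:"failed to get http://169.254.169.254/2009-04-04/user-data" and the only listed field is build_status, which matches the description's comparison of PASSED and failed runs.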