instance is not reachable via ssh

Bug #565018 reported by Scott Moser on 2010-04-16
This bug affects 4 people
Affects              Status  Importance  Assigned to  Milestone
cloud-init (Ubuntu)          High        Scott Moser
Lucid                        High        Scott Moser

Bug Description

Binary package hint: cloud-init

On occasion, an instance's cloud-init process fails to get ssh set up.

The symptom is that the user cannot log in, and ssh keys are not written to the console.

I'll attach two console logs from failed instances that show the problem.

Update: Checking the logs, I see that I did not get 'access denied', but rather:
Read from socket failed: Connection reset by peer

Scott Moser (smoser) wrote :
summary: - instance is not reachable via ssh (access denied)
+ instance is not reachable via ssh
Scott Moser (smoser) on 2010-04-16
description: updated
Scott Moser (smoser) wrote :

The output here is from an image of 20100416 with some additional print statements (with flush) added to cloud-init-cfg and CloudConfig.py. As soon as cloud-init-cfg comes up, it prints a message saying so.

This instance had ssh denying connections. The log shows that 'cloud-config-ssh' was never run.

I'm running again with 'info'-level debug on upstart, hoping to verify that the cloud-config-ssh job was never started.
That job looks like this:

# cloud-config-ssh - obtain ssh keys from metadata service
description "Download preconfigured ssh keys"
start on filesystem
console output
task
exec cloud-init-cfg config-ssh

Scott Moser (smoser) wrote :

I've run 200 instances of my debug version of lucid-server-20100416 (x86_64). I've not been able to catch this error with upstart debug enabled.

Scott Moser (smoser) wrote :

I've just tested, and the failure message that the ssh client gives when the ssh server has no keys is simply:
Connection closed by 204.236.215.149

Also, at that point ssh-keyscan will show:
ssh-keyscan ec2-204-236-215-149.compute-1.amazonaws.com
# ec2-204-236-215-149.compute-1.amazonaws.com SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu3
Connection closed by 204.236.215.149
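
For triaging many instances, the pattern above (banner line printed, then the connection dropped with no key lines) can be checked mechanically. The following is only a sketch: the function name and the category labels are made up for illustration, and a dropped connection after the banner may have causes other than missing host keys.

```python
import re

# Hypothetical helper, not part of cloud-init or OpenSSH.
# Matches the comment line ssh-keyscan prints when it receives the server banner.
BANNER_RE = re.compile(r"^# (?P<host>\S+) (?P<banner>SSH-\S+.*)$")

# Messages seen in this report when the server drops the connection.
CLOSED_MARKERS = ("Connection closed by", "Connection reset by peer")


def classify_keyscan(output: str) -> str:
    """Classify captured ssh-keyscan output into a rough category."""
    saw_banner = saw_close = saw_key = False
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if BANNER_RE.match(line):
            saw_banner = True
        elif any(marker in line for marker in CLOSED_MARKERS):
            saw_close = True
        elif not line.startswith("#"):
            # Assumption: a key line looks like "<host> <keytype> <base64>".
            parts = line.split()
            if len(parts) >= 3 and parts[1].startswith(("ssh-", "ecdsa-")):
                saw_key = True
    if saw_key:
        return "keys-present"
    if saw_banner and saw_close:
        return "sshd-up-no-keys"
    if saw_close:
        return "closed-before-banner"
    return "unknown"


sample = (
    "# ec2-204-236-215-149.compute-1.amazonaws.com "
    "SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu3\n"
    "Connection closed by 204.236.215.149\n"
)
print(classify_keyscan(sample))
```

The input is expected to be the combined stdout and stderr of ssh-keyscan, since the banner and the error message may arrive on different streams.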

Scott Moser (smoser) wrote :

I just ran a test on the UEC where carlos was seeing this: 200 instances using mathiaz's lp:~mathiaz/+junk/uec-testing-scripts.

The end result shows 28 failures; 25 of them had the following in their console log:
 | waiting for md at http://169.254.169.254/2009-04-04/meta-data/instance-id (try 19/20).

This shows up a great deal more on the data center UEC than it does elsewhere (it occurs for me < 3% of the time, maybe even less than 1%).
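
To sort a batch of console logs by how long each instance waited on the metadata service, something like the sketch below could be used. It assumes the "waiting for md" line has exactly the form quoted above; the function names and the threshold are hypothetical and not taken from the testing scripts.

```python
import re

# Matches e.g. "waiting for md at http://.../instance-id (try 19/20)."
WAIT_RE = re.compile(
    r"waiting for md at (?P<url>\S+) \(try (?P<attempt>\d+)/(?P<limit>\d+)\)"
)


def max_md_attempt(console_log: str):
    """Return (highest attempt seen, attempt limit), or None if no wait line."""
    best = None
    for match in WAIT_RE.finditer(console_log):
        pair = (int(match.group("attempt")), int(match.group("limit")))
        if best is None or pair[0] > best[0]:
            best = pair
    return best


def slow_metadata_instances(logs: dict, threshold: int = 10) -> list:
    """Names of instances whose log shows at least `threshold` attempts.

    `logs` maps an instance name to its console log text.
    """
    slow = []
    for name, text in logs.items():
        result = max_md_attempt(text)
        if result is not None and result[0] >= threshold:
            slow.append(name)
    return sorted(slow)


line = (" | waiting for md at "
        "http://169.254.169.254/2009-04-04/meta-data/instance-id (try 19/20).")
print(max_md_attempt(line))
```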

Thierry Carrez (ttx) on 2010-04-21
Changed in cloud-init (Ubuntu Lucid):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser) wrote :

I enabled 'message' level output from upstart in an instance and caught this failure. This has the cloud-init package attached to bug 566792 installed, with upstart debug set to 'message' and mountall debug turned on. The console log here was attached 40 seconds after boot (normally that would easily be enough time).

Note, as in the other attachment, it doesn't appear that the 'cloud-init-cfg config-ssh' job (invoked from cloud-config-ssh.conf) is being run at all.

Other information I did get was that ssh gave the following error message:
Read from socket failed: Connection reset by peer

and ssh-keyscan showed:
# 192.168.1.193 SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu3
Read from socket failed: Connection reset by peer

tags: added: iso-testing
Scott Moser (smoser) wrote :

Closing as "Won't Fix". This is most likely a Eucalyptus bug, and one that is no longer reproducible. If we get a reproducible system, we can revisit it.

Changed in cloud-init (Ubuntu):
status: Confirmed → Won't Fix
Changed in cloud-init (Ubuntu Lucid):
status: Confirmed → Won't Fix