While reproducing this, I hit one more issue that is quite important: we don't report any errors to the user while this is happening.
Specifically, I tweaked my config to reproduce this bug, forgot about the change, and came back 1 week later to find that we report *0* warnings or failures except at debug level.
So `juju bootstrap lxd lxd` in this situation just sits there, comes back after 20 minutes with 'failed to bootstrap', and kills your instances. There is also nothing in cloud-init-output.log (because the failure is client side).
I think at a minimum we should be trying to surface this error:
```
09:12:41 DEBUG juju.provider.common bootstrap.go:669 connection attempt for 10.8.158.125 failed: /home/jameinel/.ssh/config: line 3: Bad configuration option: pubkeyacceptedalgorithms
/home/jameinel/.ssh/config: terminating, 1 bad configuration options
```
I know that the reason we don't surface SSH errors by default is that we *expect* the controller not to be up immediately, so we don't want to scare users by saying that we failed to connect.
But we need something that can say "I think I can retry this error, but I have been retrying it for 1 minute, so I should surface something".
Note that the errors that we explicitly want to suppress are these:
```
09:18:46 DEBUG juju.provider.common bootstrap.go:669 connection attempt for 10.8.158.251 failed: ssh: connect to host 10.8.158.251 port 22: Connection refused
09:18:52 DEBUG juju.provider.common bootstrap.go:669 connection attempt for 10.8.158.251 failed: /var/lib/juju/nonce.txt does not exist
```
Those are both cases where the machine hasn't finished initializing, and it is a race condition between the client trying to connect and the machine not being done with cloud-init.
But "terminating, 1 bad configuration options" is a permanent failure that needs human intervention.
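To make the distinction concrete, here is a minimal sketch of classifying the failure text, assuming we are willing to match on substrings of the messages quoted above. The `classify` function and both marker lists are hypothetical, not existing Juju code, and matching on ssh output like this is fragile, so treat it as an illustration only.

```go
package main

import "strings"

// transientMarkers are substrings of failures we expect while the machine
// is still running cloud-init. Taken from the log lines quoted above.
var transientMarkers = []string{
	"Connection refused",
	"/var/lib/juju/nonce.txt does not exist",
}

// permanentMarkers are substrings of failures that retrying cannot fix.
// Taken from the ssh config error quoted above.
var permanentMarkers = []string{
	"Bad configuration option",
	"bad configuration options",
}

// classify is a hypothetical helper that labels a failure message as
// "permanent", "transient", or "unknown". Permanent markers are checked
// first so a known-bad message is never hidden as retryable.
func classify(msg string) string {
	for _, m := range permanentMarkers {
		if strings.Contains(msg, m) {
			return "permanent"
		}
	}
	for _, m := range transientMarkers {
		if strings.Contains(msg, m) {
			return "transient"
		}
	}
	return "unknown"
}
```

A "permanent" result could be surfaced to the user immediately, while "unknown" results could still be retried but surfaced once they have been failing for a while.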