insufficient tolerance to ssh connection timeouts
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
maas-deployer | Fix Released | High | Edward Hope-Morley | | |
Bug Description
There is a race condition of sorts between the moment the deployer is able to ssh into the controller and the moment the MAAS API is actually up.
2015-08-20 18:14:53,705 DEBUG Waiting for MAAS vm to start.
2015-08-20 18:14:54,706 DEBUG Executing: 'ssh -i /home/ubuntu/
2015-08-20 18:14:55,554 DEBUG MAAS vm started.
2015-08-20 18:14:55,554 DEBUG Logging into maas host '192.168.122.2'
ssh: connect to host 192.168.122.2 port 22: Connection refused
2015-08-20 18:14:55,563 DEBUG Fetching MAAS api key
2015-08-20 18:14:55,563 DEBUG Executing: 'ssh -i /home/ubuntu/
ssh: connect to host 192.168.122.2 port 22: Connection refused
2015-08-20 18:14:55,568 DEBUG Command failed - retrying in 1s
2015-08-20 18:14:56,569 DEBUG Fetching MAAS api key
2015-08-20 18:14:56,569 DEBUG Executing: 'ssh -i /home/ubuntu/
Warning: Permanently added '192.168.122.2' (ECDSA) to the list of known hosts.
sudo: maas-region-admin: command not found
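The "Connection refused" lines in the log above suggest the deployer moves on before sshd is listening. A minimal sketch of one way to wait for that, assuming a hypothetical helper `wait_for_port` (not part of maas-deployer) that polls the TCP port until it accepts a connection or a deadline passes:

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper for illustration; not taken from maas-deployer.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "Connection refused" as well as connect timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the host in the log this might be called as `wait_for_port('192.168.122.2', 22)`. Note that an open port is not enough on its own: the last log line shows ssh succeeding while `maas-region-admin` is still missing, so the command itself also needs to be retried.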
Related branches
- Ante Karamatić (community): Approve
- Diff: 106 lines (+37/-40), 1 file modified: maas_deployer/vmaas/engine.py (+37/-40)
Changed in maas-deployer:
status: New → In Progress
Changed in maas-deployer:
status: In Progress → Fix Committed
Changed in maas-deployer:
status: Fix Committed → Fix Released
Actually, it looks like the failure happens before the API call, on a subsequent ssh attempt. I think we need to make all ssh connections tolerant of timeouts and allow a certain number of retries.
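A rough sketch of what "tolerant of timeouts with a certain number of retries" could look like. This is not the change made in maas_deployer/vmaas/engine.py; the function name `run_with_retries` and its parameters are assumptions chosen for illustration:

```python
import subprocess
import time


def run_with_retries(cmd, max_retries=5, delay=1.0, cmd_timeout=30.0):
    """Run cmd and return its stdout, retrying on failure or timeout.

    Hypothetical helper; names and defaults are illustrative only.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            result = subprocess.run(
                cmd,
                check=True,            # non-zero exit raises CalledProcessError
                capture_output=True,
                text=True,
                timeout=cmd_timeout,   # a hung connection raises TimeoutExpired
            )
            return result.stdout
        except (subprocess.CalledProcessError,
                subprocess.TimeoutExpired) as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(delay)      # fixed pause, like "retrying in 1s"
    raise RuntimeError(
        "command failed after %d attempts" % max_retries
    ) from last_error
```

Routing every ssh invocation through one wrapper like this, for example `run_with_retries(['ssh', '192.168.122.2', 'true'])`, would give the login step the same retry behaviour the log already shows for the API key fetch. The retry count is bounded so a controller that never comes up still produces an error rather than hanging the deployer.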