juju bootstrap does not wait for MAAS nodes to change state to "deployed"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
Currently when bootstrapping a controller, juju only waits until MAAS provides it the IP address - it then enters a loop waiting for connectivity to that host. It does not wait until MAAS marks the host as 'Deployed'.
This can be a problem if for some reason the installation fails, and an old installation is booted. In this case the machine may have the same IP, and even have the juju ssh keys configured. That would result in an old machine with other existing data on it being used as a controller.
This arose for me in testing because maas-dhcpd was down, so my MAAS machine tried to PXE boot, failed, and then booted the old installation from the HDD.
I did not verify if this also applies to machines during normal deployment (outside of bootstrap) - we should also ensure it doesn't happen there.
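To illustrate the behaviour being requested, here is a minimal sketch of a wait loop that blocks until the node reports a deployed state before any SSH connectivity check is attempted. It is not Juju's code: `wait_until_deployed`, `DeploymentError`, and the `get_status` callable are hypothetical names, and the status strings are assumed to match the names MAAS displays for a node.

```python
import time

# Assumed to match the status names MAAS reports for a node.
DEPLOYED = "Deployed"
FAILED = {"Failed deployment", "Broken"}


class DeploymentError(Exception):
    """Raised when the node fails to deploy or the wait times out."""


def wait_until_deployed(get_status, timeout=1800.0, interval=10.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns DEPLOYED.

    get_status is a hypothetical callable returning the node's current
    status string; in practice it would wrap a query to the MAAS API.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == DEPLOYED:
            return status
        if status in FAILED:
            # Stop early instead of connecting to whatever booted on that IP.
            raise DeploymentError(f"node entered state {status!r}")
        if clock() >= deadline:
            raise DeploymentError(
                f"timed out waiting for deployment; last state {status!r}")
        sleep(interval)
```

With a gate like this in front of the "Attempting to connect" step, the PXE failure described above would surface as a failed or timed-out deployment rather than as a successful connection to the old installation.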
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
I'm able to replicate the issue on 2.6.5-bionic-amd64. Bootstrap hangs at "Running machine configuration script..." until I hit Ctrl-C:
$ juju bootstrap node-amontons
Creating Juju controller "node-amontons" on node-amontons
Looking for packaged Juju agent version 2.6.5 for amd64
Launching controller instance(s) on node-amontons...
- srt88n (arch=amd64 mem=3.5G cores=1)
Installing Juju agent on bootstrap instance
Fetching Juju GUI 2.14.0
Waiting for address
Attempting to connect to 172.16.99.2:22
Connected to 172.16.99.2
Running machine configuration script...
^CInterrupt signalled: waiting for bootstrap to exit
Bootstrap agent now started
Contacting Juju controller at 172.16.99.2 to verify accessibility...
ERROR unable to contact api server after 1 attempts: unable to connect to API: dial tcp 172.16.99.2:17070: connect: connection refused
MAAS version: 2.6.0 (7802-g59416a869-0ubuntu1~18.04.1)