MAAS disconnect treated as fatal during bootstrap
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Fix Released | High | Eric Claude Jones |
Bug Description
juju may not wait long enough for the bootstrap node to become 'deployed'.
https:/
19:53:41 INFO juju.environs.sync sync.go:333 using official agent binary 2.3.5-xenial-amd64 (28053kB)
19:53:42 INFO cmd bootstrap.go:389 Starting new instance for initial controller
19:53:42 INFO cmd bootstrap.go:151 Launching controller instance(s) on foundations-maas...
19:53:43 DEBUG juju.cloudconfi
19:53:44 DEBUG juju.service discovery.go:63 discovered init system "systemd" from series "xenial"
19:53:44 DEBUG juju.provider.maas environ.go:1019 maas user data; 3832 bytes
19:53:46 DEBUG juju.provider.maas environ.go:1051 started instance "4cfdbq"
19:53:46 INFO cmd bootstrap.go:225 - 4cfdbq (arch=amd64 mem=32G cores=8)
19:53:46 INFO juju.environs.
19:53:46 INFO juju.environs.
19:53:46 INFO cmd bootstrap.go:425 Installing Juju agent on bootstrap instance
19:53:47 INFO cmd bootstrap.go:517 Fetching Juju GUI 2.12.1
19:57:17 ERROR juju.cmd.
19:57:17 DEBUG juju.cmd.
maas log:
Mar 5 19:53:45 swoobat maas.power: [info] Changing power state (on) of node: juju-3 (4cfdbq)
Mar 5 19:53:50 swoobat maas.power: [info] Changed power state (on) of node: juju-3 (4cfdbq)
Mar 5 19:54:02 swoobat maas.interface: [info] eno1 (physical) on swoobat: New MAC, IP binding observed: 52:54:00:6f:5d:37, 10.244.40.219
Mar 5 19:55:09 swoobat maas.rpc.
Mar 5 19:58:12 swoobat maas.node: [info] juju-3: Status transition from DEPLOYING to DEPLOYED
Changed in juju: | |
importance: | Undecided → High |
status: | New → Incomplete |
summary: |
- juju bootstrap not seeing node in deployed state
+ MAAS disconnect treated as fatal during bootstrap |
Changed in juju: | |
status: | New → Triaged |
Changed in juju: | |
assignee: | nobody → Eric Claude Jones (ecjones) |
Changed in juju: | |
status: | Triaged → In Progress |
Changed in juju: | |
milestone: | none → 2.3.5 |
Changed in juju: | |
status: | Fix Committed → Fix Released |
Juju should have a default bootstrap timeout of 1200 seconds (AFAICT). That
doesn't seem to match what you're seeing here, but I know it can be
configured with
juju bootstrap --config bootstrap-timeout=SECONDS
Is it possible that a value was being passed? What is the total time from
the start of the process?
Actually, looking at the Juju log, it was something else:
19:57:17 ERROR juju.cmd.juju.commands bootstrap.go:528 failed to bootstrap
model: bootstrap instance started but did not change to Deployed state:
getting instance "4cfdbq": unexpected: Get http://10.244.40.33/MAAS/api/2.0/machines/?agent_name=10c266c3-9978-43a4-8a64-b9dea0e790f5&id=4cfdbq:
dial tcp 10.244.40.33:80: i/o timeout
^- That says that while we were polling MAAS to see if the machine was
ready, our connection attempt to MAAS timed out. So we failed to
contact MAAS to get updated status information on the node that we were
deploying.
Do you have any idea why we would be getting a TCP CONNECT failure talking
to MAAS during this time?
On Mon, Mar 5, 2018 at 11:49 PM, Ashley Lai <email address hidden>
wrote:
> ** Attachment added: "infra-logs.tar"
> https://bugs.launchpad.net/juju/+bug/1753595/+attachment/5070257/+files/infra-logs.tar