juju bootstrap on armhf/keystone hangs juju version 1.24.5

Bug #1496184 reported by Michael Reed
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
juju-core  Invalid  High        Unassigned
1.24       Invalid  High        Unassigned
1.25       Invalid  High        Unassigned

Bug Description

This is something of a continuation of https://bugs.launchpad.net/juju-core/+bug/1415517, as the issue still persists. I can bootstrap an armhf system only inconsistently: sometimes it works once, at other times twice in a row, but then it typically hangs again. Oddly enough, MAAS reports that the system is in a "deployed" state.

botarmhf@qa:~$ juju bootstrap --show-log --constraints tags=slayton
2015-09-15 19:34:24 INFO juju.cmd supercommand.go:37 running juju [1.24.5-trusty-amd64 gc]
2015-09-15 19:34:24 INFO juju.network network.go:194 setting prefer-ipv6 to false
Bootstrapping environment "maas"
2015-09-15 19:34:24 INFO juju.environs.tools tools.go:86 reading tools with major.minor version 1.24
Starting new instance for initial state server
2015-09-15 19:34:27 INFO juju.provider.maas environ.go:130 address allocation feature disabled; using "juju-br0" bridge for all containers
Launching instance
2015-09-15 19:34:27 WARNING juju.provider.maas environ.go:706 no architecture was specified, acquiring an arbitrary node
2015-09-15 19:34:32 INFO juju.provider.maas environ.go:1088 could not acquire a node in zone "avaton", trying another zone
2015-09-15 19:34:32 WARNING juju.provider.maas environ.go:706 no architecture was specified, acquiring an arbitrary node
2015-09-15 19:34:37 INFO juju.provider.maas environ.go:1088 could not acquire a node in zone "default", trying another zone
2015-09-15 19:34:37 WARNING juju.provider.maas environ.go:706 no architecture was specified, acquiring an arbitrary node
2015-09-15 19:34:42 INFO juju.provider.maas environ.go:1088 could not acquire a node in zone "mcdivitt", trying another zone
2015-09-15 19:34:42 WARNING juju.provider.maas environ.go:706 no architecture was specified, acquiring an arbitrary node
 - /MAAS/api/1.0/nodes/node-9ae91a7e-5bbe-11e5-b69c-00163ec335e8/
2015-09-15 19:44:13 INFO juju.environs.bootstrap bootstrap.go:184 newest version: 1.24.5
2015-09-15 19:44:13 INFO juju.environs.bootstrap bootstrap.go:212 picked bootstrap tools version: 1.24.5
Installing Juju agent on bootstrap instance
Waiting for address
Attempting to connect to ms10-18n4-slayton.1ss:22
Attempting to connect to ms10-18n4-slayton.1ss:22
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
7b:6e:38:33:14:de:57:cd:24:3b:c1:d4:9c:ac:9c:76.
Please contact your system administrator.
Add correct host key in /home/botarmhf/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/botarmhf/.ssh/known_hosts:1
  remove with: ssh-keygen -f "/home/botarmhf/.ssh/known_hosts" -R ms10-18n4-slayton.1ss
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
Logging to /var/log/cloud-init-output.log on remote host
Running apt-get update
Running apt-get upgrade
Installing package: curl
Installing package: cpu-checker
Installing package: bridge-utils
Installing package: rsyslog-gnutls
Installing package: cloud-utils
Installing package: cloud-image-utils
Installing package: tmux
Fetching tools: curl -sSfw 'tools from %{url_effective} downloaded: HTTP %{http_code}; time %{time_total}s; size %{size_download} bytes; speed %{speed_download} bytes/s ' --retry 10 -o $bin/tools.tar.gz <[https://streams.canonical.com/juju/tools/releases/juju-1.24.5-trusty-armhf.tgz]>
Bootstrapping Juju machine agent
Starting Juju machine agent (jujud-machine-0)
Bootstrap agent installed
2015-09-15 19:46:48 ERROR juju.cmd supercommand.go:430 saving bootstrap endpoint address: failed to get connection info: environment "maas" not found
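
For anyone reading the output above, it may help to see where the time actually goes. The following is a minimal sketch, not part of juju, that scans timestamped log lines and reports the longest pause between two consecutive ones. The function name `largest_gap` is hypothetical; the sample lines are copied from the log above, and lines without a leading timestamp are simply skipped.

```python
import re
from datetime import datetime

# Matches the "YYYY-MM-DD HH:MM:SS " prefix used by the juju client log lines.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ")


def largest_gap(lines):
    """Return (seconds, earlier_line, later_line) for the longest pause
    between consecutive timestamped lines, or None if fewer than two exist."""
    stamped = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            stamped.append((when, line))

    best = None
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, l0, l1)
    return best


# A few lines taken verbatim from the bootstrap output above.
sample = [
    '2015-09-15 19:34:42 INFO juju.provider.maas environ.go:1088 could not acquire a node in zone "mcdivitt", trying another zone',
    '2015-09-15 19:34:42 WARNING juju.provider.maas environ.go:706 no architecture was specified, acquiring an arbitrary node',
    ' - /MAAS/api/1.0/nodes/node-9ae91a7e-5bbe-11e5-b69c-00163ec335e8/',
    '2015-09-15 19:44:13 INFO juju.environs.bootstrap bootstrap.go:184 newest version: 1.24.5',
    '2015-09-15 19:46:48 ERROR juju.cmd supercommand.go:430 saving bootstrap endpoint address: failed to get connection info: environment "maas" not found',
]

seconds, before, after = largest_gap(sample)
print(f"longest pause: {seconds:.0f}s")
print("  from:", before)
print("  to:  ", after)
```

On this sample the longest pause falls between the node being acquired (19:34:42) and the tools version being picked (19:44:13), which is 571 seconds.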

Curtis Hovey (sinzui)
tags: added: armhf bootstrap
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.26-alpha1
Changed in juju-core:
milestone: 1.26-alpha1 → 1.24.7
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.24.7 → 1.24.8
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.24.8 → 1.26-alpha1
Curtis Hovey (sinzui) wrote :

Bootstrapping is inconsistent; it sometimes works.

tags: added: bug-squad
Ian Booth (wallyworld) wrote :

To help us diagnose the issue, could we please get logs from a bootstrap done with debug turned on? Please include the machine logs from the server as well as the client CLI output.

Michael Reed (mreed8855) wrote :

Here are the logs

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.26-alpha1 → 1.26-alpha2
Cheryl Jennings (cherylj) wrote :

I see this as the reason bootstrap failed:

2015-10-29 14:47:26 ERROR juju.cmd supercommand.go:430 failed to bootstrap environment: bootstrap instance started but did not change to Deployed state: instance "/MAAS/api/1.0/nodes/node-9ae91a7e-5bbe-11e5-b69c-00163ec335e8/" is started but not deployed

You can try lengthening the timeout (the default is 10 minutes) by setting bootstrap-timeout in your environments.yaml. The value is specified in seconds.
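
Since the setting is given in seconds while people tend to think in minutes, here is a small sketch of the conversion. It only formats the line you would place under your environment's entry in environments.yaml; the helper name `bootstrap_timeout_line` is hypothetical and nothing here calls juju itself.

```python
def bootstrap_timeout_line(minutes):
    """Return the environments.yaml line for a timeout given in minutes."""
    seconds = int(minutes * 60)  # bootstrap-timeout is expressed in seconds
    return f"bootstrap-timeout: {seconds}"


print(bootstrap_timeout_line(10))  # the default mentioned above
print(bootstrap_timeout_line(30))  # an example of a longer timeout
```

This prints `bootstrap-timeout: 600` and `bootstrap-timeout: 1800`; the second line is the kind of value to try when the default is too short.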

If you're still seeing these failures with a longer timeout, we should engage the MAAS team to determine why the node doesn't move to "Deployed". Please keep using --debug when bootstrapping, in case we need to examine the output again. Thanks! :)

Changed in juju-core:
status: Triaged → Incomplete
Michael Reed (mreed8855) wrote :

I have increased the bootstrap-timeout to 30 minutes and it still fails to bootstrap. However, looking at the state of the system in the MAAS UI, it appears that the system did actually move to the Deployed state. We may need to engage the MAAS team.

Cheryl Jennings (cherylj) wrote :

Michael, was the failure message from juju the same: "bootstrap instance started but did not change to Deployed state"? Did the node move to Deployed before or after juju gave up?

Changed in juju-core:
milestone: 1.26-alpha2 → 1.26-beta1
Changed in juju-core:
milestone: 1.26-beta1 → 2.0-alpha2
Sean Feole (sfeole) wrote :

Marking this bug as Invalid, as it is no longer an issue in the latest ppa:juju/stable drop of 1.25.

Changed in juju-core:
status: Incomplete → Invalid
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-alpha2 → none