juju bootstrap fails with MAAS trunk and juju (beta12 behind a proxy too)

Bug #1607749 reported by Brendan Donegan
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
juju-core
In Progress
Critical
Dimiter Naydenov

Bug Description

We run tests in the MAAS CI lab for using Juju 2 with MAAS 2 and since Juju 2 has been updated to Beta 12, we have been observing the following failure on juju bootstrap:

testtools.testresult.real._StringException: juju --debug bootstrap autopkgtest testmaas --upload-tools --config=/tmp/tmpbwyp0zz3: {{{
2016-07-29 11:08:28 INFO juju.cmd supercommand.go:63 running juju [2.0-beta12 gc go1.6.2]
2016-07-29 11:08:29 INFO cmd cmd.go:141 Adding contents of "/root/.local/share/juju/ssh/juju_id_rsa.pub" to authorized-keys
2016-07-29 11:08:29 INFO cmd cmd.go:141 Adding contents of "/root/.ssh/id_rsa.pub" to authorized-keys
2016-07-29 11:08:29 DEBUG juju.cmd.juju.commands bootstrap.go:471 preparing controller with config: map[type:maas name:controller uuid:f27b66a4-2def-4897-8efa-785b1b8a8236 https-proxy:http://162.213.35.104:3128 http-proxy:http://162.213.35.104:3128 authorized-keys:ssh-rsa <redacted> juju-client-key
ssh-rsa <redacted> root@autopkgtest
]
2016-07-29 11:08:29 DEBUG juju.provider.maas environprovider.go:32 opening model "controller".
2016-07-29 11:08:29 INFO cmd cmd.go:129 Creating Juju controller "autopkgtest" on testmaas
2016-07-29 11:08:29 INFO juju.cmd.juju.commands bootstrap.go:594 combined bootstrap constraints:
2016-07-29 11:08:29 INFO cmd cmd.go:129 Bootstrapping model "controller"
2016-07-29 11:08:29 DEBUG juju.environs.bootstrap bootstrap.go:214 model "controller" supports service/machine networks: true
2016-07-29 11:08:29 DEBUG juju.environs.bootstrap bootstrap.go:216 network management by juju enabled: true
2016-07-29 11:08:29 INFO cmd cmd.go:129 Starting new instance for initial controller
Launching instance
2016-07-29 11:08:29 WARNING juju.provider.maas environ.go:688 no architecture was specified, acquiring an arbitrary node
2016-07-29 11:08:30 DEBUG juju.cloudconfig.instancecfg instancecfg.go:772 Setting numa ctl preference to false
2016-07-29 11:08:30 DEBUG juju.service discovery.go:62 discovered init system "systemd" from series "xenial"
2016-07-29 11:08:30 DEBUG juju.provider.maas environ.go:943 maas user data; 7436 bytes
2016-07-29 11:08:30 DEBUG juju.provider.maas interfaces.go:274 interface "enp3s0" has no address
2016-07-29 11:08:30 DEBUG juju.provider.maas interfaces.go:274 interface "enp2s0" has no address
2016-07-29 11:08:30 DEBUG juju.provider.maas environ.go:976 started instance "yhhche"
2016-07-29 11:08:31 DEBUG juju.provider.common state.go:36 putting "provider-state" to bootstrap storage *maas.maas2Storage
 - yhhche
2016-07-29 11:08:31 INFO juju.environs.bootstrap bootstrap.go:517 newest version: 2.0-beta12.1
2016-07-29 11:08:31 INFO juju.environs.bootstrap bootstrap.go:545 picked bootstrap tools version: 2.0-beta12.1
2016-07-29 11:08:31 INFO cmd cmd.go:129 Building tools to upload (2.0-beta12.1-xenial-amd64)
2016-07-29 11:08:31 DEBUG juju.environs.sync sync.go:303 Building tools
2016-07-29 11:08:31 DEBUG juju.environs.tools build.go:125 looking for: juju
2016-07-29 11:08:31 DEBUG juju.environs.tools build.go:164 checking: /usr/lib/juju-2.0/bin/jujud
2016-07-29 11:08:31 INFO juju.environs.tools build.go:170 found existing jujud
2016-07-29 11:08:31 INFO juju.environs.tools build.go:180 target: /tmp/juju-tools549715228/jujud
2016-07-29 11:08:31 DEBUG juju.environs.tools build.go:235 forcing version to 2.0-beta12.1
2016-07-29 11:08:31 DEBUG juju.environs.tools build.go:41 adding entry: &tar.Header{Name:"FORCE-VERSION", Mode:420, Uid:0, Gid:0, Size:12, ModTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, Typeflag:0x30, Linkname:"", Uname:"ubuntu", Gname:"ubuntu", Devmajor:0, Devminor:0, AccessTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, ChangeTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, Xattrs:map[string]string(nil)}
2016-07-29 11:08:31 DEBUG juju.environs.tools build.go:41 adding entry: &tar.Header{Name:"jujud", Mode:493, Uid:0, Gid:0, Size:61557568, ModTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, Typeflag:0x30, Linkname:"", Uname:"ubuntu", Gname:"ubuntu", Devmajor:0, Devminor:0, AccessTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, ChangeTime:time.Time{sec:63605387311, nsec:756000000, loc:(*time.Location)(0x378bae0)}, Xattrs:map[string]string(nil)}
2016-07-29 11:08:43 INFO juju.environs.sync sync.go:322 built tools 2.0-beta12.1-xenial-amd64 (14384kB)
2016-07-29 11:08:43 INFO cmd cmd.go:129 Installing Juju agent on bootstrap instance
2016-07-29 11:12:58 INFO cmd cmd.go:129 Unable to fetch Juju GUI info: error fetching simplestreams metadata: invalid URL "https://streams.canonical.com/juju/gui/streams/v1/index.sjson" not found
2016-07-29 11:16:30 DEBUG juju.cloudconfig.instancecfg instancecfg.go:772 Setting numa ctl preference to false
Waiting for address
Attempting to connect to 10.245.136.7:22
2016-07-29 11:16:31 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: ssh: connect to host 10.245.136.7 port 22: Connection refused
2016-07-29 11:16:36 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: ssh: connect to host 10.245.136.7 port 22: Connection refused
2016-07-29 11:16:42 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: Warning: Permanently added '10.245.136.7' (ECDSA) to the list of known hosts.
/var/lib/juju/nonce.txt does not exist
2016-07-29 11:16:47 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: /var/lib/juju/nonce.txt does not exist
2016-07-29 11:16:53 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: /var/lib/juju/nonce.txt does not exist
2016-07-29 11:16:58 DEBUG juju.provider.common bootstrap.go:330 connection attempt for 10.245.136.7 failed: /var/lib/juju/nonce.txt does not exist
2016-07-29 11:17:05 DEBUG juju.utils.ssh ssh.go:292 using OpenSSH ssh client
Logging to /var/log/cloud-init-output.log on remote host
Running apt-get update
Running apt-get upgrade
Installing package: curl
Installing package: cpu-checker
Installing package: bridge-utils
Installing package: cloud-utils
Installing package: cloud-image-utils
Installing package: tmux
Bootstrapping Juju machine agent
Starting Juju machine agent (jujud-machine-0)
2016-07-29 11:18:17 INFO cmd cmd.go:129 Bootstrap agent installed
2016-07-29 11:18:17 DEBUG juju.juju api.go:246 API hostnames [10.245.136.7:17070] - resolving hostnames
2016-07-29 11:18:17 INFO juju.juju api.go:268 new API addresses to cache [10.245.136.7:17070]
2016-07-29 11:18:17 INFO juju.juju api.go:77 connecting to API addresses: [10.245.136.7:17070]
2016-07-29 11:18:17 INFO juju.api apiclient.go:520 dialing "wss://10.245.136.7:17070/model/012ca0ac-b306-4315-813b-4aa1cb6d4cbb/api"
2016-07-29 11:18:17 DEBUG juju.api apiclient.go:526 error dialing "wss://10.245.136.7:17070/model/012ca0ac-b306-4315-813b-4aa1cb6d4cbb/api", will retry: websocket.Dial wss://10.245.136.7:17070/model/012ca0ac-b306-4315-813b-4aa1cb6d4cbb/api: dial tcp 10.245.136.7:17070: getsockopt: connection refused
2016-07-29 11:18:19 INFO juju.api apiclient.go:520 dialing "wss://10.245.136.7:17070/model/012ca0ac-b306-4315-813b-4aa1cb6d4cbb/api"
2016-07-29 11:18:19 INFO juju.api apiclient.go:297 connection established to "wss://10.245.136.7:17070/model/012ca0ac-b306-4315-813b-4aa1cb6d4cbb/api"
2016-07-29 11:18:19 DEBUG juju.juju api.go:251 API hostnames unchanged - not resolving
2016-07-29 11:18:20 DEBUG juju.api apiclient.go:546 health ping failed: connection is shut down
2016-07-29 11:18:20 INFO cmd cmd.go:129 Waiting for API to become available: upgrade in progress (upgrade in progress)
(... repeats from 'connecting to API addresses'...)
2016-07-29 11:19:18 INFO juju.provider.common destroy.go:22 destroying model "controller"
2016-07-29 11:19:18 INFO juju.provider.common destroy.go:33 destroying instances
2016-07-29 11:19:18 DEBUG juju.provider.common state.go:36 putting "provider-state" to bootstrap storage *maas.maas2Storage
2016-07-29 11:19:18 INFO juju.provider.common destroy.go:53 destroying storage
2016-07-29 11:19:18 ERROR cmd supercommand.go:458 unable to contact api server after 61 attempts: upgrade in progress (upgrade in progress)
2016-07-29 11:19:18 DEBUG cmd supercommand.go:459 (error details: [{github.com/juju/juju/cmd/juju/common/controller.go:130: unable to contact api server after 61 attempts} {github.com/juju/juju/api/block/client.go:29: } {github.com/juju/juju/api/apiclient.go:584: } {github.com/juju/juju/rpc/client.go:149: } {upgrade in progress (upgrade in progress)}])
}}}

On the MAAS side of things the controller node seems to deploy succesfully but when bootstrap fails it is automatically released.

Revision history for this message
Brendan Donegan (brendan-donegan) wrote :
description: updated
Revision history for this message
Martin Packman (gz) wrote :

Can you bootstrap with --keep-broken then ssh in and collect the contents of /var/log/cloud-init*.log and /var/log/juju/* please?

Changed in juju-core:
status: New → Incomplete
Revision history for this message
Martin Packman (gz) wrote :

See bug 1605335 as well, but given the 'upgrade' reported this is likely that the api server worker never came up in a working state for some reason, which the juju logs should have more details on.

Though, may also want to do post-init debug logging, so, before bootstrap:

export JUJU_LOGGING_CONFIG="<root>=DEBUG"

Revision history for this message
Brendan Donegan (brendan-donegan) wrote :
Revision history for this message
Brendan Donegan (brendan-donegan) wrote :
Revision history for this message
Brendan Donegan (brendan-donegan) wrote :
Revision history for this message
Brendan Donegan (brendan-donegan) wrote :
Revision history for this message
Brendan Donegan (brendan-donegan) wrote :

Logs added. I didn't get to try JUJU_LOGGING_CONFIG but if you need me to, please say. Thanks

Revision history for this message
Martin Packman (gz) wrote :

Okay, there is an upgrade attempted, so this looks like a bad variation on bug 1605050 which is fortunately fixed on master.

From the machine log:

2016-07-29 13:51:48 INFO juju.worker.upgrader upgrader.go:178 desired tool version: 2.0-beta12
2016-07-29 13:51:48 INFO juju.worker.upgrader upgrader.go:199 upgrade requested from 2.0-beta12.1 to 2.0-beta12
...
2016-07-29 13:56:06 INFO juju.environs.tools tools.go:106 finding tools in stream "devel"
2016-07-29 13:56:06 INFO juju.environs.tools tools.go:110 reading tools with major version -1
2016-07-29 13:56:06 INFO juju.environs.tools tools.go:116 filtering tools by version: 2.0-beta12
2016-07-29 13:56:06 INFO juju.environs.tools tools.go:119 filtering tools by series: xenial
2016-07-29 13:56:06 INFO juju.environs.tools tools.go:122 filtering tools by architecture: amd64
...
2016-07-29 14:00:21 DEBUG juju.worker.dependency engine.go:479 "upgrader" manifold worker stopped: no tools available
2016-07-29 14:00:21 ERROR juju.worker.dependency engine.go:526 "upgrader" manifold worker returned unexpected error: no tools available
2016-07-29 14:00:24 DEBUG juju.worker.dependency engine.go:465 "upgrader" manifold worker started

The problem is likely that the test machine doesn't have routing to streams.canonical.com or otherwise can't fetch the real devel tools, so just gets stuck in a loop with the upgrader restarting forever. Could confirm this by trying a bootstrap with beta12 and not using --upload-tools but instead letting the box access streams.canonical.com or mirroring somewhere it can access and setting agent-metadata-url config.

Changed in juju-core:
importance: Undecided → Critical
milestone: none → 2.0-beta14
status: Incomplete → Triaged
tags: added: bootstrap maas-provider
summary: - juju bootstrap fails with MAAS trunk and juju beta12
+ juju bootstrap fails with MAAS trunk and juju beta12 behind a proxy
Revision history for this message
Andres Rodriguez (andreserl) wrote : Re: juju bootstrap fails with MAAS trunk and juju beta12 behind a proxy

Hi Martin,

This is how we bootstrap juju:

juju --debug bootstrap autopkgtest testmaas --keep-broken --config=/tmp/tmp0k2f4hy6

The config file above containers:

http-proxy: {proxy}
https-proxy: {proxy}

Where proxy is the squid proxy that we setup in that environment. So, unless we are doing a wrong config, it seems that Juju is failing to access streams.canonical.com behind a proxy.

Brendan has tested that the deployed machine can access streams.canonical.com via the proxy:

16:36 < brendand> roaksoax, i can wget from streams.canonical.com if i specify http_proxy=<proxy ip>

And obviously, as part of the CI, MAAS is configured to use the external proxy as well, hence, we have validated that the proxy gives access to the outside world.

So it is either that we are not passing the config correctly, juju is not taking it, or if it is, then Juju is failing to access the internet via the proxy.

Thanks.

summary: - juju bootstrap fails with MAAS trunk and juju beta12 behind a proxy
+ juju bootstrap fails with MAAS trunk and juju (beta12 behind a proxy
+ too)
Revision history for this message
Anastasia (anastasia-macmood) wrote :

@Andres, Brendan

In the bug description, you said that you are seeing this behavior "since beta12".
But all the logs here refer to beta12 only...
So, just to confirm, you are seeing the same behavior with beta13?

Revision history for this message
Brendan Donegan (brendan-donegan) wrote :

@Anastasia

We are testing with the version of Juju in the archive, is beta13 already available there? We can give it a try if you like

Changed in juju-core:
assignee: nobody → Dimiter Naydenov (dimitern)
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta14 → 2.0-beta15
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

With beta12, a tweaked maas 2.0 setup, I managed to reproduce the issue by bootstrapping and setting both http- and https-proxy.

MAAS config for bind9 and squid3 was modified as follows:
 - bind9 resolves streams.canonical.com to MAAS IP (10.14.0.1)
 - /etc/hosts lists the real IP of streams.c.c
 - squid3 uses the same dns_nameservers 10.14.0.1 (as the nodes)

Tests run on the MAAS server:
wget http://streams.canonical.com/juju/public-clouds.syaml # works
wget https://streams.canonical.com/juju/public-clouds.syaml # works
http_proxy=10.14.0.1:8000 wget http://streams.canonical.com/juju/public-clouds.syaml # works
https_proxy=10.14.0.1:8000 wget https://streams.canonical.com/juju/public-clouds.syaml # works
http_proxy=10.14.0.1:3128 wget http://streams.canonical.com/juju/public-clouds.syaml # FAILS
https_proxy=10.14.0.1:3128 wget https://streams.canonical.com/juju/public-clouds.syaml # FAILS

I couldn't get it to work in transparent mode (port 3128).

Then I deployed a node with xenial (without Juju) and run the same tests there.
As expected:
 - first two failed with 404 (as they were hitting MAAS)
 - the last two failed like on the MAAS server
 - the middle two worked (using the MAAS proxy on 8000).

Then I released that node and bootstrapped with Juju beta12 (from the release tag) and saw the same error eventually.

I'm retrying now with the tip of master, but so far it seems it will be the same case.
The root cause seems to be not setting http_proxy etc. BEFORE trying to figure out if an upgrade is needed (hitting streams.c.c), because the proxy updater worker cannot start (blocked on the same "upgrade in progress" check). Catch 22 classic.

Changed in juju-core:
status: Triaged → In Progress
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

OK, I see the problem now - I didn't manage to reproduce the issue WITHOUT --upload-tools passed to bootstrap and using 2.0-beta12 binaries.

NOTE: You *really* shouldn't be using `juju bootstrap --upload-tools` unless the juju binary (and jujud with it) where built from source.

There are a couple of issues here. As be observed in the machine-0.log:

2016-08-05 15:56:23 INFO juju.cmd supercommand.go:63 running jujud [2.0-beta12.1 gc go1.6.2]
...
2016-08-05 15:56:24 INFO juju.worker.upgradesteps worker.go:72 no upgrade steps required or upgrade steps for 2.0-beta12.1 have already been run.
...
2016-08-05 15:56:27 INFO juju.worker.upgrader upgrader.go:178 desired tool version: 2.0-beta12
2016-08-05 15:56:27 INFO juju.worker.upgrader upgrader.go:199 upgrade requested from 2.0-beta12.1 to 2.0-beta12

beta12.1 -> beta12 is what I'd call "downgrade", not upgrade and that's the root cause.

However, without --upload-tools it works just fine. It also works with built from source master tip (beta15) AND --upload-tools as the version is more recent than the last release.

Using proxy or not won't make a difference, because --upload-tools causes the version juju thinks it's using is ALWAYS more recent than the one in the streams.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

After further investigation, I managed to successfully bootstrap with a beta14 client and --upload-tools. The behaviour described above is as expected for the upgrader worker, but the discrepancy between the agent config's recorded version (2.0-beta12) and the uploaded tools version (2.0-beta12.1) turns out to be already fixed with bug 1605050, of which this bug is a duplicate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.