lxd cannot bootstrap with image streams

Bug #1519027 reported by Curtis Hovey
This bug affects 1 person
Affects     Status         Importance   Assigned to   Milestone
juju-core   Fix Released   High         Eric Snow

Bug Description

As seen in
    http://reports.vapour.ws/releases/issue/56533d21749a566e7e686ff2

2015-11-23 16:06:45 ERROR cmd supercommand.go:448 failed to bootstrap environment: waited for 10m0s without being able to connect: Warning: Permanently added '10.0.3.178' (ECDSA) to the list of known hosts.
/var/lib/juju/nonce.txt does not exist

Juju can bootstrap with --upload-tools, but not by running
    juju bootstrap
where we expect juju to use agents from streams.

This is on wily. Juju is built with Go 1.5 and the agents are made from the same deb.

Changed in juju-core:
milestone: 1.26-alpha2 → 1.26-beta1
Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Eric Snow (ericsnowcurrently)
Eric Snow (ericsnowcurrently) wrote :

I have not reproduced this timeout locally. Bootstrap *does* fail when using the stream, though. It happens on the new controller instance when "jujud bootstrap-state" is run (and it can't find the "lxd" provider).

Eric Snow (ericsnowcurrently) wrote :

The problem is that we use a single image ("ubuntu") for all series. The provider doesn't verify that the image matches the series the client requested for a new instance. Instead, you always get the series of the original image. In this case, that is a trusty image, which does not support the LXD provider.

We will need to provide a way to support multiple "ubuntu" images, one for each desired series.
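
To illustrate the idea of one image per series, here is a minimal sketch in Go. It is not the actual provider code: the names aliasForSeries and pickImage are hypothetical, and the "ubuntu-<series>" alias scheme is an assumption. The point is only that the lookup is keyed on the requested series and fails loudly instead of silently starting whatever the single "ubuntu" alias points at.

```go
package main

import (
	"errors"
	"fmt"
)

// aliasForSeries returns a per-series image alias, e.g. "ubuntu-wily".
// The "ubuntu-<series>" naming scheme is an assumption for this sketch.
func aliasForSeries(series string) string {
	return "ubuntu-" + series
}

// pickImage returns the alias to start for the requested series.
// registered holds the aliases known locally. If no image is registered
// for the series, it returns an error rather than falling back to an
// image of a different series.
func pickImage(series string, registered map[string]bool) (string, error) {
	if series == "" {
		return "", errors.New("no series requested")
	}
	alias := aliasForSeries(series)
	if !registered[alias] {
		return "", fmt.Errorf("no image registered for series %q (expected alias %q)", series, alias)
	}
	return alias, nil
}

func main() {
	// Only a trusty image is registered, so asking for wily is an error.
	registered := map[string]bool{"ubuntu-trusty": true}
	if _, err := pickImage("wily", registered); err != nil {
		fmt.Println("error:", err)
	}
}
```

With a check like this, a wily bootstrap against a host that only has a trusty image would fail immediately with a clear message, rather than timing out after starting the wrong series.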

Curtis Hovey (sinzui)
summary: - lxd cannot bootstrap with streams
+ lxd cannot bootstrap with image streams
Curtis Hovey (sinzui) wrote :

As seen in this test, we can force a pass
    http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/lxd-deploy-wily-amd64/51/console

I deleted the "ubuntu" alias that pointed to the trusty image. I imported the wily cloud-image
    lxd-images import ubuntu wily --alias ubuntu

The test with --upload-tools worked when the default series was trusty because --upload-tools manufactures fake agents. The trusty lxd image was started, and the wily/go1.5 agents with lxd support claimed to be trusty.

When we realised that trusty clients and agents are not compiled with lxd and wanted to bootstrap with --upload-tools, we changed the default series to wily. As Eric states in comment 2, there is only one "ubuntu" image registered. Juju does not download images, so the wrong image was started.

Since the right image is now registered, CI will pass. We can update the release notes with the corrected image registration. As a half step, we could explain the two cases, wily and trusty.

Eric Snow (ericsnowcurrently) wrote :

http://reviews.vapour.ws/r/3222/

That patch addresses the multiple images issue. However, it looks like there may also be a problem with the published wily image. [1] It appears that the image does not have cloud-init. That means that Juju's cloud-init script never gets run, leading to the timeout that Curtis originally reported. The trusty image (pulled the same way) does not have this problem.

[1] The image comes from running: lxd-images import ubuntu --alias ubuntu-wily wily

Eric Snow (ericsnowcurrently) wrote :

It turns out that the wily image was fine. Systemd (which wily uses) won't work in LXC containers if the host isn't running lxcfs. My local machine had a bunch of old packages and did not have lxcfs installed. After updating and installing, I was able to bootstrap the LXD provider with a wily instance (with my patch applied).
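
For anyone hitting the same timeout, a quick host-side check for lxcfs might look like the sketch below. It rests on an assumption that is not stated in this report: that a running lxcfs shows up in /proc/mounts as a mount with filesystem type "fuse.lxcfs". The function name hasLXCFS is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hasLXCFS reports whether mount-table text in /proc/mounts format
// contains an lxcfs FUSE mount. The "fuse.lxcfs" filesystem type is an
// assumption about how lxcfs appears on the host.
func hasLXCFS(mounts string) bool {
	scanner := bufio.NewScanner(strings.NewReader(mounts))
	for scanner.Scan() {
		// Fields per line: device, mount point, fs type, options, ...
		fields := strings.Fields(scanner.Text())
		if len(fields) >= 3 && fields[2] == "fuse.lxcfs" {
			return true
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Println("cannot read /proc/mounts:", err)
		return
	}
	if hasLXCFS(string(data)) {
		fmt.Println("lxcfs appears to be mounted")
	} else {
		fmt.Println("lxcfs does not appear to be mounted")
	}
}
```

If the check reports no lxcfs mount on a host that is about to start wily (systemd) containers, updating the host and installing lxcfs, as described above, is the first thing to try.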

Changed in juju-core:
status: In Progress → Fix Committed
Changed in juju-core:
milestone: 1.26-beta1 → 2.0-alpha1
Changed in juju-core:
milestone: 2.0-alpha1 → 1.26-alpha3
Curtis Hovey (sinzui)
Changed in juju-core:
importance: Critical → High
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
tags: added: 2.0-count