Canonical Juju

Juju2: 'Creating container: failed to ensure LXD image: image not imported!'

Bug #1650304 reported by Larry Michel on 2016-12-15

This bug affects 2 people

Affects	Status	Importance	Assigned to	Milestone
Canonical Juju	Expired	Critical	Unassigned
2.1	Expired	Low	Unassigned

Bug Description

We're seeing on our staging environment:

  "1":
    juju-status:
      current: started
      since: 15 Dec 2016 11:35:42Z
      version: 2.1-beta2
    dns-name: 10.245.16.191
    ip-addresses:
    - 10.245.16.191
    instance-id: 4y3khr
    machine-status:
      current: running
      message: Deployed
      since: 15 Dec 2016 11:31:46Z
    series: xenial
    containers:
      1/lxd/0:
        juju-status:
          current: down
          message: agent is not communicating with the server
          since: 15 Dec 2016 11:37:12Z
        instance-id: pending
        machine-status:
          current: provisioning error
          message: 'Creating container: failed to ensure LXD image: image not imported!'
          since: 15 Dec 2016 11:37:12Z
        series: xenial
      1/lxd/1:
        juju-status:
          current: down
          message: agent is not communicating with the server
          since: 15 Dec 2016 11:37:59Z
        instance-id: pending
        machine-status:
          current: provisioning error
          message: 'Creating container: failed to ensure LXD image: image not imported!'
          since: 15 Dec 2016 11:37:58Z
        series: xenial
    hardware: arch=amd64 cores=8 mem=32768M tags=hardware-hp-proliant-DL320E,anahuac,hw-alai-staging,hw-staging-xenial
      availability-zone=default

It always happens on the same server, though. What is particular about this server is that it is switched to PXE boot from eth1 rather than eth0, which has the primary IP and was the PXE NIC during commissioning. The server deployed OK in MAAS, and it seems to be reachable until LXD networking is configured.
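
To spot the affected containers without reading the whole status dump, a minimal sketch along these lines could be used. It assumes the output of `juju status --format=json` has a top-level "machines" key and otherwise mirrors the keys in the YAML above; the function name and the sample data (including the "1/lxd/2" entry) are hypothetical.

```python
import json


def failed_containers(status):
    """Yield (container_id, message) for containers in 'provisioning error'."""
    for machine in status.get("machines", {}).values():
        for container_id, container in machine.get("containers", {}).items():
            machine_status = container.get("machine-status", {})
            if machine_status.get("current") == "provisioning error":
                yield container_id, machine_status.get("message", "")


# Hypothetical, trimmed sample shaped like the status shown above.
ERROR = "Creating container: failed to ensure LXD image: image not imported!"
sample = json.loads(json.dumps({
    "machines": {"1": {"containers": {
        "1/lxd/0": {"machine-status": {"current": "provisioning error", "message": ERROR}},
        "1/lxd/1": {"machine-status": {"current": "provisioning error", "message": ERROR}},
        "1/lxd/2": {"machine-status": {"current": "running", "message": ""}},
    }}}
}))

for container_id, message in failed_containers(sample):
    print(container_id, "->", message)
```

In practice the dictionary would come from `json.loads` applied to the real command output rather than from the inline sample.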

Also seen in CI:
http://reports.vapour.ws/releases/issue/58929258749a5607ec1b7aa4

Larry Michel (lmic) wrote on 2016-12-15:

Attachment: machine_1-nova-compute_0-plumgrid-edge_0-swift-storage_0-10.245.16.191.tar (10.0 KiB, application/x-tar)

Larry Michel (lmic) wrote on 2016-12-15:

Attachment: Screenshot from 2016-12-15 09-14-32.png (44.2 KiB, image/png)

This is what the NIC config looks like in MAAS (see the attached screenshot).

Anastasia (anastasia-macmood) on 2017-01-11

Changed in juju:
status:	New → Triaged
importance:	Undecided → High
milestone:	none → 2.1.0

John A Meinel (jameinel) wrote on 2017-02-02:

Maybe I did something wrong, but that .tar file is only 10 kB in size and doesn't appear to contain any data. (The tar file does contain one odd string: 'Select a file with cursor and press ENTER'.)

Having more details like DEBUG level logs for the host machine and possibly for the controller machine might be helpful. There may be some reason we are failing to contact cloud-images.ubuntu.com to get an image for the container.
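
One quick way to test the connectivity hypothesis from the host machine is a plain TCP connection attempt. This is only a sketch: the function name, port and timeout are assumptions, and a successful connect does not prove that an image download would succeed.

```python
import socket


def can_reach(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port can be opened.

    The `connect` parameter exists only so the check can be exercised
    without real network access; by default it is socket.create_connection.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections, timeouts and
        # "network is unreachable" errors alike.
        return False
    conn.close()
    return True


# Example use on the machine that hosts the containers:
#   print(can_reach("cloud-images.ubuntu.com"))
```

A False result on the host, combined with a True result on the controller, would point at the host's networking rather than at the image server.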

Anastasia (anastasia-macmood) on 2017-02-03

Changed in juju:
status:	Triaged → Incomplete
milestone:	2.1.0 → none

Aaron Bentley (abentley) on 2017-02-06

description:	updated
Changed in juju:
status:	Incomplete → Triaged

Aaron Bentley (abentley) on 2017-02-09

Changed in juju:
importance:	High → Critical
tags:	added: regression

Anastasia (anastasia-macmood) on 2017-02-09

Changed in juju:
milestone:	none → 2.1-rc1

Anastasia (anastasia-macmood) on 2017-02-09

Changed in juju:
milestone:	2.1-rc1 → 2.1.0

Ian Booth (wallyworld) on 2017-02-09

Changed in juju:
milestone:	2.1.0 → 2.2.0-alpha1

Andrew Wilkins (axwalk) wrote on 2017-02-10:

This is happening on RackSpace in CI. There still appears to be a problem with bridging.

The machines each have two NICs: one should have a public (floating) IP address, and the other should have a private 10.x.y.z address. On the controller machine (which is fine, since it has no containers on it), this is the case.

I created a machine with a LXD container, and the host ends up with both NICs bridged: br-eth0 ends up with IPv6 addresses (only), and br-eth1 ends up with the 10-dot address. So the machine agent can still talk to the controller, but it cannot route to the Internet. That's why the container fails to start, because the cloud-images repository cannot be reached.
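
The "IPv6 addresses only" symptom can be checked mechanically. The sketch below assumes the one-line-per-address format printed by `ip -o addr show` (index, interface name, address family, address); the function name and the sample addresses are hypothetical.

```python
from collections import defaultdict


def ipv6_only_interfaces(ip_addr_output):
    """Return interfaces that have inet6 addresses but no inet address."""
    families = defaultdict(set)
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Expected shape: "<index>: <ifname> <family> <address> ..."
        if len(fields) >= 3 and fields[2] in ("inet", "inet6"):
            families[fields[1]].add(fields[2])
    return sorted(name for name, seen in families.items() if seen == {"inet6"})


# Hypothetical output resembling the broken host described above.
sample = """\
1: lo    inet 127.0.0.1/8 scope host lo
3: br-eth0    inet6 2001:db8::10/64 scope global
3: br-eth0    inet6 fe80::1/64 scope link
4: br-eth1    inet 10.0.0.5/24 brd 10.0.0.255 scope global br-eth1
"""

print(ipv6_only_interfaces(sample))  # ['br-eth0']
```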

Andrew Wilkins (axwalk) wrote on 2017-02-10:

Attachment: Machine agent log for RackSpace host with broken network (42.4 KiB, text/plain)

Andrew Wilkins (axwalk) wrote on 2017-02-10:

It appears to be related to the fact that there's both IPv4 and IPv6 available on the machine. /etc/network/interfaces contains stanzas for both inet and inet6 for br-eth0. If I comment out all of the inet6 stanzas, the bridge gets an IPv4 address.

Andrew Wilkins (axwalk) wrote on 2017-02-10:

I'm no expert on Debian bridging, but empirically it appears that we should not be specifying bridge_ports in both the inet and inet6 br- stanzas. Just specify it in one (inet, say), and not in the other. At least, that worked for me.
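
A rough way to detect this condition in an existing /etc/network/interfaces file is to count bridge_ports lines per bridge name. The parser below is deliberately simplistic and is an assumption-laden sketch, not a full ifupdown parser; the function name and the sample file contents are hypothetical.

```python
from collections import defaultdict


def bridges_with_repeated_ports(interfaces_text):
    """Return interface names whose iface stanzas contain bridge_ports more than once."""
    counts = defaultdict(int)
    current = None  # interface named by the iface stanza being read
    for raw in interfaces_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "iface" and len(words) >= 2:
            current = words[1]
        elif words[0] in ("auto", "source", "mapping") or words[0].startswith("allow-"):
            current = None  # these keywords end the current iface stanza
        elif words[0] == "bridge_ports" and current is not None:
            counts[current] += 1
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical file contents showing the problematic pattern.
sample = """\
auto br-eth0
iface br-eth0 inet static
    address 203.0.113.10
    netmask 255.255.255.0
    bridge_ports eth0

iface br-eth0 inet6 static
    address 2001:db8::10
    netmask 64
    bridge_ports eth0
"""

print(bridges_with_repeated_ports(sample))  # ['br-eth0']
```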

Andrew Wilkins (axwalk) wrote on 2017-02-10:

The add-juju-bridge script had code for handling updating existing bridges, but failed to cater for adding multiple iface stanzas for a bridge at once. I'm testing the fix on RackSpace now.
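
To illustrate the idea only (this is not the code from the add-juju-bridge script or from the fix under test), bridging an interface that has several iface stanzas could emit bridge_ports in the first bridge stanza and omit it from the rest. The function name, the "br-" prefix default, and the sample addresses are assumptions.

```python
def bridge_stanzas(ifname, stanzas, prefix="br-"):
    """Render interfaces(5)-style text that moves `stanzas` onto a bridge.

    `stanzas` is a list of (family, method, option_lines) tuples taken from
    the original iface definitions of `ifname`, for example one for inet and
    one for inet6. bridge_ports is written only in the first bridge stanza.
    """
    bridge = prefix + ifname
    out = [
        f"auto {ifname}",
        f"iface {ifname} inet manual",
        "",
        f"auto {bridge}",
    ]
    for index, (family, method, options) in enumerate(stanzas):
        out.append(f"iface {bridge} {family} {method}")
        out.extend(f"    {option}" for option in options)
        if index == 0:
            out.append(f"    bridge_ports {ifname}")
        out.append("")
    return "\n".join(out).rstrip() + "\n"


# Hypothetical dual-stack interface.
example = bridge_stanzas("eth0", [
    ("inet", "static", ["address 203.0.113.10", "netmask 255.255.255.0"]),
    ("inet6", "static", ["address 2001:db8::10", "netmask 64"]),
])
print(example)
```

Passing the inet stanza first keeps bridge_ports on the inet definition, which matches the arrangement that worked empirically in the previous comment.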

I guess this is not the same issue affecting the initially reported MAAS deployment, since there's no IPv6 there AFAICT.

Andrew Wilkins (axwalk) wrote on 2017-02-10:

Larry, can you please provide the original /etc/network/interfaces file from the MAAS machine, i.e. before Juju touches it? The simplest way to do that would (I think) be to start Ubuntu on the machine without Juju. Alternatively, have Juju deploy to it but don't put any containers on it; then the bridging script won't run and modify the file.

Changed in juju:
status:	Triaged → Incomplete

Andrew Wilkins (axwalk) wrote on 2017-02-10:

This PR makes Rackspace happy: https://github.com/juju/juju/pull/6962.

I'll need to see the /etc/network/interfaces file from MAAS to be sure, but at a guess the eth0 interface may have stanzas for both inet and bootp. If that's the case, the same PR may fix it.

Larry Michel (lmic) wrote on 2017-02-10:

Andrew, that machine is no longer in that state. But I think it's reproducible by having the system boot from eth1, which is not the NIC that MAAS designated for PXE, as shown in . In that case, eth0 (the NIC that's set to auto-assign) should still get a static IP, but perhaps something changes. I'll see what I can do to get another system to PXE boot from eth1 and collect the interfaces files for both cases.

Anastasia (anastasia-macmood) on 2017-02-14

Changed in juju:
milestone:	2.2.0-alpha1 → none

Larry Michel (lmic) wrote on 2017-04-06:

I am trying to recreate this by doing a simple rename through MAAS. The scenario would be to have eth1 as the boot device, set to auto-assign, and leave eth0 unconfigured. Then I could deploy ubuntu to 0 and mysql to lxd:0. I am working on this test and will update with the result.

Launchpad Janitor (janitor) wrote on 2017-06-06:

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status:	Incomplete → Expired

Launchpad Janitor (janitor) wrote on 2017-06-06:

[Expired for juju 2.1 because there has been no activity for 60 days.]

Duplicates of this bug

Bug #1661130
